Monarch geneset OGS2.0

DPOGS207912
Transcript        DPOGS207912-TA  1263 bp
Protein           DPOGS207912-PA  420 aa
Genomic position  DPSCF300478 + 43075-56789
RNAseq coverage   1256x (Rank: top 10%)
Annotation
Heliconius        HMEL010163        4e-92  88.40%
Bombyx            BGIBMGA014456-TA  2e-70  82.64%
Drosophila        galectin-PA       2e-13  37.19%
EBI UniRef50      UniRef50_E2B365   4e-39  47.59%  Galectin-12 n=10 Tax=Aculeata RepID=E2B365_HARSA
NCBI RefSeq       XP_392379.2       2e-42  50.34%  PREDICTED: similar to Galectin-4 (Lactose-binding lectin 4) (L-36 lactose-binding protein) (L36LBP) [Apis mellifera]
NCBI nr blastp    gi|380025790      3e-41  50.34%  PREDICTED: uncharacterized protein LOC100871074 [Apis florea]
NCBI nr blastx    gi|307215020      4e-47  31.19%  Galectin-12 [Harpegnathos saltator]
Group
Gene Ontology     GO:0005529  2.2e-58  sugar binding
KEGG pathway
InterPro domain   [17-144]  IPR001079  2.2e-58  Galectin, carbohydrate recognition domain
                  [2-145]   IPR013320  1.4e-48  Concanavalin A-like lectin/glucanase, subgroup
                  [9-149]   IPR008985  3.3e-46  Concanavalin A-like lectin/glucanase
Orthology group   MCL17057  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS207912-TA
ATGGCCGCTCAGCCCATATACAACCCTGTAATCCCATGTGTCCACCCGATCCCTGGTGGCTTGTTCCCCGGTCGCATGATAAGGTTCCAAGGGAGTGTACCGCCCGGCGCCCAACGATTCGCGATCAATTTCCAATGCGGTCCGAACACTGATCCCCGGGACGACATCGCCCTCCATCTCAACTTCCGCTTCGTGGAGATGTGCGTCGTTAGGAACCACTTGACGGCGATGAGCTGGGGGGTGGAGGAGACCAACGGCGGCATGCCTCTAGTGCGAGGGGAGGCTTTCGAGGCCCTGGTTCTGTGTGAGCCGCAGTCCATCAAGGTCGCGCTGAATGGGGTGCACTTCTGTGAGTTTCCGCATCGTATACCCTTCCAAAGGATCAGTCACCTGACCGTGGACGGTGACGTCATGCTGCAGTTCGTCGGCTTCGAGGGAGCCCAGCCAAGCCAGATGTACATGGCGGAACCTCCATCATATGCCAGCTATGGCGCTCCGCCCTCGTACGGCGCCCCCGGCTATGGAGCACCCCAAGGTGGTTTCGGTGGAGCGGTACCCCCACAATACGCGGGCGCCCAAACAGTACCCCAGTATACTCAAGAGCGTCGTGGTATGGGAACCGGGGCGGCCGTGGGATTGGGCGTTGGGGCCCTGGCCGCTGGTGGGCTGGCGGGTTATGCACTAGGCGGGGGCTTCAGCAGCAATAGCCCTACCGAGGAGCCAGGCAGCCACAGACGACGAGGCCATGGTTCATACGATGGTCAAGGGCCCTACGGTGGTCAAGGTCCTTACGGTGGTCAAGGGCCCTTCGGTGGTCAAGGTCCTTACGGCGGTCAAGGTCCAGGGCTCTTTGGTGGTCCGAATACTGGCTATGGAAGTCCTGATCAAACTCGGAACTATGTGCAAGATCCTCTCAATCCTCACCAGGGGCCAGTTCAGCCTCCAGTCGCGAATGTCACACCACCTCCGTACGGACAAGGTCAAGACAGTCACTACCCCCCGGGATATCAGCTTCATAATCAAGGTTATCCTTACGGTCAAGGTTACCCACCACACGGCCAAGGTTACCCACCCCAAGGACAGGGATATCCACCTTACGGTCAAGACTACGGTCAAGGGTACCCACCTCAGGGCCCAGGGTATCCACCTCAAGGACCAGGATATCCGGGATATGGCCAAGGATACCCACCACAGGGCTACCCTGGTCAAGGACCTCCAGGCACGTTTTCACAGAATTATCAATTTTTGCTACATAATTTGTAA

Protein sequence:

>DPOGS207912-PA
MAAQPIYNPVIPCVHPIPGGLFPGRMIRFQGSVPPGAQRFAINFQCGPNTDPRDDIALHLNFRFVEMCVVRNHLTAMSWGVEETNGGMPLVRGEAFEALVLCEPQSIKVALNGVHFCEFPHRIPFQRISHLTVDGDVMLQFVGFEGAQPSQMYMAEPPSYASYGAPPSYGAPGYGAPQGGFGGAVPPQYAGAQTVPQYTQERRGMGTGAAVGLGVGALAAGGLAGYALGGGFSSNSPTEEPGSHRRRGHGSYDGQGPYGGQGPYGGQGPFGGQGPYGGQGPGLFGGPNTGYGSPDQTRNYVQDPLNPHQGPVQPPVANVTPPPYGQGQDSHYPPGYQLHNQGYPYGQGYPPHGQGYPPQGQGYPPYGQDYGQGYPPQGPGYPPQGPGYPGYGQGYPPQGYPGQGPPGTFSQNYQFLLHNL-
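
Consistency check (illustrative sketch):

The transcript length (1263 bp) and protein length (420 aa) agree if the final codon is a stop: 1263 / 3 = 421 codons, i.e. 420 residues plus the terminator, which this record writes as "-". The Python sketch below shows one way to check this by translating the nucleotide sequence with the standard genetic code. It assumes the standard code applies to this nuclear transcript; the names `CODON_TABLE` and `translate` are hypothetical helpers and not part of the record or of any database tool.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides, stop_symbol="-"):
    """Translate an in-frame coding sequence into one-letter amino acids.

    Stop codons are written as `stop_symbol`; "-" mirrors the convention
    used in the protein sequence of this record (an assumption about the
    record's notation, not a general standard).
    """
    seq = nucleotides.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(seq), 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 30 and last 21 nucleotides of DPOGS207912-TA, copied from the record.
print(translate("ATGGCCGCTCAGCCCATATACAACCCTGTA"))  # MAAQPIYNPV
print(translate("TTTTTGCTACATAATTTGTAA"))           # FLLHNL-
```

Passing the full DPOGS207912-TA sequence to `translate` should give a 421-character string (420 residues plus the trailing "-") that can be compared directly with DPOGS207912-PA.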