Monarch geneset OGS2.0

DPOGS212947
Transcript: DPOGS212947-TA, 2535 bp
Protein: DPOGS212947-PA, 844 aa
Genomic position: DPSCF300057 - 94076-99380
RNAseq coverage: 1773x (Rank: top 7%)
Annotation
Heliconius      HMEL010737        1e-161  51.78%
Bombyx          BGIBMGA001695-TA  1e-111  53.49%
Drosophila      CG9463-PA         2e-116  35.19%
EBI UniRef50    UniRef50_E9H262   2e-117  35.90%  Putative uncharacterized protein n=2 Tax=Coelomata RepID=E9H262_DAPPU
NCBI RefSeq     XP_002078701.1    3e-116  35.12%  GD22383 [Drosophila simulans]
NCBI nr blastp  gi|321463090      7e-117  35.90%  hypothetical protein DAPPUDRAFT_252189 [Daphnia pulex]
NCBI nr blastx  gi|332029389      1e-77   43.30%  Lysosomal alpha-mannosidase [Acromyrmex echinatior]
Group
Gene Ontology:
  GO:0003824  2.4e-90  catalytic activity
  GO:0030246  2.4e-90  carbohydrate binding
  GO:0005975  2.4e-90  carbohydrate metabolic process
  GO:0015923  7.1e-60  mannosidase activity
  GO:0006013  7.1e-60  mannose metabolic process
  GO:0004553  1.5e-22  hydrolase activity, hydrolyzing O-glycosyl compounds
  GO:0008270  1.5e-22  zinc ion binding
  GO:0043169  2.7e-22  cation binding
  GO:0004559  6.6e-21  alpha-mannosidase activity
KEGG pathway:
  dsi:Dsim_GD22383  8e-116
  K12311 (MAN2B1, LAMAN) maps to: Lysosome; Other glycan degradation
InterPro domain:
  [239-843]  IPR011013  2.4e-90  Glycoside hydrolase-type carbohydrate-binding
  [261-837]  IPR011682  7.1e-60  Glycosyl hydrolases 38, C-terminal
  [46-137]   IPR011330  2.1e-24  Glycoside hydrolase/deacetylase, beta/alpha-barrel
  [140-212]  IPR015341  1.5e-22  Glycoside hydrolase, family 38, central domain
  [247-340]  IPR013780  2.7e-22  Glycosyl hydrolase, family 13, all-beta
  [46-136]   IPR000602  6.6e-21  Glycoside hydrolase, family 38, core
Orthology group: MCL10107 Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS212947-TA
ATGGCAGCGTTCATAAGGAAAGTGTACGGTATGTCGGAATCGTATAAAACCAACAACATCTTGGTGACTATGGGAGACGATTTCCAGTACCAAGACGCTAACATGTGGTTCAGTAACCTCGATAAACTGATAATAAACAAATTTCTGAAATCCGTTAACCAACAAGGCGAGTACTACAGAACGAATAATATTATACTAACGATGGGCGGCGACTTTACCTATCAAGATGCTGCGATGTGGTACATCAACCTCGATAAGTTGGTAGAATATACAAATCTAAAGGCCGCCAAGGACGGCTTGAACATTAAACTGTTCTACTCCACGCCCGATTGTTATCTGAAAGCCGTTAAAGATTCGAACCCGACTCTGCCTACTAAACAGGACGATTTCTTTCCTTACGCGAGTGACCCCACCGCCTACTGGACGGGGTACTTCACTTCGCGACCGACTACCAAATACTTCGAGAGACTCGGCAATAGATACTTACAGATGGTAAAGAATCTACAAGTGTTGGCCGGTTTGGAAGAACACAACAAATTTGTTATAGACGAGCTGAAGAGTGCTATGGGTGTGATGCAACATCACGACGCCATAACTGGCACAGAAAAGCAACACGTGGCTCACGACTACGAGAGGTTGCTGGACAACGCTGTCGAGGACGCGTTGCTGGTGGCGAGGCAGGGAATCAGTAAATTATCTAAATCCGAAAAATATCAATTCAATTACGAGAGATGTCGTCTGAACGAGTCGAGCTGCGCTACGAGTGAGGACAGCCAACAGTTTGTGGTGACTGTGTATAATCCACTCGGTTGGAATACCCTTGAACCGATACGTATCCCAGTCCTGGACGGAGAGTACGAGGTTTACGCACCTAATGGAGAGAAAGTAACCTCCCAGCTGCTAGACATACCGAACCCGGTTAGGAAAATACCGACCAGGAAGTCCGGAGCCACCCACGAGCTCACCTTCATAGCGAAACTGCACCCGCTGTCGATAAAATCGTTCTTCATCAAGAAGATGCAACGAACAAGACGAGGAATAGATTACCGGAGTATAAAAAATTACTGGAGTAATATCGGCAGCCCGTATATAGTGGAGAATATTGATTACGTAGGTAATATTGGGACCGATAAAATTGTCGTGCCCTCGCATGTGAGCGGAACTGGAAGCAATATTAATTTTGATGTGCTGACGGATGAGGATTTGAATGTGGGTAGAGCGGAGAATATGAGGAACAGAGATAAGGTCCCGAGGAACATGGAGAAGATGAGACATCCGAGTTTGACTGACGAGGAATACAGACTGCTGGCGGACGAACCGAGCGTCGTGCAGAGAAGTGAGGAGTATTATTTGGAGAATGAGTTCCTGAAACTCCGTTCTGATGATACAGGCGTCACTCACATGATTTTACCGGACAGGACAACCAACCTCAGGATACAGTTCCACTATTGGACCGGATGTTCCGGTGACAATACCAACACAACGACGCGATCTTCCGGCGCGTACATCTTCAGACCGGAAACGGACAAACCGTATCCGTTAGAATATAACAGTAATAGAATAGTTAAAGGGGAAGTAGTTCAGGAGATCCGGGCGGAGAGTGACACGGCGTCGAGTACGTTCAGGGTGTACGGCGGGCTGCCTTTTATAGAGCACGACTTTGTTGTCGGACCGATACCCGTGGACGACAAAGTCGGCAAAGAATACGTCATCCGATACGACACTAATGTCGGCAACGACGGCGTCTTCTTCACGGACAGCAACGGAAGACAAGTTCTCAAGAGGAAGTTGAATGAACGGCCCCAGTGGAACCTCACGTTAGCGGAGCCGATCGCGGGGAATTATTACCCGGTGACGAGCAAGGTGTTCTTAGAGGACGGGCATACGAGAATCACGGTGTTGGTGGACAGGTCCGAAGGGGGCACGTCGCTAGTGCAGGGCGGCATAGAGCTGATGGTACATCGACGACTGTTACACGACGATGCATTCGGAGTGGGAGA
GGCTCTGAACGAAGTAGCCCAAGGCGAAGGACTGGTAGTAAGGGGGAGACACAGGCTCCTCAACCTGAATCCTAACGACGAACAAGAAACGCTGAGCGAAAAGAAGTATGCTCTGCAAACACACTACGAGCCGATAGTGTTTGTATCGAAAGCAGAGCATATATCCTACGAGAGCTGGCTAAAACTGAGTAATTGCTTCAAAGGCATGAAGCAGCTCCCGGACGGAGTTCACCTCCTGACGCTGGAGATCTGGAGGGATAAAATACTCTTAAGATTCGAGAACTATATCGATAGGGCTGTGCAAGTCGACCTTAATATCTTCAACACGATTAAGATTAAATCGGTGAAAGAGACAACCCTAGCGGCCAACCAATGGCTGGAGGATCATACCAAATGGAATTGGAACATCGAAGGTGAATTCCAAAATTCCGATTCCCAACCAAATCCGATTCCAGACGATTTGTCAGATCTCAAATGTACACTGAAAGCTAAACAAATAAGGACCTTCATAGCTGACTATGAATTAAATGCTTAA

Protein sequence:

>DPOGS212947-PA
MAAFIRKVYGMSESYKTNNILVTMGDDFQYQDANMWFSNLDKLIINKFLKSVNQQGEYYRTNNIILTMGGDFTYQDAAMWYINLDKLVEYTNLKAAKDGLNIKLFYSTPDCYLKAVKDSNPTLPTKQDDFFPYASDPTAYWTGYFTSRPTTKYFERLGNRYLQMVKNLQVLAGLEEHNKFVIDELKSAMGVMQHHDAITGTEKQHVAHDYERLLDNAVEDALLVARQGISKLSKSEKYQFNYERCRLNESSCATSEDSQQFVVTVYNPLGWNTLEPIRIPVLDGEYEVYAPNGEKVTSQLLDIPNPVRKIPTRKSGATHELTFIAKLHPLSIKSFFIKKMQRTRRGIDYRSIKNYWSNIGSPYIVENIDYVGNIGTDKIVVPSHVSGTGSNINFDVLTDEDLNVGRAENMRNRDKVPRNMEKMRHPSLTDEEYRLLADEPSVVQRSEEYYLENEFLKLRSDDTGVTHMILPDRTTNLRIQFHYWTGCSGDNTNTTTRSSGAYIFRPETDKPYPLEYNSNRIVKGEVVQEIRAESDTASSTFRVYGGLPFIEHDFVVGPIPVDDKVGKEYVIRYDTNVGNDGVFFTDSNGRQVLKRKLNERPQWNLTLAEPIAGNYYPVTSKVFLEDGHTRITVLVDRSEGGTSLVQGGIELMVHRRLLHDDAFGVGEALNEVAQGEGLVVRGRHRLLNLNPNDEQETLSEKKYALQTHYEPIVFVSKAEHISYESWLKLSNCFKGMKQLPDGVHLLTLEIWRDKILLRFENYIDRAVQVDLNIFNTIKIKSVKETTLAANQWLEDHTKWNWNIEGEFQNSDSQPNPIPDDLSDLKCTLKAKQIRTFIADYELNA-