Monarch geneset OGS2.0

DPOGS206161
Transcript: DPOGS206161-TA, 3087 bp
Protein: DPOGS206161-PA, 1028 aa
Genomic position: DPSCF300028 + 1846328-1854271
RNAseq coverage: 4x (Rank: top 89%)
Annotation
Heliconius       HMEL010746         3e-127   44.19%
Bombyx           BGIBMGA001728-TA   1e-138   43.16%
Drosophila       CG9463-PA          3e-129   32.17%
EBI UniRef50     UniRef50_Q8MS44    3e-123   30.82%   RE08556p n=30 Tax=Sophophora RepID=Q8MS44_DROME
NCBI RefSeq      XP_002047646.1     6e-134   31.87%   GJ11812 [Drosophila virilis]
NCBI nr blastp   gi|195473147       9e-125   31.16%   GE10763 [Drosophila yakuba]
NCBI nr blastx   gi|195146682       2e-93    36.54%   GL19134 [Drosophila persimilis]
Group
Gene Ontology
GO:0003824   3.4e-74   catalytic activity
GO:0030246   3.4e-74   carbohydrate binding
GO:0005975   3.4e-74   carbohydrate metabolic process
GO:0015923   3e-58     mannosidase activity
GO:0006013   3e-58     mannose metabolic process
GO:0004559   1.3e-56   alpha-mannosidase activity
GO:0043169   1.7e-19   cation binding
GO:0004553   3.3e-05   hydrolase activity, hydrolyzing O-glycosyl compounds
GO:0008270   3.3e-05   zinc ion binding
KEGG pathway
dvi:Dvir_GJ11812   2e-133
K12311 (MAN2B1, LAMAN) maps-> Lysosome
                              Other glycan degradation
InterPro domain
[416-1026]   IPR011013   3.4e-74   Glycoside hydrolase-type carbohydrate-binding
[417-1019]   IPR011682   3e-58     Glycosyl hydrolases 38, C-terminal
[2-290]      IPR000602   1.3e-56   Glycoside hydrolase, family 38, core
[2-295]      IPR011330   5.2e-53   Glycoside hydrolase/deacetylase, beta/alpha-barrel
[402-496]    IPR013780   1.7e-19   Glycosyl hydrolase, family 13, all-beta
Orthology group: MCL26410 Lepidoptera specific

Nucleotide sequence:

>DPOGS206161-TA
ATGAAACAAATCCTGGATTCCACTATTAGTGAGCTGTGGGCCTACAAAGAGAGGAGGTTCATTATAGCAGACAGTGAACTGCCATATTTTTTTCACTGGTGGTCCAAAAGGGACGGGACAGTACGTAGAATGGTGTACGAGCTGGTCCGCCAGGGTCGGCTCGTGATCGTGGGCGGAGGCTGGGGCCTGCAGGACGAGACCACCACGTACTACCAGTCCGTCATAGACAGCTACACGTACTCCCTCAGGAAGATCAACGCTACCTTCCTGGAGTGCGGCCGGCCGCTGGTCGCGTGGCAGGCTGATAACTTCGGCCACTCCCGGGAGTTCGCGTCCCTGGTCGCCCTCATGGGCTTCGACGGACTCTTCATCAACCCCATCAGCTTCGACGACGAACTCATCAGGATGGAGAGGAAGGGACTCGAGTTCCTGTGGAGGGGCAGCGACGACCTGGGCGATAACCCGCTCCCCGCAGGCCCGGAGACTGACATATTCACCCACAAGCTGTTCGACGGCTACTGGTCGCCCCCCGGCTTCTGCTTCGGCAGCATGTGCTCGGACCCGCTGCTCGTCACCAGCGATACGCTTTTTAATAACGCTAAAGAGAGGGCCCAGCTGTTCATTGAGAAGATCCGTTTCCGCCAAGCTCCTAACTATCAGACCAAGCAGGTGATGGTGATGATGGGTCAGAGGATGGGCTACGCGGACTCCAAGCTCTGGTTCAATAACATCGAGAAACTTATAAGTTATGTCAACGAGGAAGCGTTCGAAGATAAAATGTACGCCATGTACTCGACTCCGATGTGCTACTTGCAGGCGGCCTACCAAGAGAACCCAATTTTGGAAACGAAACAGGACGACTTCATACCGTTCGCGTACGACCAGGACTCCTACATGACCGGCCTCTTCACCTCCAGACCTAGCTTCAAGTACTTGGTTAGAGAGGCCAACGTGTTCCTACAGATAGCGAAACAGTTGCAAGTCTTGACCAACTTGAGAAACAACGACGGAATATTCGAAGACTTCATTCCAGGCGTGGCGCAGGACCACAATATCATAACGGGCGCCATGCGGCCCTACGCCAAGAACTACTACACTAAGTACCTCAGCATCGCCATACAGAAGTCCACCATCGTCGCGAAGCAGGCCTTCAACAAAGTGCGCGCCAACAACCCTTCGCTGCTCACCGACTACACGCTGTGTTACCTGAACGAGTCTTCGTGTCCCAACACTAAGGTCCCGTATTTTTATATAACGGTATACAATCCCTTGGCTTGGAACGTCACGATGCCTGTGAGGGTGCCGGCCTTCAAAAGAAGATACAACGTCTACGATCCTCACGGTGAAGTAGTCCCCTCGGCGTTAATGCGAATACCGCAGCAGGTCCTGAGCATACCCGGCAGGTTCGCAGAACACGACCTGGAGCTGGTCTTCATCGCTCCAGAACTTCCAGCGCTCGGCTTCAGGTCCTACTACATAGAGGAGGTGAAGAGAAACAAACGATCACTCATCAAGAAGATCGGCAAGAACAAACAGAAGTACTTCATAAGACAAGCGCCGAGGACAGACAACGCCACGCTCCTAGACGATCCCGCGTACGACGAGGCTGAGACTCAGCCGGAAGACATCGGAGAAAATAGGGCTGAGGGCTCCGAGGACGGGCACGACGCCACGAGAAGACCGGAAGTGACGTACGAGGAGCTGGAACACACGGACGGTACGGCCGACACCACACCCACCACCACACGCACCACCACCAGAGAGACGCGGACAGGCGGCGACAGCAGCTGGGTGGAGTCGAGCGACACCTACATCGGAAACAAGTACATACGAATAAGCCTGGACAGTCACCGCAAAGTGTCGTCTATGAGTCTGGCCAACGGAGTCAACACGTCGCTGGACATACAATATTACTTCTACGTGTCCGACGACCCCGACACGGTCGAAAACCAGAAGCGACGGCCCGGAGCCTACATCTTCAGACCTCTGGACGTCAAGCC
GGAGGCGATCATAGACTACATCGACACCAAGGTCTACAAGAGCGGCGAGGTGCAAGAGATACATTCCAGGTACTCGGAGCACGCGTCGTTCGTGTTGCGCTTGTACAGAGACAGCGTCGTGTGCGAACTGGACTGGATCCTCGGCCCGCTGCCCGCGGACGGGCTGGGCCGGGAGCTCTTCATACGGTACACCACCGACCTCGAGAACGACGGAGTGTTCTACACGGACGCGAACGGCAGGCAGGTCGTCAAGAGAATCAGACACACGAGACCCTTGTACCGACCGTACCACCTGGACCCTGTCGCAGGCAACATCTATCCCGTCACAACAAGAATATATATAGAAGATTTACGGAAGAATCTCCGCTTGTCCATATTCAATGACAGGTCACAAGGAGGGACCTCGCTCCTCGAGGGGTCGGTGGACCTCATGTTGGACAGACTCATCTACACCGACGACAGCGGAGTACAGACCTTCCTCAACGAGACCGTCGACGGCAAGGGAATAGTCGTCCGCGGAACGCATTACCTGTACCTCACCAGAGCCAGCCACAGACCTAATAGAGTCTTCGAAAAGAGATTCTCAAAGGAAATAGAACTGAAACCTCAGATATTCTTTTCACGAATCCGTCAAATGGTGAGGAAGGATCGCTGGCTCGGCAGGAGGAATGAGTACTCGGCCCTCAAGACGAAGCTGCCCATCGGCGTCCACATCCTGACGATACAGGAGTGGAACGAGAGGACTCTGCTGATACGGCTCGAGAACTACTTAGAGAAAGTCGACGTCATCAAGAGCGGCGTCAAGGAAGTGCAGCTGAAAGATTTGTTCGTGAACATAGTCCCGGACGAGGCGGTCGAGATGAAGCTGGCCGCGAACATCCGCCTGAAGGATTGGACGCAGATACAGTGGCAGAGGAACGGCTCGTTCGTGAGCAACTTCAACGACCACTACGGAACCACGAAGACCGCGGAATTCAGCTACGAGCGCATGAAGCCCTTGAAGAAGGTCGACGTCCGCGCCGGCATCCTGCTGTACCCGCAACAAATACGGACCTTCGTCGTGTCTTACCGCGCACTCCAGCCGTGA

Protein sequence:

>DPOGS206161-PA
MKQILDSTISELWAYKERRFIIADSELPYFFHWWSKRDGTVRRMVYELVRQGRLVIVGGGWGLQDETTTYYQSVIDSYTYSLRKINATFLECGRPLVAWQADNFGHSREFASLVALMGFDGLFINPISFDDELIRMERKGLEFLWRGSDDLGDNPLPAGPETDIFTHKLFDGYWSPPGFCFGSMCSDPLLVTSDTLFNNAKERAQLFIEKIRFRQAPNYQTKQVMVMMGQRMGYADSKLWFNNIEKLISYVNEEAFEDKMYAMYSTPMCYLQAAYQENPILETKQDDFIPFAYDQDSYMTGLFTSRPSFKYLVREANVFLQIAKQLQVLTNLRNNDGIFEDFIPGVAQDHNIITGAMRPYAKNYYTKYLSIAIQKSTIVAKQAFNKVRANNPSLLTDYTLCYLNESSCPNTKVPYFYITVYNPLAWNVTMPVRVPAFKRRYNVYDPHGEVVPSALMRIPQQVLSIPGRFAEHDLELVFIAPELPALGFRSYYIEEVKRNKRSLIKKIGKNKQKYFIRQAPRTDNATLLDDPAYDEAETQPEDIGENRAEGSEDGHDATRRPEVTYEELEHTDGTADTTPTTTRTTTRETRTGGDSSWVESSDTYIGNKYIRISLDSHRKVSSMSLANGVNTSLDIQYYFYVSDDPDTVENQKRRPGAYIFRPLDVKPEAIIDYIDTKVYKSGEVQEIHSRYSEHASFVLRLYRDSVVCELDWILGPLPADGLGRELFIRYTTDLENDGVFYTDANGRQVVKRIRHTRPLYRPYHLDPVAGNIYPVTTRIYIEDLRKNLRLSIFNDRSQGGTSLLEGSVDLMLDRLIYTDDSGVQTFLNETVDGKGIVVRGTHYLYLTRASHRPNRVFEKRFSKEIELKPQIFFSRIRQMVRKDRWLGRRNEYSALKTKLPIGVHILTIQEWNERTLLIRLENYLEKVDVIKSGVKEVQLKDLFVNIVPDEAVEMKLAANIRLKDWTQIQWQRNGSFVSNFNDHYGTTKTAEFSYERMKPLKKVDVRAGILLYPQQIRTFVVSYRALQP-
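
The transcript and protein lengths listed above are consistent with each other: 1028 residues plus one stop codon account for 3 x 1029 = 3087 bp. A minimal Python sketch of how such a check might be run on records like these is given below. The function names `parse_fasta` and `cds_matches_protein` are hypothetical helpers introduced for illustration and are not part of the geneset distribution; the sketch assumes the transcript is a complete coding sequence and that a trailing `-` or `*` in the protein marks the stop.

```python
def parse_fasta(text):
    """Return {header: sequence}, joining sequence lines that were wrapped."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line  # append wrapped sequence lines
    return records


def cds_matches_protein(cds, protein):
    """True if the CDS length equals 3 x (number of residues + one stop codon)."""
    residues = protein.rstrip("-*")  # assumption: trailing '-' or '*' marks the stop
    return len(cds) == 3 * (len(residues) + 1)


# Toy record (not real data) with the nucleotide sequence wrapped over two lines.
toy = ">toy-TA\nATGAAA\nTGA\n>toy-PA\nMK-\n"
toy_records = parse_fasta(toy)
toy_ok = cds_matches_protein(toy_records["toy-TA"], toy_records["toy-PA"])
```

Applied to the two records above, the same check would compare the DPOGS206161-TA sequence (after joining its wrapped lines) with the DPOGS206161-PA sequence.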