Monarch geneset OGS2.0

DPOGS209083
Transcript: DPOGS209083-TA, 2826 bp
Protein: DPOGS209083-PA, 941 aa
Genomic position: DPSCF300175 + 74786-80270
RNAseq coverage: 117x (Rank: top 58%)
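The lengths above can be cross-checked with a little arithmetic. The sketch below is illustrative only. It assumes the transcript is coding sequence ending in a single stop codon and that the genomic coordinates are 1-based and inclusive; neither convention is stated in the record.

```python
# Values copied from the record above.
transcript_bp = 2826
protein_aa = 941
start, end = 74786, 80270

# Assumption: the transcript is CDS only and ends in one stop codon.
expected_bp = (protein_aa + 1) * 3

# Assumption: coordinates are 1-based and inclusive.
genomic_span = end - start + 1

# Under both assumptions, the difference is sequence not present
# in the transcript (for example introns).
non_transcript_bp = genomic_span - transcript_bp

print(expected_bp == transcript_bp)
print(genomic_span, non_transcript_bp)
```

Under these assumptions the protein and transcript lengths agree, and the locus spans roughly twice the transcript length.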
Annotation
Heliconius: HMEL014022, E-value 0.0, identity 70.75%
Bombyx: BGIBMGA005696-TA, E-value 1e-179, identity 58.80%
Drosophila: (no hit listed)
EBI UniRef50: UniRef50_B2DD57, E-value 2e-177, identity 58.80%, Beta-fructofuranosidase n=2 Tax=Bombycoidea RepID=B2DD57_BOMMO
NCBI RefSeq: NP_001119721.1, E-value 4e-178, identity 58.80%, beta-fructofuranosidase [Bombyx mori]
NCBI nr blastp: gi|260765451, E-value 0.0, identity 62.06%, beta-fructofuranosidase 2 [Manduca sexta]
NCBI nr blastx: gi|260765451, E-value 0.0, identity 62.45%, beta-fructofuranosidase 2 [Manduca sexta]
Group
Gene Ontology:
    GO:0005975, E-value 7.7e-125, carbohydrate metabolic process
    GO:0004564, E-value 7.7e-125, beta-fructofuranosidase activity
    GO:0005737, E-value 7.7e-125, cytoplasm
    GO:0004553, E-value 1.1e-113, hydrolase activity, hydrolyzing O-glycosyl compounds
KEGG pathway: bmd:BMD_1566, E-value 9e-104
    K01193 (E3.2.1.26, sacA) maps -> Starch and sucrose metabolism
    Galactose metabolism
InterPro domain:
    [491-914] IPR006232, E-value 7.7e-125, Sucrose-6-phosphate hydrolase
    [50-448] IPR001362, E-value 1.1e-113, Glycoside hydrolase, family 32
    [39-347] IPR023296, E-value 1.3e-107, Glycosyl hydrolase family 43, five-bladed beta-propellor domain
    [50-346] IPR013148, E-value 8.8e-95, Glycosyl hydrolases family 32, N-terminal
    [802-940] IPR008985, E-value 1.5e-13, Concanavalin A-like lectin/glucanase
Orthology group: MCL22104, Insect specific
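The InterPro hits are residue intervals on the 941 aa protein, so their lengths and overlaps are easy to compute. The following is a minimal sketch, assuming the bracketed ranges are 1-based and inclusive; the `Domain` type and the helper names are hypothetical and not part of any InterPro tool.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    start: int      # first residue of the hit (assumed 1-based, inclusive)
    end: int        # last residue of the hit (assumed inclusive)
    accession: str  # InterPro accession from the record


PROTEIN_LENGTH = 941  # from the Protein line of the record

DOMAINS = [
    Domain(491, 914, "IPR006232"),
    Domain(50, 448, "IPR001362"),
    Domain(39, 347, "IPR023296"),
    Domain(50, 346, "IPR013148"),
    Domain(802, 940, "IPR008985"),
]


def length(d: Domain) -> int:
    """Number of residues covered by a hit."""
    return d.end - d.start + 1


def overlap(a: Domain, b: Domain) -> int:
    """Number of residues shared by two hits (0 if they are disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


for d in DOMAINS:
    print(d.accession, length(d), d.end <= PROTEIN_LENGTH)
```

For instance, `overlap(DOMAINS[0], DOMAINS[1])` compares the IPR006232 hit with the IPR001362 hit, which occupy separate parts of the protein.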
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209083-TA
ATGTCTCTGATACAAAAGATTCTGTGTCTGTGCTTAATAGGGTTGTCGCAAACGGCAAATTTAACCTATGACGAATTAAAACGGGAGTTAGAAAGTTATATTGAAAATAAACAAAAGGAAATAAACCAAAGATATCGGCCTTTGTATCATGTGGCACCACCAGTTGGATGGATGAACGATCCCAATGGATTTGCTTATCACAATGGATTATTTCATCTTTTTTACCAATTCTACCCTTACGATAGTAAGTGGGGTCCAATGCATTGGGGCCATGTTACTAGTACTGATCTTGTTCATTGGAAACAGAAGCCGACTGCTTTAATACCTGAGGAGGAGCAATGTTTCTCAGGAAGTGGTATCAGCGATGACAATAAGCTGGTGTTAATGTATACTGGGCATGTTCTTACAGAAGAAGAACCTTTCTATAAACAAAGTCAATATCTTGCTTTAAGTGAAGATGGAGTTAACTTTGAAAAGTCCAAATCAAATCCAATTATTGAAAGCCCTCCTAACAATTCACCTGACTTTCGAGACCCAAAAATTTGGAAGCATGGAAATGAATTCTATGCGGTTATTGGCAGTAAAGACAATGGCAATGGAACTGTGCTTCTATATAAATCAAGTAATTTGACAACTTGGGAATATTTATCAGTACTCGGACAATCTACTGGGGAATTGGGATACATGTGGGAATGTCCAGACTTTTTCGAATTGGACGGGAAGTTTATTTTATTGATGTCACCTCAAGGAGTAAAAGCAAATGGTGATAGATATAAAAACGTTTATCAGACCGGCTACATGGTGGGAACTTTCGATTATAATACGAGTCGTTTTGAAACAGATGGTGTATTTCAAGAACTAGATTATGGTCACGATTTTTATGCTGCCACGACCACGGAAGCTAAGGGCAGTAGATACCTTTCAGCTTGGTTTAGTATGTGGGAAGTGACACATCCGGAGGATGTTGATGGTTGGATAGGCACAACTACGTTAGTTAGGGAGCTGAGGTTCGTTAACAATCGTTTAATTATGAAACCGGTTGCAGAAATTGTTGATCTAAGAGAAAGTTTAGCTCTCCAGGGGGATTTTAAAGCCAACCAGGTGCAAGAATTTGGTAAAGCAGTAGAAATTTTAATTGAAGGTAATTTAAAACAGAATATTGACTTGTTATTAGATGGACCAAACGGTGGTGGCCAAGTGCAGTTAAAATGGGACGCAAAAAATGGTACAGTTTCTGTGATCAGAGAAGAAGAGGTGCGACGAGTAGTGTGGGCACCTATTGATTCTCAAATTTGGCGAATTTTCCTGGATACAAGCAGTCTAGAGTTGTTTTGTGGAGAAGGTGAAGTAGTTTTCAGTAGTCGAGTTTATCCTCTCGCGTTCACATGCATCTTAGCGTTGGCCTATTCAAAAAGTGTAGATAATAATTATGAAAATGAGAAAGAAGATTTAGAAAATTATATACAAAATAAGAAGAATGAAATTAATCACAGATTCAGACCTCTCTACCACGTCGCACCTCCAGTCGGCTGGATGAATGACCCCAATGGATTTTCATATCACAACGGAGAGTTTCACCTTTTCTATCAATTCTATCCATATAAAAGCGAGTGGGGCCCTATGCATTGGGGACATGTTATCAGTTCTGATCTTGTTCACTGGAAACAAATGCCAACAGCTCTTCTACCGGGAACAGAACAATGTTTCTCTGGTAGTGCTATAAGCCAAGGGGATGTTTTGACATTAATTTACACTGGTCGCAGATCCATTGATGAGCAGCCTTATTTCAACGAAAGCCAGTACCTCGCGTTTAGTGATGATGCTGTAAATTTTTATAAATATGAAGGAAATCCGGTAATTCCCAATGCACCAAATAACGCTCCCGATTTTCGTGATCCTAAAGTTTGGAAATACGGAGATGAATATTACGTAATCATCGGTAGTAAGACTTCAGATGAAAGAGGAAGGGTTCTTTTGTACAAATCTAAAGATATGTTTGA
TTGGGAGTTTTTAACTGTATTAGGCGAATCGAATGGATCTTTGGGATACATGTGGGAATGTCCGGATTTTTTCGAATTAGATGGGAAATTTATTTTACTAATGTCACCTCAAGGAGTCTCACCACAAGGAGACAGATATAAGAACTCACACCAGACAGGATATATTGTTGGAAGTTTTGACTACGACACATTTCAATTTATTCCCGAAGTTGAGTTTCAGGAAATAGATTATGGTCATGATTTTTACGCAGCCACAACAACACAAGCGAACGGAAAACGATATCTGTTAGCTTGGTTTAGTATGTGGGATGTACCTTACCCTGAAGACGTCGACGGTTGGGCAGGTATGATGACTATTACTAGAGAGCTTAACTTGGTCAATAACAGAATACTTATGAAGCCAGTTTCTGATATGCTGAATTTGAGGAATGAAGTCGCTCTCAAGGATGAAGTGAAGCCTGGACAAGTACACCAATTCGGAAAAGCTGTGGAAATAATTATTGAGAGCGACCTTTATAACAAAATTGATTTATTATTAGATGGTCAGGAAGGCGGAGGCAAAGTTTGGATACGATGGGATCCTGATATCGGTAAAGTTGTAGTCGATAGAGGATCTGGAGATATAAGACAAGTAGAATGGAAACCTATCGGTTCAACTACATGGCGAATATTCTTGGATAGCTGCAGCTTAGAACTGTTTTGTGGTGAAGGTGAGGTTGTGCTCAGTAGTAGAGTATATCCATTGGGTGGATGGAGATTAAAAAATCAAAGTCCACAAACGATACGAGTGGAGGCTTACAATCTGCAAAGAAGTGTACCTGAATAG

Protein sequence:

>DPOGS209083-PA
MSLIQKILCLCLIGLSQTANLTYDELKRELESYIENKQKEINQRYRPLYHVAPPVGWMNDPNGFAYHNGLFHLFYQFYPYDSKWGPMHWGHVTSTDLVHWKQKPTALIPEEEQCFSGSGISDDNKLVLMYTGHVLTEEEPFYKQSQYLALSEDGVNFEKSKSNPIIESPPNNSPDFRDPKIWKHGNEFYAVIGSKDNGNGTVLLYKSSNLTTWEYLSVLGQSTGELGYMWECPDFFELDGKFILLMSPQGVKANGDRYKNVYQTGYMVGTFDYNTSRFETDGVFQELDYGHDFYAATTTEAKGSRYLSAWFSMWEVTHPEDVDGWIGTTTLVRELRFVNNRLIMKPVAEIVDLRESLALQGDFKANQVQEFGKAVEILIEGNLKQNIDLLLDGPNGGGQVQLKWDAKNGTVSVIREEEVRRVVWAPIDSQIWRIFLDTSSLELFCGEGEVVFSSRVYPLAFTCILALAYSKSVDNNYENEKEDLENYIQNKKNEINHRFRPLYHVAPPVGWMNDPNGFSYHNGEFHLFYQFYPYKSEWGPMHWGHVISSDLVHWKQMPTALLPGTEQCFSGSAISQGDVLTLIYTGRRSIDEQPYFNESQYLAFSDDAVNFYKYEGNPVIPNAPNNAPDFRDPKVWKYGDEYYVIIGSKTSDERGRVLLYKSKDMFDWEFLTVLGESNGSLGYMWECPDFFELDGKFILLMSPQGVSPQGDRYKNSHQTGYIVGSFDYDTFQFIPEVEFQEIDYGHDFYAATTTQANGKRYLLAWFSMWDVPYPEDVDGWAGMMTITRELNLVNNRILMKPVSDMLNLRNEVALKDEVKPGQVHQFGKAVEIIIESDLYNKIDLLLDGQEGGGKVWIRWDPDIGKVVVDRGSGDIRQVEWKPIGSTTWRIFLDSCSLELFCGEGEVVLSSRVYPLGGWRLKNQSPQTIRVEAYNLQRSVPE-
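
The protein record can be cross-checked against the nucleotide record by translating the transcript. The sketch below assumes the standard genetic code and reading frame 1, and it writes a stop codon as "-" to match the trailing "-" in the protein sequence above. The `translate` function is a hypothetical helper, not part of the database.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(nucleotides: str) -> str:
    """Translate a DNA string in frame 1; stops become '-', unknown codons 'X'."""
    seq = nucleotides.upper().replace("\n", "")
    residues = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        residues.append("-" if aa == "*" else aa)
    return "".join(residues)


# First 18 and last 12 nucleotides of DPOGS209083-TA, copied from the record.
print(translate("ATGTCTCTGATACAAAAG"))
print(translate("GTACCTGAATAG"))
```

Passing the full DPOGS209083-TA sequence to `translate` and comparing the result with DPOGS209083-PA would extend this check to the whole record.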