Monarch geneset OGS2.0

DPOGS200695
Transcript: DPOGS200695-TA, 1653 bp
Protein: DPOGS200695-PA, 550 aa
Genomic position: DPSCF300571 + 9569-18492
RNAseq coverage: 23x (Rank: top 78%)
Annotation
Heliconius       HMEL014472        7e-129  49.37%
Bombyx           BGIBMGA014144-TA  1e-127  48.74%
Drosophila       CG9701-PA         1e-88   38.40%
EBI UniRef50     UniRef50_Q16ET6   7e-94   39.28%  Glycoside hydrolases n=9 Tax=Neoptera RepID=Q16ET6_AEDAE
NCBI RefSeq      XP_557100.2       1e-101  41.86%  AGAP006426-PA [Anopheles gambiae str. PEST]
NCBI nr blastp   gi|118788042      2e-100  41.86%  AGAP006426-PA [Anopheles gambiae str. PEST]
NCBI nr blastx   gi|291231358      2e-86   39.20%  PREDICTED: cytosolic beta-glucosidase-like [Saccoglossus kowalevskii]
Group
Gene Ontology:
  GO:0004553  2.7e-159  hydrolase activity, hydrolyzing O-glycosyl compounds
  GO:0005975  2.7e-159  carbohydrate metabolic process
  GO:0043169  3.3e-73   cation binding
  GO:0003824  3.3e-73   catalytic activity
KEGG pathway:
  cfa:483898  3e-83
  K01229 (LCT) maps-> Galactose metabolism
InterPro domain:
  [120-535]  IPR001360  2.7e-159  Glycoside hydrolase, family 1
  [130-542]  IPR017853  5.5e-120  Glycoside hydrolase, superfamily
  [284-537]  IPR013781  3.3e-73   Glycoside hydrolase, subgroup, catalytic core
Orthology group: MCL16206 Patchy
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200695-TA
ATGTTGAAGACGGCGTCCAGGCCCATCACGCCGGTGCCCCCTACACCCAGCGCTGCCTGCGACGAGGCCAGGCGTGCACATAGCACGCTCTCTGCCAAGTCGATGGCCTTCGCGAGGATCAGATTCCTGCACGAGCATGCACTGTTAGGAGTTGGGTTTTGTAGAAGATTTCCACCCGGGTTCAAATTTGGTGCAGCCACAGCTGCTTACCAGGTCGAGGGCGCCTGGAACGTCAGCGACAAATCCGCAAGTATCTGGGACACGTTCGTGCACACTAGACCAGAGATTATAGCAGATAGATCCAACGGGGACGTCGCCTGTGACAGCTACAACCAATGGATGAGAGACGTGGAAATAGCTTCGGAGTTGGGATTAGATTTCTACAGAAGATTTCCACCCGGGTTCAAATTTGGTGCAGCCACAGCTGCTTACCAGGTCGAGGGCGCCTGGAACGTCAGCGACAAATCCGCAAGTATCTGGGACACGTTCGTGCACACTAGACCAGAGATTATAGCAGATAGATCCAACGGGGACGTCGCCTGTGACAGCTACAACCAATGGATGAATGACGTGGAAATAGCTTCGGAGTTGGGATTAGATTTCTACAGATTTTCTCTCTCCTGGCCAAGAATTTTGCCATATGGTTTTGCAAATAAGATAAGTGAAGACGGAGTAAAATTTTACACAAATCTCATCGATGCTTTATTGGAGAGAGGAATTGAGCCTGTCGTAACAATTTATCACTGGGATTTGCCACAAAATTTACAAGATCTTGAATTCATCTCTCATGATCCGTCTATAGCGGCTTATACAGATTCCGTCAAAGTCAAAAATAAGTTGTTCATCATGTACAGAACTAAACGAGCAGGCAAGGTGTCCCTCACCAACCAAATCATGTGGTTTGAGGGAGCTGATGAAAATGACGGAGAAGCGGCTGAACTGGCTCTACAGTTAATGGGAGGAATGTACTCACACCCAATCTTCTCTAAGAAAGGCGGTTGGCCTAAGAAAGTAGAAAATCTAATAGCAGAAAAGAGTAAAAAAGAGGGTTACCCACAATCCAGATTGCCAGAATTTACAAAGGAAGAAAAAGAATTAATAAAAGGAACATATGACTTCTTCGGCTTGAACTACTATACGTCACGAATTGCTCGCCGTGCCCGAGGAGAAGTTGTTGGTCCTTGGCCTCTCAAAGGTGGACCAGACATTGATGTAAAAACATCAGTGCGTCCAGAATGGCCGCAGGCTGGCACCAGCTGGTTCTATGTACACCCGCAAGGTTTACGGAAACTAATTTCTTGGGTGAAAGAACAGTATGGGGACATAGAAATCTTCATAGCAGAGAACGGCTTTGCCACCCATGGCCAGGATTTAGACGATCAAGTCCGCGTGGATTACTATAAGAGCCATTTAGAACAGGTTCACCTCGCAATTGAAGAAGATAAGGCCAATGTCGTAGCATACACAGCTTGGACGATGATAGACAACTTTGAATGGAGCGATGGCTATCGTTCCAAATTCGGTTTGTACGAGGTGGACTTCAGCGACCCAGCCCGCGCCCGGCGCCCGAGAGCCTCCGCACACTACTACAAAGAGATTGTGAAAGCGAAATCATTAGATGTAGATAGTCATGTATTAAATGATGAATTATAG

Protein sequence:

>DPOGS200695-PA
MLKTASRPITPVPPTPSAACDEARRAHSTLSAKSMAFARIRFLHEHALLGVGFCRRFPPGFKFGAATAAYQVEGAWNVSDKSASIWDTFVHTRPEIIADRSNGDVACDSYNQWMRDVEIASELGLDFYRRFPPGFKFGAATAAYQVEGAWNVSDKSASIWDTFVHTRPEIIADRSNGDVACDSYNQWMNDVEIASELGLDFYRFSLSWPRILPYGFANKISEDGVKFYTNLIDALLERGIEPVVTIYHWDLPQNLQDLEFISHDPSIAAYTDSVKVKNKLFIMYRTKRAGKVSLTNQIMWFEGADENDGEAAELALQLMGGMYSHPIFSKKGGWPKKVENLIAEKSKKEGYPQSRLPEFTKEEKELIKGTYDFFGLNYYTSRIARRARGEVVGPWPLKGGPDIDVKTSVRPEWPQAGTSWFYVHPQGLRKLISWVKEQYGDIEIFIAENGFATHGQDLDDQVRVDYYKSHLEQVHLAIEEDKANVVAYTAWTMIDNFEWSDGYRSKFGLYEVDFSDPARARRPRASAHYYKEIVKAKSLDVDSHVLNDEL-
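
Consistency check (illustrative):

The transcript and protein lengths listed above fit a coding sequence that runs from the start codon to the stop codon: 3 x (550 + 1) = 1653. The sketch below shows one way to check that relationship and to translate the first codons of DPOGS200695-TA. It assumes the standard genetic code (NCBI translation table 1) and that the transcript contains coding sequence only. The function names `translate` and `lengths_consistent` are hypothetical helpers, not part of any database tool. The stop codon is written as `-` to match the protein record above.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered
# TTT, TTC, TTA, TTG, TCT, ... with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 1 (hypothetical helper).

    Unknown codons (for example those containing N) become 'X'.
    Stop codons are written as stop_symbol; translation does not halt there.
    A trailing partial codon is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    protein = []
    for i in range(0, usable, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


def lengths_consistent(transcript_bp, protein_aa):
    """True if the transcript length equals 3 x (protein length + 1 stop codon).

    Assumes the transcript is coding sequence only, with no UTRs.
    """
    return transcript_bp == 3 * (protein_aa + 1)


# First 24 nt of DPOGS200695-TA, and its last 12 nt
print(translate("ATGTTGAAGACGGCGTCCAGGCCC"))  # MLKTASRP
print(translate("GATGAATTATAG"))              # DEL-
print(lengths_consistent(1653, 550))          # True
```

Both translated fragments match the start (MLKTASRP) and end (DEL-) of DPOGS200695-PA as listed above.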