Monarch geneset OGS2.0

DPOGS212760
Transcript: DPOGS212760-TA (1617 bp)
Protein: DPOGS212760-PA (538 aa)
Genomic position: DPSCF300012 + 788444-792218
RNAseq coverage: 354x (Rank: top 33%)
Annotation
Heliconius      HMEL007346              1e-127  43.26%
Bombyx          BGIBMGA013129-TA        4e-126  48.54%
Drosophila      CG31414-PB              6e-82   33.39%
EBI UniRef50    UniRef50_UPI0000519EB3  6e-103  39.59%  UPI0000519EB3 related cluster n=9 Tax=unknown RepID=UPI0000519EB3
NCBI RefSeq     XP_393207.2             1e-103  39.59%  PREDICTED: similar to glucocerebrosidase precursor isoform 1 [Apis mellifera]
NCBI nr blastp  gi|350416850            1e-104  39.81%  PREDICTED: glucosylceramidase-like [Bombus impatiens]
NCBI nr blastx  gi|350416850            2e-104  40.08%  PREDICTED: glucosylceramidase-like [Bombus impatiens]
Group
Gene Ontology
GO:0006665  6.8e-175  sphingolipid metabolic process
GO:0005764  6.8e-175  lysosome
GO:0004348  6.8e-175  glucosylceramidase activity
GO:0007040  6.8e-175  lysosome organization
GO:0043169  2.2e-119  cation binding
GO:0005975  2.2e-119  carbohydrate metabolic process
GO:0003824  2.2e-119  catalytic activity
KEGG pathway
ame:409708  3e-103
K01201 (E3.2.1.45, GBA, srfJ) maps to:
    Lysosome
    Sphingolipid metabolism
    Other glycan degradation
InterPro domain
[1-530]    IPR001139  6.8e-175  Glycoside hydrolase, family 30
[84-441]   IPR013781  2.2e-119  Glycoside hydrolase, subgroup, catalytic core
[98-448]   IPR017853  2.6e-86   Glycoside hydrolase, superfamily
[442-512]  IPR013780  4.4e-08   Glycosyl hydrolase, family 13, all-beta
Orthology group: MCL10162 (Multiple-copy universal gene)

Nucleotide sequence:

>DPOGS212760-TA
ATGAGAACGCTAATCTTGCTTGGAGGCCTACTTACTGTTGGAGTTTACGCGGATGTATCATGCGCACCGAGATCCCTGAACAACTCAGTGGTGTGTGTCTGTAACGCCACTTACTGTGACACCGTGACGAGGAGGGTGCCTGAAGCTGGCACCTACATAGCTTACACTTCCTCCAAGTCAGGATTACGATTCAGCATCACTCAGGGAGACATTGAAGACGCAGATACATCTTATAGCAGTGCTGATTATGGAAAGGTTTTCGACTTACAACCAAGCAAAGTTTATCAAGCTATTGAAGGATTCGGAGGAGCAGCGACTGACGCAGCGGGAATAAATTGGAAGAAAATGAAGCCGGCTATTCAGGACACTCTAGTGAAGTCCTACTTTAGTGAAGACGGCTTAGAATATAATATAATCAGACTCCCGATTGGTTCTACTGATTTCTCGACACGTTTTTATGCTTATAACCAATATCCGAAAGACGACACAGCTCTCAGCAACTTTACTTTCGCTCCAGAAGATATAAAGTATAAGGTTCCTTTAGTGAAGTCCTGTTTGAGTGCTGCAAATAATGAAGTAAAAATAGTGTCTGCGACATGGTCACCACCAAAATGGATGAAAGTAAAGGAGCCACAAAGCGGTATTAGTTTTATTAAAGAGGAATTTTACCAAGTTTACTCCGATTACCACTGCAAATTTGCAGAGCTGTTTGAAGAAGAGGGTATTCATATTTGGGGAATATCCACTGGAAATGAGCCTTTAGTGAATATGTTCGCTGGTGTGAGAAAAGACGAAACAGCCTGGAATGCACCCAGTTTTGCGAAATTCATAAGGGAATACTTCGGCCCGACCATAAAAAACTGTTCAGTAAAGGACATGAAAATTCTAGCTATAGAAGATCAACGTTACGCTTTGCCCCTTTTCTTTACAAAATTACAATCCGATACTGAAGCTATGTCTTATGTAGACGGAATATCCTTACATTTTTATGGTGACAAGAATACTCCGGCTTCGACAATTCCAAGAGTTCTCAAGGAATTTCCCGACAAATTCGTTTTATACACGGAAGCCTGTAATGGTCCCCAAAGTCCTAAAGATGAGAAAGTCGTTTTGGGTTCCTGGGACAGAGCCAAGACTTATTTTACGAACATACTTGAGAATCTTAATTACAATGTGGTTGGATGGCTCGACTGGAATTTGTTCTTAGACACAGAAGGTGGCCCGACTTGGACAAAAAATTTTGTTGACTCTTCGATAATTGTCGATTACGATAAACAAGAATTCTACAAACAACCTACATACTATGCAATAGGTCATTTTTCAAAATTTGTTCCCAGAGGCTCTCAAAGGATTAAGGTTAAAACTATTTTACCTGTAACCAACTATGGTTTGGATATAATTGACTTTACGACAGTGGAGGCAAGCTTTTTCGACAACGTAGCCTTTATCACTCCTAAAGGCACTATCGTCGTAATCATACACAACGAGGGGGCAGAACAAAACTGTGCAATACAATTAGGTGATTCGCAAGCCACTGTACTCTTAGAAGCTGAATCCATAACTACGGTCGAGATACCATACGACGGTAAATCACTCGGAACACCGTGCAGCCAATGA

Protein sequence:

>DPOGS212760-PA
MRTLILLGGLLTVGVYADVSCAPRSLNNSVVCVCNATYCDTVTRRVPEAGTYIAYTSSKSGLRFSITQGDIEDADTSYSSADYGKVFDLQPSKVYQAIEGFGGAATDAAGINWKKMKPAIQDTLVKSYFSEDGLEYNIIRLPIGSTDFSTRFYAYNQYPKDDTALSNFTFAPEDIKYKVPLVKSCLSAANNEVKIVSATWSPPKWMKVKEPQSGISFIKEEFYQVYSDYHCKFAELFEEEGIHIWGISTGNEPLVNMFAGVRKDETAWNAPSFAKFIREYFGPTIKNCSVKDMKILAIEDQRYALPLFFTKLQSDTEAMSYVDGISLHFYGDKNTPASTIPRVLKEFPDKFVLYTEACNGPQSPKDEKVVLGSWDRAKTYFTNILENLNYNVVGWLDWNLFLDTEGGPTWTKNFVDSSIIVDYDKQEFYKQPTYYAIGHFSKFVPRGSQRIKVKTILPVTNYGLDIIDFTTVEASFFDNVAFITPKGTIVVIIHNEGAEQNCAIQLGDSQATVLLEAESITTVEIPYDGKSLGTPCSQ-
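
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, assuming that the transcript is coding sequence only (no UTRs), that it is translated with the standard genetic code, and that the trailing "-" in the protein record marks the stop codon. The function names translate and lengths_consistent are illustrative and are not part of the geneset or its tooling.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), written in the usual
# compact form: codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol ("-" is assumed here to match
    the protein record). Ambiguous bases such as N raise KeyError.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def lengths_consistent(transcript_bp, protein_aa):
    """True if the transcript is exactly protein_aa codons plus one stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)


# First four and last three codons of DPOGS212760-TA, copied from the record.
print(translate("ATGAGAACGCTA"))      # MRTL
print(translate("AGCCAATGA"))         # SQ-
print(lengths_consistent(1617, 538))  # True
```

For this record, 3 x (538 + 1) = 1617, so the stated transcript and protein lengths agree under the assumptions above. Translating the full DPOGS212760-TA sequence and comparing it with DPOGS212760-PA would extend the same check to every residue.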