Monarch geneset OGS2.0

DPOGS215551
Transcript         DPOGS215551-TA   1509 bp
Protein            DPOGS215551-PA   502 aa
Genomic position   DPSCF300129 + 332266-334504
RNAseq coverage    24x (Rank: top 78%)
Annotation
Heliconius         HMEL011617               5e-123   46.91%
Bombyx             BGIBMGA002293-TA         4e-105   43.25%
Drosophila         CG7985-PA                4e-70    33.20%
EBI UniRef50       UniRef50_UPI0002060CB1   5e-75    40.00%   UPI0002060CB1 related cluster n=1 Tax=unknown RepID=UPI0002060CB1
NCBI RefSeq        XP_001649003.1           4e-77    34.13%   hypothetical protein AaeL_AAEL004383 [Aedes aegypti]
NCBI nr blastp     gi|157105734             8e-76    34.13%   hypothetical protein AaeL_AAEL004383 [Aedes aegypti]
NCBI nr blastx     gi|157105734             7e-74    33.99%   hypothetical protein AaeL_AAEL004383 [Aedes aegypti]
Group
Gene Ontology      GO:0043169   1e-21     cation binding
                   GO:0005975   1e-21     carbohydrate metabolic process
                   GO:0003824   1e-21     catalytic activity
                   GO:0004553   1.4e-10   hydrolase activity, hydrolyzing O-glycosyl compounds
KEGG pathway       tca:661188   6e-59
                   K04678 (SMURF) maps-> Ubiquitin mediated proteolysis
                                         Endocytosis
                                         TGF-beta signaling pathway
InterPro domain    [16-330]   IPR017853   9.7e-33   Glycoside hydrolase, superfamily
                   [21-240]   IPR013781   1e-21     Glycoside hydrolase, subgroup, catalytic core
                   [73-234]   IPR015883   1.4e-10   Glycoside hydrolase, family 20, catalytic core
Orthology group    MCL12442   Single-copy universal gene

Nucleotide sequence:

>DPOGS215551-TA
ATGGAAGTTGCAGTAACAGAAAGTAATAAACATCCTACGCTTAAACTGAAAAATGTGATTTTACATTTAGATTTTAAAGGTTCTCCGCCTAAATTAAGTTATTTAAAGACTCTGCTTCCTAAACTACAGAGCTTGGGTGTCACCGGACTGCTAATGGAATACGAAGACATGTTTCCTTATGAAGGAAAATTGGTTAATTTAAGTGCTGAAAATCATTATGAAATTATTAAGCTTCAGGAGTTCGTCACTATTGTTGTCCGGCTTGGCCTGGATCTCATACCCCTCGTACAAACATTTGGTCACTTGGAACATGCTCTAAAGCTTCGGGAATTTCAACATTTAAGGGAAAACCCACTATATCCCGATTCAATTTGCCCGAGCCAATCAGAGAGTTATGATCTCATAAAGGCTATGCTCGATCAGATCATCAACTTCCATGAAAACATATTTCCACTTAAATATTTGCATATTGGTTCCGACGAAGTCTATCATATTAAGGAATGCAAAAAATGTTTGAGAAGTAAACTTACGGACATGGACATTTACCTAAGTCATGTTGAAGCAATATCACATTATATAAAAATTAGAAGTCCTTTGACGACAGTACTGCTCTGGGATGATATGTTGAGGAAAATTCCTATGAAAAAGTGGAGATATGTAACATTAGGAAAAACTAATATAGAACCAGTGTACTGGGACTATAAACCCTCGATCAAAGTTTCCCACACGAGTTTGATACAATACCACAAAAAGTTTAAGAACATATGGATTGCTTCAGCATTCAAAGGAGCTGATGGTAGAGTTGCAACATTTCCAGACTTAAGGAAAAGATTATTGAATAATTTTAGTTGGTTAAATCTGATATTTGACTACAAATTTGGAGGCGAAAGTGAAATTTACGAATTTAGCGGAATTATACTTACTGGATGGTCCCGATATTCTCATATGGATCCGCCGTGCGAATTATTACCAGTTGCTACACCAAGCCTTTATTTGAATTTATTAATGATAAAAACTTTTAAATACTCAGACTCTAAACCAAAGGATATCTCAATAGCCCTCAATTATATTAACAAAGACTTTTCAACCAATTTACATTGCCAATATGAAATTAATATAGATAACTTTAATTCAATTCATTGTCACTTCGAAGGAAACGAACTATTTAAGCTCTTAATGGATTGTGAGAAGATTATCAATGATATAACAAAAACAATCGCAGATGTTGAAACAGATTTATCAACACTGGAATTGTACTCAAAAAATTATTACAACAATATAAATATGTGGACAAAAAATTTTAAGTGGTGCATAGATTCGATTAACAGTCTTAATGATATCAGAGAAAAGTTGATTGGTAATTTATCGCATTACTATGGTTCATCTTTTGTTACAGAATATGTTGATTATAAGTTATTTAACACAGAGAACATTATAAAAAATATTATGAAGATTTTAAACAATATGTTTAAAGTGAAGAATTGGAAGAGAAGACTAGAACTAGACTAG

Protein sequence:

>DPOGS215551-PA
MEVAVTESNKHPTLKLKNVILHLDFKGSPPKLSYLKTLLPKLQSLGVTGLLMEYEDMFPYEGKLVNLSAENHYEIIKLQEFVTIVVRLGLDLIPLVQTFGHLEHALKLREFQHLRENPLYPDSICPSQSESYDLIKAMLDQIINFHENIFPLKYLHIGSDEVYHIKECKKCLRSKLTDMDIYLSHVEAISHYIKIRSPLTTVLLWDDMLRKIPMKKWRYVTLGKTNIEPVYWDYKPSIKVSHTSLIQYHKKFKNIWIASAFKGADGRVATFPDLRKRLLNNFSWLNLIFDYKFGGESEIYEFSGIILTGWSRYSHMDPPCELLPVATPSLYLNLLMIKTFKYSDSKPKDISIALNYINKDFSTNLHCQYEINIDNFNSIHCHFEGNELFKLLMDCEKIINDITKTIADVETDLSTLELYSKNYYNNINMWTKNFKWCIDSINSLNDIREKLIGNLSHYYGSSFVTEYVDYKLFNTENIIKNIMKILNNMFKVKNWKRRLELD-
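
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, assuming the standard genetic code applies to this nuclear gene and that the trailing "-" in the protein record marks the stop codon. The helper name `translate` and the two short test fragments (the first 24 and last 30 nucleotides of DPOGS215551-TA) are illustrative choices, not part of the database record.

```python
# Standard genetic code, built from the compact TCAG-ordered amino-acid string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon (hypothetical helper).

    Stop codons are written as `stop_symbol` to match the "-" used at the
    end of the protein record.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for pos in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[pos:pos + 3]]
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


# First 24 nt and last 30 nt of DPOGS215551-TA, copied from the record.
first_24 = "ATGGAAGTTGCAGTAACAGAAAGT"
last_30 = "AATTGGAAGAGAAGACTAGAACTAGACTAG"

print(translate(first_24))  # MEVAVTES
print(translate(last_30))   # NWKRRLELD-

# Length check: 1509 bp is 503 codons, i.e. 502 residues plus one stop codon.
print(1509 // 3 - 1)        # 502
```

Passing the full nucleotide sequence to `translate` should, under the same assumptions, reproduce the DPOGS215551-PA record including its terminal "-".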