Monarch geneset OGS2.0

DPOGS201064
Transcript         DPOGS201064-TA   1869 bp
Protein            DPOGS201064-PA   622 aa
Genomic position   DPSCF300185 - 371156-374869
RNAseq coverage    103x (Rank: top 60%)
Annotation
Heliconius       HMEL009847         0.0      79.70%
Bombyx           BGIBMGA007153-TA   0.0      74.47%
Drosophila       tobi-PA            7e-165   51.80%
EBI UniRef50     UniRef50_Q19P00    0.0      78.31%   Glycosyl hydrolase family 31 protein (Fragment) n=2 Tax=Obtectomera RepID=Q19P00_BOMMO
NCBI RefSeq      XP_002073831.1     9e-172   53.48%   GK14321 [Drosophila willistoni]
NCBI nr blastp   gi|103058158       0.0      78.31%   glycosyl hydrolase family 31 protein [Bombyx mori]
NCBI nr blastx   gi|103058158       0.0      78.31%   glycosyl hydrolase family 31 protein [Bombyx mori]
Group
Gene Ontology     GO:0004553   4.9e-185   hydrolase activity, hydrolyzing O-glycosyl compounds
                  GO:0005975   4.9e-185   carbohydrate metabolic process
                  GO:0008152   1.3e-19    metabolic process
                  GO:0003824   1.3e-19    catalytic activity
KEGG pathway      dme:Dmel_CG11909   6e-163
                  K01187 (E3.2.1.20, malZ) maps-> Starch and sucrose metabolism
                                                   Galactose metabolism
InterPro domain   [43-616]    IPR000322   4.9e-185   Glycoside hydrolase, family 31
                  [215-555]   IPR017853   2.1e-75    Glycoside hydrolase, superfamily
                  [341-511]   IPR013785   1.3e-19    Aldolase-type TIM barrel
Orthology group   MCL10426   Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201064-TA
ATGAAGTGGTTTGTGCTTATAACAGTTCTGGCGGTGGCGCTGGGAGCCGTGCCACGAAAACCAACAGCCCGTGACTTCTTCTTAGAAGAAGCGGATAACGATGCTTTTACCCTGATTGTTTTAACAGAGGGCAGGCGGGTTGTGCTTGGTGAAATCGGAAGAAAAGTTTCCTTAAACCATGACGACATTAGTTTTGAGATGTACGAAGAGCGAGATGAGGAGAGTGGCGGCTACCATGTAACGATCTCGTGGGAGGGGCCGAGCTCGGTCGTCTTCGAGGATTGTTTAGATTTTGGTGACAGACAATGGTACGGAGGCCCAGAGCAGAAAGAACAGTACTGGCCCATTCAAAAATCTAAACTCGAGAAGTACTCCATCATATCCAAAGAAGCAGATAACGCAGCTGTGTCTGAAAGGTACTGGGTGAACTCCGCCGGCTGGTACGTCTACATTCAACCAGAGGTACCGCTTTTTGTTGATCATCACAACATCCTCGATAACCACATTTGTTTTGTGGCCGAGGTTGCCGATCCCTACTCCAGCAAACGCCCGAGGAACGTCCTGAAATATGACCTATGGTTCTTCGATACTCCTAAAGATGCTCACATGCACGCTGTACATACCTATTTAGGAAAACCATCAGGAGTTCCCGATTACAGAATGATCCAATACCCAGTTTGGTCGACGTGGGCGAGGTACTCTAGGGAGATTGACCAAGAAAATCTATGGACTTTCGCAAACGAAATTAAGGACAGTGGTTTCCCCAACGCACAATTCGAAATCGATGATCTATGGGAAGTTTGTTACGGTTCTTTGACGGTCGATGAAAGGAAATTGCCTGATTTCAAACAGCTTATACAAGACATAAAAGCTCTAGACTTCAGGGTGACCATATGGGTACATCCGTTTATCAATAAAGATTGTGAACCATGGTATTCAGAGGCATTAGGAAAAGGCTATCTAGTCCTCAACGAGGAAGGCAGTCCTGACTCGAGCTGGTGGAACAACAACGGCTCCGTTCCTGGATACATCGACTTCACCAACCCTGACGCTGCAGAGTGGTACAGCTCCAGGATCCGGAATCTTATTGAAACATACGACATCGACAGCTTGAAATTTGATGCCGGAGAGTCGAGCTGGTCGCCTCAGATTCCAGTACAAAATGGGGACATAGAACTCCATCCAGGTCACATCGTTCAATCTTACGTGAGGACAGTCGCCCAGTTCGGACCCATGATTGAGATACGATCTGGGATGAGAACTCAAGATCTGCCAGTGTTCATTCGTATGGTGGACAAGGATACCCTATGGGACTTCAACAACGGCCTGGCGACTCTGGTCACCACTCTCCTACAGATGAACATGAACGGCTACGGCCTGGTGCTGCCCGACATGATCGGGGGCAACGGGTACAACGAGAAGCCCAGCAAGGAGCTGTTCGTGAGGTGGCTTCAAGCCAACGTATTCATGCCAACGCTGCAATACTCATTCGTCCCTTGGGACCATGATGAAGAAGCGGTCGAGATCTGTCGTCGCTACACCCAGCTGCACGCGGAGTACTCCCCACTGATTCTGGAGGCGATGGAAGCGGCCGTAGAGCGCGGGGAACCGGTCAACGCACCAATCTGGTGGCTCGACCCTCAGGACAAGGACGCCCTGGAGATATGGGATGAATTCCTACTCGGTGAAAGTGTTTTGTCGGCCCCTGTGTTAGAAGAGGGGGCGGTGTCCCGAGACATCTACCTGCCCAAGGGTCTCTGGAGGGACGGTAACAGTGGTGAGATGATCACCGGCCCCCAGTGGCTGCGGGATTACCCCGCACCGCTAGACGTGTTGCCATACTTCGTCTTGGAAGAGAAACACGTCTAA

Protein sequence:

>DPOGS201064-PA
MKWFVLITVLAVALGAVPRKPTARDFFLEEADNDAFTLIVLTEGRRVVLGEIGRKVSLNHDDISFEMYEERDEESGGYHVTISWEGPSSVVFEDCLDFGDRQWYGGPEQKEQYWPIQKSKLEKYSIISKEADNAAVSERYWVNSAGWYVYIQPEVPLFVDHHNILDNHICFVAEVADPYSSKRPRNVLKYDLWFFDTPKDAHMHAVHTYLGKPSGVPDYRMIQYPVWSTWARYSREIDQENLWTFANEIKDSGFPNAQFEIDDLWEVCYGSLTVDERKLPDFKQLIQDIKALDFRVTIWVHPFINKDCEPWYSEALGKGYLVLNEEGSPDSSWWNNNGSVPGYIDFTNPDAAEWYSSRIRNLIETYDIDSLKFDAGESSWSPQIPVQNGDIELHPGHIVQSYVRTVAQFGPMIEIRSGMRTQDLPVFIRMVDKDTLWDFNNGLATLVTTLLQMNMNGYGLVLPDMIGGNGYNEKPSKELFVRWLQANVFMPTLQYSFVPWDHDEEAVEICRRYTQLHAEYSPLILEAMEAAVERGEPVNAPIWWLDPQDKDALEIWDEFLLGESVLSAPVLEEGAVSRDIYLPKGLWRDGNSGEMITGPQWLRDYPAPLDVLPYFVLEEKHV-
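
The transcript and protein lengths in this record are consistent with a complete coding sequence: 622 codons plus one stop codon give 3 x 623 = 1869 bp. The sketch below illustrates that relationship on the first and last five codons of DPOGS201064-TA. It assumes the standard genetic code and that the trailing "-" in the protein record marks the stop codon. The names `translate`, `CODON_TABLE`, `head` and `tail` are illustrative only and do not come from the database.

```python
from itertools import product

# Standard genetic code (assumed), codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds, stop_char="-"):
    """Translate an in-frame coding sequence; stop codons become stop_char."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))
    return "".join(CODON_TABLE[c].replace("*", stop_char) for c in codons)


# First and last 15 nucleotides of DPOGS201064-TA, copied from the record.
head = translate("ATGAAGTGGTTTGTG")
tail = translate("GAGAAACACGTCTAA")

# 622 amino acids plus one stop codon, three nucleotides each.
expected_bp = 3 * (622 + 1)

print(head, tail, expected_bp)
```

Passing the full nucleotide sequence to the same function should reproduce the protein record, provided the assumptions above hold.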