Monarch geneset OGS2.0

DPOGS207424
Transcript         DPOGS207424-TA   888 bp
Protein            DPOGS207424-PA   295 aa
Genomic position   DPSCF300087 + 337597-348024
RNAseq coverage    1264x (Rank: top 10%)
Annotation
Heliconius       HMEL002077         2e-83   89.66%
Bombyx           BGIBMGA009323-TA   9e-96   78.48%
Drosophila       CG4747-PB          2e-41   33.92%
EBI UniRef50     UniRef50_E2A676    1e-66   53.82%   Nuclear protein NP60-like protein n=3 Tax=Formicidae RepID=E2A676_CAMFO
NCBI RefSeq      XP_623061.1        7e-66   52.38%   PREDICTED: similar to CG4747-PA, partial [Apis mellifera]
NCBI nr blastp   gi|307184734       5e-66   53.82%   Nuclear protein NP60-like protein [Camponotus floridanus]
NCBI nr blastx   gi|307184734       3e-66   52.00%   Nuclear protein NP60-like protein [Camponotus floridanus]
Group
Gene Ontology    GO:0055114   3e-61     oxidation-reduction process
                 GO:0016491   3e-61     oxidoreductase activity
                 GO:0005488   3.7e-10   binding
                 GO:0006098   4e-09     pentose-phosphate shunt
                 GO:0004616   4e-09     phosphogluconate dehydrogenase (decarboxylating) activity
KEGG pathway     aag:AaeL_AAEL006684   7e-57
                 K00020 (E1.1.1.31, mmsB) maps-> Valine, leucine and isoleucine degradation
InterPro domain  [11-271]    IPR015815   3e-61     3-hydroxyacid dehydrogenase/reductase
                 [5-73]      IPR000313   9.5e-18   PWWP
                 [232-269]   IPR016040   3.7e-10   NAD(P)-binding domain
                 [234-268]   IPR006115   4e-09     6-phosphogluconate dehydrogenase, NADP-binding
Orthology group  MCL18349   Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS207424-TA
ATGTCTGTTGCTTATAAACTTGGTGATTTAGTTTGGGCCAAAATGAAGGGGTTCAGCCCTTGGCCCGGCCGTGTGGCCATTCCTACTCCAGAACTAAAGCCTCCAAAGAAAGCTATGAATGTACAATGCATATACTTCTTTGGTACAAATAACTACGCGTGGATCGAGGAGCACAACATAAAGCCCTACCAGGAGCACAAAGAACAGTTGATAAAGTCGTGTAAAACGACCGCCTTCAAGGAGGCCGTGGCGCAGATTGAAGAGTACATCGAGAACCCTGAGAAATGGGGGGATTTCGAAGAGCACGTCCGTAACCCCGAAGAGGAGTTTGACAAACTAAAAGATGTCATGAAAGAAGAGAGCTCCGATGAGAAACAACAGGAAGAGGAACCCTCGCCCGAGGTGAAGAAGAAGAGCTCCACACCAAAGATACGTTATGGATTTTCAAAGGTGAAGCCAGTGAAGCGGTCGTCGTCGGCCGGCCTGAAGCCCACCCCCCGCAAGAAGGCGAAGAGCGTCTCCCGCTCCTTCACGGACACGGACATCGCCAACGAGAAGCCCTCCATGCTGAACCACTCCTTCTCCAAGAAGTCCAGCCTGCTGCACCGCCCCTCCAATATATTGAGACCCGACACGCCGCCGTTAGACCTGGAGAACGTGTCGGAAACCCTGCTGGAGAAGAACATCAAGGCCAGTCAGATGAAGTTCGGTTTCCTCGGCCTGGGGATCATGGGCAGCGGCATCGTTAAGAACCTGCTGAACTCCGGACACAAGGTGCTGGTGTGGAACAGGACGGCGGCCAAGGGGCGGCTCCCGGCTGAGAGCACACCACCGCTCATGATGGCTGGTCGTGGTTTGAAATTCAACCTTATACAACCTTTACTATAG

Protein sequence:

>DPOGS207424-PA
MSVAYKLGDLVWAKMKGFSPWPGRVAIPTPELKPPKKAMNVQCIYFFGTNNYAWIEEHNIKPYQEHKEQLIKSCKTTAFKEAVAQIEEYIENPEKWGDFEEHVRNPEEEFDKLKDVMKEESSDEKQQEEEPSPEVKKKSSTPKIRYGFSKVKPVKRSSSAGLKPTPRKKAKSVSRSFTDTDIANEKPSMLNHSFSKKSSLLHRPSNILRPDTPPLDLENVSETLLEKNIKASQMKFGFLGLGIMGSGIVKNLLNSGHKVLVWNRTAAKGRLPAESTPPLMMAGRGLKFNLIQPLL-
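
The lengths and coordinates reported in this record can be cross-checked with a little arithmetic. The sketch below is illustrative only: the helper names `parse_position` and `expected_transcript_length` are hypothetical, the genomic coordinates are assumed to be 1-based and inclusive, and the trailing "-" in the protein sequence is assumed to mark the stop codon.

```python
import re


def parse_position(text):
    """Split 'SCAFFOLD STRAND START-END' into its four parts.

    Assumption: coordinates are 1-based and inclusive.
    """
    match = re.fullmatch(r"(\S+)\s+([+-])\s+(\d+)-(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised position string: {text!r}")
    scaffold, strand, start, end = match.groups()
    return scaffold, strand, int(start), int(end)


def expected_transcript_length(protein_length, has_stop=True):
    """Coding length in bp for a protein of the given length.

    Assumption: the transcript is coding sequence only, with one
    3-bp stop codon when has_stop is True.
    """
    return 3 * (protein_length + (1 if has_stop else 0))


# Values copied from the record above.
scaffold, strand, start, end = parse_position("DPSCF300087 + 337597-348024")
genomic_span = end - start + 1

print(scaffold, strand, genomic_span)    # DPSCF300087 + 10428
print(expected_transcript_length(295))   # 888
```

Under these assumptions the 295 aa protein plus a stop codon accounts for the full 888 bp transcript, while the genomic interval spans 10428 bp, so most of the locus lies outside the coding sequence.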