Monarch geneset OGS2.0

DPOGS200443
Transcript        DPOGS200443-TA    957 bp
Protein           DPOGS200443-PA    318 aa
Genomic position  DPSCF300236 + 483883-510591
RNAseq coverage   3437x (Rank: top 4%)
Annotation
Heliconius        HMEL011598        3e-63   41.67%
Bombyx            BGIBMGA008902-TA  1e-81   85.98%
Drosophila        CG9265-PA         8e-98   59.57%
EBI UniRef50      UniRef50_Q8SYX3   4e-95   59.22%  RE29926p n=21 Tax=Endopterygota RepID=Q8SYX3_DROME
NCBI RefSeq       XP_001965258.1    2e-103  57.81%  GF24197 [Drosophila ananassae]
NCBI nr blastp    gi|194766291      4e-102  57.81%  GF24197 [Drosophila ananassae]
NCBI nr blastx    gi|158299307      2e-100  60.85%  AGAP010232-PA [Anopheles gambiae str. PEST]
Group
Gene Ontology     GO:0005488  1.2e-54  binding
                  GO:0008152  1.4e-26  metabolic process
                  GO:0016491  1.4e-26  oxidoreductase activity
KEGG pathway      gga:420183  2e-55
                  K11151 (RDH10) maps-> Retinol metabolism
InterPro domain   [25-246]  IPR016040  1.2e-54  NAD(P)-binding domain
                  [27-192]  IPR002198  1.4e-26  Short-chain dehydrogenase/reductase SDR
                  [27-44]   IPR002347  1.8e-21  Glucose/ribitol dehydrogenase
Orthology group   MCL14743  Insect specific

Nucleotide sequence:

>DPOGS200443-TA
ATGCTTGCTATTGGTTATGTTATACAAGCAATATACAGATCCATAAGAGGACATCCGAAGAAGGATCTCAAAGGTTGTATAGCTCTTGTGACCGGGGGAGGGGGTGGGCTGGGAAGTCTCGTTGCTTTGAGGCTAGCCAGGCTTGGATGCGTCGTGGTGCTGTGGGATATCAATAGACAAGGTTTGGAAGATACTGTCAAGCTAGTCAAGGGTATTGGCGGTAAATGCTACGGATATGTCGTCGACTTGGCAAGTCGAGACGACATCTATAACACAGCGAAACAAGTAGAGAAAGAAGTCGGGAAAGTGTCGCTGTTGATCAATAACGCAGGTGTTGTATCAGGACAATACCTTCTGGACACCCCCGACTATCTTATACAGAGGACATTCGATGTTAATATTTTAGCACACTTCTGGACAGTGAAAGCTTTCTTGCCAGCTATGATAGAAGATAATGATGGTCATATAGTGACCATAGCATCAATGGCTGGTCAAGTCGGGGTAGCCAAGTTAGTCGACTATTGCGCCTCTAAATCAGCCGCCTGTGGCTTCGACGAGGCGTTAAGATTAGAATTAGAAGTCAAAGGTGCTAAGGGTGTGAACACATCGCTGATATGCCCCTACTTCATCCGAGCGACGGGAATGTTTGAAGAGGTTAACTCGAGGTTCGTACCTACCCTCAGCCCGAATGAAGTAGCTGATCGCGTGGTTTTGGCGATAAGGACGAACGAGCCAGTGGCCGTAATACCCTCATATTTCAGACTGCTACTGCCATTTAAATGGATCGTGCCATGGGCGTGCATATCAGAATTGATTAGAGGTCTAGTCCCCGATGCTGTCCCAGCGCCGATGCCGTTGCCAGATCCACTTGAGAAGCCTGTACTCGCTACCCCTAGCAAGGCTGATTCACGGCCGATGACCCTCACACCACCCTCCAGACACGATCGTCAAGTGTGA

Protein sequence:

>DPOGS200443-PA
MLAIGYVIQAIYRSIRGHPKKDLKGCIALVTGGGGGLGSLVALRLARLGCVVVLWDINRQGLEDTVKLVKGIGGKCYGYVVDLASRDDIYNTAKQVEKEVGKVSLLINNAGVVSGQYLLDTPDYLIQRTFDVNILAHFWTVKAFLPAMIEDNDGHIVTIASMAGQVGVAKLVDYCASKSAACGFDEALRLELEVKGAKGVNTSLICPYFIRATGMFEEVNSRFVPTLSPNEVADRVVLAIRTNEPVAVIPSYFRLLLPFKWIVPWACISELIRGLVPDAVPAPMPLPDPLEKPVLATPSKADSRPMTLTPPSRHDRQV-
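
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, assuming the standard genetic code applies to this transcript and that the trailing "-" in the protein record marks the stop codon; the function name `translate` and the variable names are illustrative and not part of the database. It translates the first ten and last three codons of DPOGS200443-TA and confirms that 318 residues plus one stop codon account for the 957 bp transcript length.

```python
from itertools import product

# Standard genetic code (an assumption for this record), with codons
# enumerated in T, C, A, G order and "*" marking stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence, writing stop codons as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 30 and last 9 nucleotides of DPOGS200443-TA, copied from the record.
first_codons = "ATGCTTGCTATTGGTTATGTTATACAAGCA"
last_codons = "CAAGTGTGA"

print(translate(first_codons))  # MLAIGYVIQA
print(translate(last_codons))   # QV-
print(3 * (318 + 1))            # 957
```

The two translated fragments match the start (MLAIGYVIQA) and end (QV-) of DPOGS200443-PA. Passing the full nucleotide sequence to the same function would give a complete comparison.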