Monarch geneset OGS2.0

DPOGS203761
Transcript: DPOGS203761-TA, 894 bp
Protein: DPOGS203761-PA, 297 aa
Genomic position: DPSCF300010 + 137764-139133
RNAseq coverage: 222x (Rank: top 45%)
Annotation
Heliconius: HMEL002563, 2e-100, 57.48%
Bombyx: BGIBMGA000695-TA, 8e-73, 46.53%
Drosophila: CG2767-PB, 1e-47, 34.28%
EBI UniRef50: UniRef50_G9F9F6, 8e-96, 56.75%, Seminal fluid protein CSSFP005 n=3 Tax=Obtectomera RepID=G9F9F6_9NEOP
NCBI RefSeq: XP_002426538.1, 2e-77, 50.17%, aldo-keto reductase, putative [Pediculus humanus corporis]
NCBI nr blastp: gi|364023559, 3e-95, 56.75%, seminal fluid protein CSSFP005 [Chilo suppressalis]
NCBI nr blastx: gi|364023559, 5e-92, 56.75%, seminal fluid protein CSSFP005 [Chilo suppressalis]
Group
Gene Ontology: GO:0055114, 9.2e-49, oxidation-reduction process
    GO:0016491, 9.2e-49, oxidoreductase activity
KEGG pathway: tet:TTHERM_00697540, 9e-57
    K00011 (E1.1.1.21, AKR1) maps-> Galactose metabolism
    Glycerolipid metabolism
    Pentose and glucuronate interconversions
    Fructose and mannose metabolism
    Pyruvate metabolism
InterPro domain: [7-294] IPR001395, 9.1e-113, Aldo/keto reductase
    [11-286] IPR023210, 2.2e-95, NADP-dependent oxidoreductase domain
    [42-66] IPR020471, 9.2e-49, Aldo/keto reductase subgroup
Orthology group: MCL18242, Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203761-TA
ATGGCTGACCAAAATTTCGCTGAAGTGAAGTTTACATTAAGAAACGGAATTAAGATGCCGGCTGTTGGTTTAGGAACTTTTCGTATACGTGACCCTGCGGTAGTCTACAGCGCTGTCGACAGCGCCTTGGCAGTCGGCTATCGCTTGTTTGATACAGCTGCAGTATACCAGAACGAACGCTTTTTGGGTGACGCATTACGTGATCTATTACCCAAGTACGGTCTGCAACGTAGTGATATTTACGTCACAACTAAATTGTCGCCGTCCGATCAGAGCGCTGAGCTTGTCCCGAAAGCATTCAGTAAATCATTAGAAAACTTAGGTCTTCAGTATATTGATTTGTATTTAATACATTTTCCTGGTGCGGCTAGAATCAACTCGAGCGATCCTAAAAACGAGGCATTAAGGAACGAGACTTGGAACGCTATCACAAAACTTTACGATACAGGCAAGGTGAAAGCAATTGGAGTATCAAATTTCACTGTGAGACACTTAAGGCAATTACAGAAATCATCGACGATAGAGCCGATGGTTAATCAGGTCGAATGGCATCCGCATTATTACGAAAGTGATCTCTTGGAATATTGTAACACTCACAACATAAGACTTCAAGCATACTGTTCGTTTGGGGGTCAAGCGATCAGAAATAATTCTTTGATGGAGGACCCTGTAGTTAGAGATATTTCTGCAAAACATAATGCAACACCAGCACAAGTGCTGTTAACATGGGCGCTGCAACGGGGCATAGCAGTTATCCCTAAATCCGTAACACCTCAGAGAATCAAAGAAAATATTCAGCTCAGTATGAGAATGTCTCCAGAGGAACTGTCCTTATTAGATTCTCTTAGGAATAACGGCCTTAAATATGCTTGGGATCCCAATCCTATTGCTTGA

Protein sequence:

>DPOGS203761-PA
MADQNFAEVKFTLRNGIKMPAVGLGTFRIRDPAVVYSAVDSALAVGYRLFDTAAVYQNERFLGDALRDLLPKYGLQRSDIYVTTKLSPSDQSAELVPKAFSKSLENLGLQYIDLYLIHFPGAARINSSDPKNEALRNETWNAITKLYDTGKVKAIGVSNFTVRHLRQLQKSSTIEPMVNQVEWHPHYYESDLLEYCNTHNIRLQAYCSFGGQAIRNNSLMEDPVVRDISAKHNATPAQVLLTWALQRGIAVIPKSVTPQRIKENIQLSMRMSPEELSLLDSLRNNGLKYAWDPNPIA-
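
Consistency check (illustrative):

The transcript and protein lengths listed above (894 bp and 297 aa) agree if the transcript is a complete coding sequence ending in a stop codon: 894 / 3 = 298 codons, of which one is the stop. The Python sketch below shows one way to check this and to translate a coding sequence with the standard genetic code. It is a minimal illustration, not part of the OGS2.0 pipeline; the function names `translate` and `expected_protein_length` are hypothetical, and the use of the standard genetic code is an assumption. Stop codons are written as `*` here, whereas the protein record above marks the stop with `-`.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed to apply here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds):
    """Translate a coding sequence codon by codon; stop codons become '*'."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def expected_protein_length(transcript_bp):
    """Residue count for a complete CDS: number of codons minus the stop codon."""
    return transcript_bp // 3 - 1


# First seven codons and last four codons of DPOGS203761-TA, copied from the record.
print(translate("ATGGCTGACCAAAATTTCGCT"))  # MADQNFA
print(translate("CCTATTGCTTGA"))           # PIA*
print(expected_protein_length(894))        # 297
```

Passing the full DPOGS203761-TA sequence to `translate` and comparing the result (with `*` replaced by `-`) against DPOGS203761-PA would extend the check to the whole record.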