Monarch geneset OGS2.0

DPOGS203683
Transcript: DPOGS203683-TA  1617 bp
Protein: DPOGS203683-PA  538 aa
Genomic position: DPSCF300010 - 2135620-2142951
RNAseq coverage: 387x (Rank: top 31%)
Annotation
Heliconius: HMEL013326  0.0  89.81%
Bombyx: BGIBMGA003479-TA  4e-180  87.08%
Drosophila: grh-PJ  8e-130  59.64%
EBI UniRef50: UniRef50_A7UTR5  3e-134  65.77%  AGAP005564-PA n=7 Tax=Endopterygota RepID=A7UTR5_ANOGA
NCBI RefSeq: XP_001688681.1  5e-135  65.77%  AGAP005564-PA [Anopheles gambiae str. PEST]
NCBI nr blastp: gi|158294389  1e-133  65.77%  AGAP005564-PA [Anopheles gambiae str. PEST]
NCBI nr blastx: gi|270014556  2e-129  66.84%  hypothetical protein TcasGA2_TC004589 [Tribolium castaneum]
Group
KEGG pathway 
InterPro domain: [187-289]  IPR007604  3.7e-40  CP2 transcription factor
Orthology group: MCL12006  Multiple-copy universal gene

Nucleotide sequence:

>DPOGS203683-TA
ATGGAGTCGCTTTTAGCACGTCCTCAGCCAGAGATAATGTATATAAAGGAATTGTACAGGATCAGGGCGCGGTCGGCCGTCGATGAACGAACTTCTCACCGATCCATCAATCAGAGACGACGATCGGAGAGCGTTGTTATGCTTATGTTCCGGGAAGAAAAATCACCGGAAGATGAAATTAAGGCATGGCAGTTCTGGCACGGTAGACAACATTCAGTCAAGCAAAGAATATTAGACGCCGTCGTGCTCGAGGCGACACACAAACATGCATCGTTGCGTGCAGCGCAGGCGGGCGGCGCTGCGCTAGAGCACGCAGCTCCTAATTACTACGCTGCGAGGCAGATAGCCGCGGAGACGCAACTGCGCTCCGCTACCCACCATTATATGTCGGCGATATCGGACACCTGTGCGCGGGCGTGTGAAAGCGCAAAGGGACTTGCGGTGAGAGGCTGGGGGGAGGGAGGGGAGTGCTCCTGGGCCACAGCTCGCAACGCACCTCGCTCCTATACAATAGCTTCTCCGATTACACACCATGTCTTCTTGTATCGCGTGTCCACTGACACTAAGAACAGCATCGGCCTCGCTGGTTGCATAGAAGAGGTTGCTCACAACGCCATCGCCGTCTACTGGAATCCACTGGAAAGTGCAGCTAAGATTAATATCGCTGTCCAATGTCTGAGCACAGATTTCAGCAGTCAAAAAGGAGTTAAGGGTTTACCACTTCATATACAAATTGACACCTTCGAAGATCCGCGGGACACCCAAGTCTATCACAGGGGTTACTGTCAAATTAAAGTTTTTTGTGATAAGGGGGCTGAAAGAAAAACAAGAGACGAAGAAAGAAGAGCCGCGAAGAGGAAAATGTCAGCAACGAATCGGAAAAAGTTGGATGAGATTTACCACCCTGTGACGGAAAGAAGCGAGTTCTACTCGATGGCTGATTTGTCAAAACCACCCGTTTTATTTAGTCCTGCAGAAGATATTGATAAATTAGCGGGAATGGACATACAGGGATTCTATGGGCACGACGAAGGTGCTCTCGCTGAAGCTCATCTCAAGGGCGCTTCCCCATTCCTCTTGCACGCAGCAAAACCTCAAGCACCAGCACTGAAATTCCACAATCACTTCCCACCTGATGCACCTGCATACAGATCTGATGTAGGTGGTCTGTCTCCGTACAGTGACCGTAAAGATTCTTTGGAATTGGAAGGTGTTTTAGGCAAGCGTGCTCGCACCTCCACCCCTCCTCTTAGCGAACGCGTGATGCTGTACGTGCGCCAAGACACCGATGACGTTTACACACCACTTCACGTCGTGCCACCGACCACGCAAGGCCTCCTGCATGCGACCGGTTACGAGCCATCGTTCGCTCCGTTAGACGAGACGTCTCGGTTTAAAGAGAAAAGGCCCATTATTGAGAATAAGTACAAAATTTCAAGTTCGGCAATTAATAATCTTTATCGAAAGAACAAGAAAGGGATTACAGCCAAAATTGACGATGAAATGCTCGCGTATTACTGCAATGAGGATCTGTTCTTACTGGAGGTGCGACCGGCCGGCGGAGACGACGAGCCGCTATACGACATCACCTTCGTGGAACTGCCCCTTGATCACTAA

Protein sequence:

>DPOGS203683-PA
MESLLARPQPEIMYIKELYRIRARSAVDERTSHRSINQRRRSESVVMLMFREEKSPEDEIKAWQFWHGRQHSVKQRILDAVVLEATHKHASLRAAQAGGAALEHAAPNYYAARQIAAETQLRSATHHYMSAISDTCARACESAKGLAVRGWGEGGECSWATARNAPRSYTIASPITHHVFLYRVSTDTKNSIGLAGCIEEVAHNAIAVYWNPLESAAKINIAVQCLSTDFSSQKGVKGLPLHIQIDTFEDPRDTQVYHRGYCQIKVFCDKGAERKTRDEERRAAKRKMSATNRKKLDEIYHPVTERSEFYSMADLSKPPVLFSPAEDIDKLAGMDIQGFYGHDEGALAEAHLKGASPFLLHAAKPQAPALKFHNHFPPDAPAYRSDVGGLSPYSDRKDSLELEGVLGKRARTSTPPLSERVMLYVRQDTDDVYTPLHVVPPTTQGLLHATGYEPSFAPLDETSRFKEKRPIIENKYKISSSAINNLYRKNKKGITAKIDDEMLAYYCNEDLFLLEVRPAGGDDEPLYDITFVELPLDH-
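
The transcript and protein lengths quoted in this record are consistent with each other: 1617 bp is 539 codons, which is 538 residues plus one stop codon. The relationship between the two sequences can be checked with a minimal sketch like the one below. It assumes the standard nuclear genetic code and follows this record's convention of writing the stop codon as "-". The names `translate`, `CODON_TABLE`, and `stop_symbol` are illustrative and are not part of any database tooling.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop_symbol` (assumed here to be "-",
    matching the trailing character of the protein sequence above).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 13 codons of DPOGS203683-TA.
print(translate("ATGGAGTCGCTTTTAGCACGTCCTCAGCCAGAGATAATG"))  # MESLLARPQPEIM

# Last 5 codons of DPOGS203683-TA, including the stop codon.
print(translate("CCCCTTGATCACTAA"))  # PLDH-
```

Passing the full DPOGS203683-TA sequence to `translate` and comparing the result with DPOGS203683-PA would confirm the two entries agree end to end.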