Monarch geneset OGS2.0

DPOGS206638
Transcript: DPOGS206638-TA, 3285 bp
Protein: DPOGS206638-PA, 1094 aa
Genomic position: DPSCF300048 - 429940-482861
RNAseq coverage: 100x (Rank: top 61%)
Annotation
Heliconius: HMEL011147; E-value 0.0; 68.43%
Bombyx: BGIBMGA008339-TA; E-value 0.0; 60.95%
Drosophila: CG32529-PA; E-value 6e-20; 33.48%
EBI UniRef50: UniRef50_UPI00022C8AB8; E-value 2e-42; 39.22%; UPI00022C8AB8 related cluster n=1 Tax=unknown RepID=UPI00022C8AB8
NCBI RefSeq: XP_001816051.1; E-value 4e-60; 42.86%; PREDICTED: similar to AGAP004446-PA [Tribolium castaneum]
NCBI nr blastp: gi|270006002; E-value 2e-59; 43.61%; hypothetical protein TcasGA2_TC008137 [Tribolium castaneum]
NCBI nr blastx: gi|383863458; E-value 2e-83; 30.02%; PREDICTED: uncharacterized protein LOC100880619 [Megachile rotundata]
Group
KEGG pathway:
Orthology group: MCL25466 (Lepidoptera specific)

Nucleotide sequence:

>DPOGS206638-TA
ATGCGGTCACAAGCTGCAAAACCCTTACCTAAAAAGGGACCTTTGGCTAAAACATCTTTTAAAGCTAGTACCGCTACAAAACAAGTTAAGAAAAAAGCAGTAGCAACTAAAAAAGTGAAAACACTTCAAAAATCAACGACAAAAAGTCCAAAAAAGAATACTAGTGACGAATCCAAAAAGGACAATGATTCGTCAAATAAAGAAAAAAATGATTCCACGTCCGATACTGAAAAGGCTGGACCTTCCAAGACGATTAATATTAGAGAGGAGTCTCCTATGAAAAAAGGGAAAAAAGACCCAAAGGAAGCAAAAGACAAGAATAAAGCTGCCCCTAAGAAAGTTCTCTCAAAAAAACCTATTAAATTAACAAAAAAGGCCAACCCCGCAGTTAAAACTAAAAAAGACAGTGATGAAGTAAAGTACAAAATAAACTTAGAATCTGTTAAGAAAACGTCTGCAAAAAATCCAAAAGCAAAGACAAATATTAAAAAAAGCAATACAGGAAATCAAAAAAAGAAAAGTAATTTACAAAGTAAGGAGAGTGAAGTAAAAACTTTAGACAAGCATGATGAAGACGAAAAATGTCCTAAAGATAAAGTTAAAGATTCTTCGAATATTTCTACCATAACTGAGAATAAAGAAGACAAACCTGATAAAGATATTGCAGAGTCTGAAAAGAAATCAGAAAAGAAAGAGGAATCCAATAAAATTCTTAAGGGTAATAGCAATTTTTCGGAAACTGCATATCAAAGTATAACAGTTACCAAAAGCACTACTCAATCAGAAACACAAAAAACTGAAACAGAACGGCCTCCGAAATTTTTTACTTCCATCAGATCTCCGAGAACAGTCGATATAGACGTTAAATGTTGTAAGAATTCTAAAGGAAATTTGAGTGTTTCTGAAAGATTTATATCTCTTAAATGCGAAAGTAAAACGGATGACCTTAACAAAGATAAAAATACGTCAAATAATCTCGTATCTCAGGGCCCAGAAATAGACGTTTATACTTTCACAGAAAAGGTTGATTCACCAAAGAGCATATTATCTGATTTTAGAAATCCGATAAATAAAAATATTGGAAGAGTACGGCCAATTGCTAGAGTAAAAGGAACCGTTATAGATAAAAAATTAGATAATGAAAGGAAAAAGTTCTCTCCACATAGACCCATTGAGGAAGTAGTCAAACAACTTAAGGCAAATAAAAGAACAGAAGATTTAAGCTTGTCAAAAAATAATTCAGCTGCAAGTTTGGACTTTAATGTTTTCAACGACGAAAAAGAAGAAGTGCAACCACCTATTGTAAGTAAATCATACAAAATGGTAGCTAGAAAGTCTAGCATTACTGGTAAACCATTCAGCCCTATAAAATTCCTTGACCAAGATTCTCCTCCAAAACAAGAGGTAAAGACTCCAACTAAAGTTACCAATACAAAAAAAAATATTAGCAGATCTAAGAAACCACGGAAGAAAAGTACTTCATCTGATGATGAAATGAATACGTCAACTTTTGCACTCAATACAAAGACATTTCATTACACTTCTTCAGAAGAAAGCGTCAGCGAATCAAATGATGATAATGATCAAGTAGAATCTTCCGAATCCAGTTCCTCGAAGACTAAAAAGACAAGGAATAAAAAGTATAAGAGAATAACCGATGGTGCTAAGAAATCAAATTCTAAGGAATTGAAAGAATTGTCAAAAGATTCCTTGATTGATATCAAAGATACTTCTGCGTTACTAGGAAATGAAAAACCTAGCAAAAGGCGGCTAAAACTTTTATCAATGTGGTCAGGGCCGAAGAAACATAGAATGGCCTCTCTCAACGCTCTTGCAAAAGTACATTGTTTATACGAAAATGAAAGCAGAACTCATATGGAACTAGGTCTAATGAAGACAGTTGACAGGCAAGCTATGCCAAGTACATCTAAATCTAATTTGCCAAAGAAAAAAGAAACAAAAACTAGGGATTCCGACAAACAAGAAAGTACGTCAGAGTC
TGAATACGAAAATAAAAAGGAAACGGAAAAAGACAGTTCGGAGAGTTCAGATGATAATCCGCCACAGAGAACTTTACGTGGCGTACCTGGAATAAGGAGTGCTGGTAAATATTGGGATCCGAGATCCTCTACGTCTTCCAGTGAAGATAGTGAATTAGAATCTAAAAGCAAAAATATCGCAACTGACAAAAAGAAAGCCTCCACCAAACCATCAGCTAAGTCAGATTCTGATAAGCCACCTCCCAAAATGAAGAAAACAGCTGGCGTTCCAGTAAAGAAGAAGCGTAACAGAAATGAAGTTGTTATGGACTTAAAAGATATGGTTGTTCAGAAACGTATGGCAAGCTTAAATGCCACTGCTATTCTAGCCGCCAGTTACGAGAAACGATCGCCTAAATCAAGCAAAGACGACACAACGTCAGACTCGTGCTCAGACGATTCTTTTTCACAGAAGCCAAAAAACGGATTAACTTCCGGCATAAAGTCGGAATTAAAAATAGAAGATACTAAAAAAGAATGCGAAGAGTTGTCTGATAGGAATCAAAAAGTTGAAGTAATCGTCAACCAAGATACAGATGTAACGATCACCGGCGTATATTCAACTCATCTTCATGAAGGATTCTGCACCGTATCGGGAATGCAATATCGTATCTCTTCAACAAGCCACACTCAAACAACAGCCACCGCTAATTGCGAAAAGGAGGGTTGTTCCCGAGAAGATGGTTCTCGTTACACACCCCTCTCGGCTCTGTCGTCTATGCAGCCTCCGGCGGACCACTCCCATCATCCACATCCAGTTCCGGAACTTGGCGGTTTGGCGAGGCGAGCTGCTGGTTGTTCCAGCGCATTCTCAGCACCCTCACCAGCTGCACATCATGACCCAGTTCAGCGGGAAGCTTCGCGTCGCTCCCGCAGGTGTACCCCTCCTCCTTCCCCCGCACCAGCTCCCCGCACACCCCGCAACCGTCACCCTCTCCCGCCCCGCACCCCGCCACCCCCGCTCGTATACGACGCAGCTTTCTTTAGTTCGTATGTACGCCACTTCACGCCACTCGCCAGTCGACTGTCAACCACGGAGGCGGGCGGTTTTATCGCTCTATCGGATAGCCGCGCCAACTGTAAACACGACGCACTATCTGCGCACCGCTTCCGTTCGCGAAATAACCTCAATCCATCGCTAGTGTCGTGTTGTGCCAAACCAGTGGATCTGGATTCAACACAAATTGGGCTACGGCCAACATACACAGCCAACTGCATCGAAGGGTGCCACAACCCCGTCGATTGA

Protein sequence:

>DPOGS206638-PA
MRSQAAKPLPKKGPLAKTSFKASTATKQVKKKAVATKKVKTLQKSTTKSPKKNTSDESKKDNDSSNKEKNDSTSDTEKAGPSKTINIREESPMKKGKKDPKEAKDKNKAAPKKVLSKKPIKLTKKANPAVKTKKDSDEVKYKINLESVKKTSAKNPKAKTNIKKSNTGNQKKKSNLQSKESEVKTLDKHDEDEKCPKDKVKDSSNISTITENKEDKPDKDIAESEKKSEKKEESNKILKGNSNFSETAYQSITVTKSTTQSETQKTETERPPKFFTSIRSPRTVDIDVKCCKNSKGNLSVSERFISLKCESKTDDLNKDKNTSNNLVSQGPEIDVYTFTEKVDSPKSILSDFRNPINKNIGRVRPIARVKGTVIDKKLDNERKKFSPHRPIEEVVKQLKANKRTEDLSLSKNNSAASLDFNVFNDEKEEVQPPIVSKSYKMVARKSSITGKPFSPIKFLDQDSPPKQEVKTPTKVTNTKKNISRSKKPRKKSTSSDDEMNTSTFALNTKTFHYTSSEESVSESNDDNDQVESSESSSSKTKKTRNKKYKRITDGAKKSNSKELKELSKDSLIDIKDTSALLGNEKPSKRRLKLLSMWSGPKKHRMASLNALAKVHCLYENESRTHMELGLMKTVDRQAMPSTSKSNLPKKKETKTRDSDKQESTSESEYENKKETEKDSSESSDDNPPQRTLRGVPGIRSAGKYWDPRSSTSSSEDSELESKSKNIATDKKKASTKPSAKSDSDKPPPKMKKTAGVPVKKKRNRNEVVMDLKDMVVQKRMASLNATAILAASYEKRSPKSSKDDTTSDSCSDDSFSQKPKNGLTSGIKSELKIEDTKKECEELSDRNQKVEVIVNQDTDVTITGVYSTHLHEGFCTVSGMQYRISSTSHTQTTATANCEKEGCSREDGSRYTPLSALSSMQPPADHSHHPHPVPELGGLARRAAGCSSAFSAPSPAAHHDPVQREASRRSRRCTPPPSPAPAPRTPRNRHPLPPRTPPPPLVYDAAFFSSYVRHFTPLASRLSTTEAGGFIALSDSRANCKHDALSAHRFRSRNNLNPSLVSCCAKPVDLDSTQIGLRPTYTANCIEGCHNPVD-
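
The lengths reported in this record can be cross-checked against each other. The following is a minimal Python sketch, not part of the database itself. It assumes the transcript entry is a pure coding sequence (it begins with ATG and ends with the stop codon TGA, which the trailing "-" in the protein sequence represents). The helper names `parse_fasta` and `expected_protein_length` are hypothetical and introduced only for illustration.

```python
def parse_fasta(text):
    """Parse FASTA text into {record_id: sequence}, joining wrapped lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the first token after '>'
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def expected_protein_length(cds_length_bp, has_stop_codon=True):
    """Number of amino acids encoded by a CDS of the given length in bp."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_length_bp // 3
    # The stop codon does not encode an amino acid
    return codons - 1 if has_stop_codon else codons


# Lengths as stated in the record above
transcript_bp = 3285
protein_aa = 1094
print(expected_protein_length(transcript_bp) == protein_aa)
```

With the stated values, 3285 bp corresponds to 1095 codons, that is 1094 amino acids plus one stop codon, which agrees with the reported protein length. `parse_fasta` can be applied to the sequence blocks above to rejoin the nucleotide sequence, which is wrapped across two lines.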