Monarch geneset OGS2.0

DPOGS210919
Transcript: DPOGS210919-TA, 3420 bp
Protein: DPOGS210919-PA, 1139 aa
Genomic position: DPSCF300045 + 185463-193871
RNAseq coverage: 394x (Rank: top 31%)
Annotation
Heliconius      HMEL015819        0.0  78.76%
Bombyx          BGIBMGA003069-TA  0.0  63.20%
Drosophila      row-PA            0.0  43.78%
EBI UniRef50    UniRef50_F5HMY6   0.0  45.30%  AGAP001141-PB n=3 Tax=cellular organisms RepID=F5HMY6_ANOGA
NCBI RefSeq     XP_973030.2       0.0  46.68%  PREDICTED: similar to CG8092 CG8092-PA [Tribolium castaneum]
NCBI nr blastp  gi|347965382      0.0  45.30%  AGAP001141-PB [Anopheles gambiae str. PEST]
NCBI nr blastx  gi|347965382      0.0  45.24%  AGAP001141-PB [Anopheles gambiae str. PEST]
Group
KEGG pathway 
Orthology group: MCL16162, Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210919-TA
ATGATGGAAATGCGAAAGGAAACGTCTCCTATACCTCCTCAAGCGGAGATCATAGAGGCCGATTTATATAAACGATCATATTTTGGACCTGATGTTCCAACTTTGGAGCTAGAATGTTGGGAGGAAGAGTTATCTGAGGCTCAACTGAAAGCCTATCAGACTGCCACTGAGGAGTATCAAAGTATTCAAAACAAACTTGATGTGATTGTAAAAGATACAGGAGGAGAGATTGTTTACAATGGTGATCAATTTACAGCATACCAATTGCTGGGCAAACAACCAGCTCTAAAGGATTTGGAGAGACAGAGTGCTGATATAATCAGATCTAGATCACGTTCCCTCTCCATATGTAAAGAAACTAAAGAAGTCAAAGGAAAAAGAGGTCGACCCAGAAAGAATCGCGAAGAAGATCGGGATTACTCCCCGTCATCAGACAGGACAAAGTCACCCGATCAAAGACATAAGAAGAAAAAGAAGGACAAGAGGAAGTATGATGAGAGGGATACGAAGACCAAGGATAAAACTTTGGCTCCATCCAATGTTCACAGCGTTGTAACAAACGTCCGGCCGTCCGAGGCGACCGCCATCGGTGGCTTACTCACCGGCAGCGCGACCAAAACAACGGCCATAGTCGACTTGACCAAGGAAGACGGCAACAAGAATGTGGCCGATTCCAGAGAGGTGTCGTTCAATAAACTGCAAGGCAAAACATTCCCGTCGTTGGTGGTGGTGGCCCGGCCGTACCTGCGTTCCAAGGACGCGGCCGTGCCCGCTGATCGCGCGACCCTCGATAGTAAAGTCAAAGCGGTTCTCATCCACACTCCCATGAAGTTCACCGAGTGGCTCATACAACAGGGTCTAGTGCGGTCAGAGCAGTGGTGCGCTCTACATCCCGGGAACAAACTCAAGCTAGGTATGTATTCTGACGTGTCTAAGTTCCCGTACTCGGGCGGCTACGTGTGGATATCCGAGTGCTGTCCTACTCGATTCGTCTCCGTGTTCTCGAGCTCTATCTTCGAAGGAGCCACGTTTCCGCCCAGTGTCCTCCTGAAGCTCATATACCACTGGGCGTGTCAGACGAACGTTCAGAACGTCGTCCAGTGGGTCAAAGTTGACAATCTATACGTCAAAGGTCTGTTTACTTGGTTGAGAGCGGTTTGCACGTCGGCTATACATCAGCACATGGGTCTGCTCGGCGGCCCGGGGAAGAAGGTTGAAGTTGGAGTCATATCTTTGGGTACCACCAGCCATGATGGCACACAGAGACAAGTCAAAGTTGAAGTGTTGGGTGTGCTGGATCCCGTCGAGAAATTGATTCGCCTTCGTGCGGTGGAGCCGTTGGCGGAGTACGAGAAGAATTATAAGAAGCGTTTCCAGAAAATTCTGGAGCCTCTCACCACTTGGGTCCATCCGTCGTCTATAATTCTGACGGATCTGACCGTGGACAAAGGCACGCTTGTGTCCATGGGCTTTAAGACGGTCCACCAGTCCTCGTCTCACTCCGACCAACCCATGAAGTACAGCAACGCCAATATCATGGAATATTTACGACGTATCGTGCCGAGAATGTTCCAGAACACTCTGTCGCTGCTGTCCAGGCAGATTATACAGCAGTTCCTCGACGAACTGGTGTGGAGAGAAAAGTTCGGTGTGTCTCCCGGGCAGGCGTTCGACAACATAGTGTCCCACATATCAGAGCAAACAAAATTGGACGCTAAGGACCCCATCACTATACGGCTCTACAAAATCGCTTCTAATCCATTCAAAAACTGGAAGTACCCCAGCAAGAAAAAGGATAGATCGGAAGAATCTTTAGAACCGGAAGTGAGAAGCAAGCGCGGTAGAAAGAAGAAAGAGCGCTCGCCCTCACCGCCGCCTAAGAAGAAGAGAAGTAAGACTTATATAGAAGACGAGGACGACGAAGAGATTCCACTGGCGCTGCGGCGGTCGAAAGTCAAGCAAGAGAAGAATAAAGACTCCGACGGCCGGCGGCGCAAGGC
GCGGGCTTACGTCGACGACGACCTGGACGACGTGCCGCTGAAGAACATCAAGAAGGAGGTCAAACACGACGACACCGTCTCCCTCGAGAGGTTCTACTACGGCAGAACGACCGAGGGCCTCGCCGAGAACATCGCCATAGCCGTGCAGTGTCCGGCGTGTCAGGTAGAGTTCAACGAGTCGATGTCGCTGTGCGTTCACCTGTGCGGGCACGTGTCGCGGCGCGCGGCCGGCGTGCTGTGCGTGTTCTGTCAGAGCATGTTCGACAGTGAAGCAGAGCTGAGCGAGCACCTCAAGTGTTCTCACCCCGTGGACACCAAGTCACCCGAACTCTTCACCTACGCCTGCCTCATATGTGAGGTACGTTTCGCGGCGGTGCTGACCCTGGCGGCTCACATGCAGAAGGCCCACTGTCCGCGCGAGCTGCCCTACAGCTGCGGCTCGTGTCCCTACCGCGCCTCCGCCCACCGCTCACTGCTGGAACACGTCATGAACAAACATCGCCGGTCCGACAAGCTAGTCTGTCCGCACTGTCTCAAGATGATTCCAGTGTACGCCGACGGATGTGAACTCACAGCCAACGTGCTCCTCTACATGGACCATCTCAAGCAACACCAGGACAAGGAGCTGGAGATCAAATGCACGAGATGCGTGCTGAGATTCGTACATCTCGGTCAACTGAAAGAGCATCAGATTCGCGACCACAACCCGTGCGAGGAGGTCCTGCCTCTGTGTTCTACTGAGCACTTGATTAACCTGCCCAAGAACAAAGCCCGCCCTCCCATCAAGGACGTCGCGTGTCACGCCATCAGCGACACGTATGAAGGTGTCACGTTGTTCCTACAGGACGGTCTTCTGTGTCGCGAGTGTGACACGCCGCTTGACAGTGACAAACACTTCCTCGGTCGCACGTCGTGCAGCAAGTGTCCGTATGCTACATCATGTTACCGAGCGATGTTGAGACACAGTGGATACTGCGCCGGCCCACATTCACTAGAGGCCGCCCCTAGACCCGCGCCCATGCTCTACTGCGTATGTGAATACTCTACAGACATAGGCACGGACATGCTGTCCCATCTTCTCGCTACACAGCACACAAGCGCCTACTTAAGTGAGGAACTTGCACGAGCCAACACTGTCAGGGAGGAACCAAAACCAGCTGATGAAGTGGAGCCTCTTGTGGAGAACATGCCAGCTATCCCAGATTACGCTCCTCCATCGGTCATCAACACTCAGCTGTCTCTAGATGATCTTGCTCCCCCTTCAGTTTTACAACCTGATCAGCATGATCAAGAACTCCTGAAGGACGCATATGACCGCCCCCTGGCAACACCAAGACATGAGGAACCTCACTACACTCTCGGAGACTTTGAACCATTGCCTCAAGAGCCACCTCCCCAACCAGACTTTGAACAACTGTAA
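
The nucleotide record above is wrapped over more than one line, which is normal for FASTA. A minimal Python sketch of reading such a record follows. The function name `read_fasta` is hypothetical, and the example input is only the first 24 bases of DPOGS210919-TA, wrapped by hand for illustration.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {record_id: sequence}.

    Sequence lines belonging to one record are joined, so a record
    wrapped over several lines comes back as a single string.
    """
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between or inside records
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited token after '>'.
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            records[current_id] = []
        elif current_id is not None:
            records[current_id].append(line)
    return {rid: "".join(parts) for rid, parts in records.items()}


# Illustrative input: only the first 24 bases of the transcript, hand-wrapped.
example = """>DPOGS210919-TA
ATGATGGAAATGCGA
AAGGAAACG
"""
records = read_fasta(example)
print(records["DPOGS210919-TA"])
```

Applied to the full record, the joined sequence length can be compared with the 3420 bp stated for the transcript.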

Protein sequence:

>DPOGS210919-PA
MMEMRKETSPIPPQAEIIEADLYKRSYFGPDVPTLELECWEEELSEAQLKAYQTATEEYQSIQNKLDVIVKDTGGEIVYNGDQFTAYQLLGKQPALKDLERQSADIIRSRSRSLSICKETKEVKGKRGRPRKNREEDRDYSPSSDRTKSPDQRHKKKKKDKRKYDERDTKTKDKTLAPSNVHSVVTNVRPSEATAIGGLLTGSATKTTAIVDLTKEDGNKNVADSREVSFNKLQGKTFPSLVVVARPYLRSKDAAVPADRATLDSKVKAVLIHTPMKFTEWLIQQGLVRSEQWCALHPGNKLKLGMYSDVSKFPYSGGYVWISECCPTRFVSVFSSSIFEGATFPPSVLLKLIYHWACQTNVQNVVQWVKVDNLYVKGLFTWLRAVCTSAIHQHMGLLGGPGKKVEVGVISLGTTSHDGTQRQVKVEVLGVLDPVEKLIRLRAVEPLAEYEKNYKKRFQKILEPLTTWVHPSSIILTDLTVDKGTLVSMGFKTVHQSSSHSDQPMKYSNANIMEYLRRIVPRMFQNTLSLLSRQIIQQFLDELVWREKFGVSPGQAFDNIVSHISEQTKLDAKDPITIRLYKIASNPFKNWKYPSKKKDRSEESLEPEVRSKRGRKKKERSPSPPPKKKRSKTYIEDEDDEEIPLALRRSKVKQEKNKDSDGRRRKARAYVDDDLDDVPLKNIKKEVKHDDTVSLERFYYGRTTEGLAENIAIAVQCPACQVEFNESMSLCVHLCGHVSRRAAGVLCVFCQSMFDSEAELSEHLKCSHPVDTKSPELFTYACLICEVRFAAVLTLAAHMQKAHCPRELPYSCGSCPYRASAHRSLLEHVMNKHRRSDKLVCPHCLKMIPVYADGCELTANVLLYMDHLKQHQDKELEIKCTRCVLRFVHLGQLKEHQIRDHNPCEEVLPLCSTEHLINLPKNKARPPIKDVACHAISDTYEGVTLFLQDGLLCRECDTPLDSDKHFLGRTSCSKCPYATSCYRAMLRHSGYCAGPHSLEAAPRPAPMLYCVCEYSTDIGTDMLSHLLATQHTSAYLSEELARANTVREEPKPADEVEPLVENMPAIPDYAPPSVINTQLSLDDLAPPSVLQPDQHDQELLKDAYDRPLATPRHEEPHYTLGDFEPLPQEPPPQPDFEQL-
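
The protein record can be cross-checked against the transcript. The sketch below assumes the standard genetic code (the record does not state which table was used) and writes the stop codon as `-`, as the trailing character of DPOGS210919-PA suggests. The names `translate` and `CODON_TABLE` are hypothetical, and only the first and last codons of the transcript are used as input.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon, starting at position 0.

    Stop codons are written as `stop_symbol`; codons containing
    characters other than A, C, G, T are written as 'X'.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for pos in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[pos:pos + 3], "X")
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


# First eight and last six codons of DPOGS210919-TA, copied from the record.
print(translate("ATGATGGAAATGCGAAAGGAAACG"))
print(translate("GACTTTGAACAACTGTAA"))

# Length consistency: 1139 residues plus one stop codon, three bases each.
print(3 * (1139 + 1) == 3420)
```

The two fragments correspond to the start (`MMEMRKET`) and end (`DFEQL-`) of the protein record, and the length check agrees with the 3420 bp and 1139 aa given in the header.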