Monarch geneset OGS2.0

DPOGS203592
Transcript: DPOGS203592-TA  3210 bp
Protein: DPOGS203592-PA  1069 aa
Genomic position: DPSCF300063 - 1003680-1016527
RNAseq coverage: 285x (Rank: top 38%)
Annotation
Heliconius: HMEL015878  1e-65  53.48%
Bombyx: BGIBMGA001350-TA  2e-161  54.84%
Drosophila: CG11593-PB  3e-65  60.11%
EBI UniRef50: UniRef50_Q7Q710  9e-64  60.64%  AGAP005565-PA n=2 Tax=Culicidae RepID=Q7Q710_ANOGA
NCBI RefSeq: XP_001956760.1  4e-65  44.30%  GF24411 [Drosophila ananassae]
NCBI nr blastp: gi|194748655  9e-64  44.30%  GF24411 [Drosophila ananassae]
NCBI nr blastx: gi|195587628  3e-63  45.45%  GD13300 [Drosophila simulans]
Group
KEGG pathway: mcc:708898  4e-39
  K01514 (E3.6.1.11, ppx) maps-> Purine metabolism
InterPro domain: [343-441]  IPR022181  1.7e-17  Bcl2-/adenovirus E1B 19kDa-interacting protein 2
  [417-561]  IPR001251  6.9e-11  Cellular retinaldehyde-binding/triple function, C-terminal
Orthology group: MCL22603  Insect specific
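
The two InterPro hits above, [343-441] and [417-561], overlap on the protein. The sketch below works out that overlap and the length of the genomic interval 1003680-1016527. It assumes that all coordinates are 1-based and inclusive, which this record does not state; the helper names `span_length` and `overlap` are hypothetical and used only for illustration.

```python
def span_length(start, end):
    """Length of an interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


def overlap(a, b):
    """Return the shared (start, end) of two inclusive intervals, or None."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start <= end else None


# InterPro domain ranges as listed in this record (protein coordinates).
ipr022181 = (343, 441)
ipr001251 = (417, 561)

shared = overlap(ipr022181, ipr001251)
print(shared, span_length(*shared))      # (417, 441) 25

# Genomic interval on scaffold DPSCF300063 as listed in this record.
print(span_length(1003680, 1016527))     # 12848
```

Under the same assumption, the genomic interval (12848 bp) is about four times the transcript length (3210 bp), which is consistent with a gene model that contains introns.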

Nucleotide sequence:

>DPOGS203592-TA
ATGAGCTGCCTCCAGTCACCGAGGCTGAGTGACGACGACCTCCTCAGTTCACCATCCTCAGATGAAAAGATGGACACAGCATCGGAGTCGTTCCGCTCCACACAGCACACACCCAACTCGATACCCGACGTGGACCACGAGAATGATGTGCTGAGGCTACGGAACCCCGACATCATAGACACCGACCAAATCCTAGACAATAAAGTCACCAAGTTGAAGTCCGAACTGAACGATAGCTTTATATCACGCTTCACCACACTCACGCTGAGCTCTCCCGACAACAAGGCCAGGAATATAGGTATAAGCACTCCCAAATATAGCTCAACGCTAACTTTAACGACCAACCACCCTCTCATGGGCTTGGCCAGTCCGGACGAGAGCTGTCCCGTCACGAAGAAGACTCACAGGACAGAAGAAGAGAAAGACTTGAGCTTATCCAGTATAGAAGATGATAGAGGAACAACGAACAGATACGAGTTCTACACGCCACAGAGGAATACTTCGAAGAATAATCTTTACTTTTCCGAAAACTACGTCACAGCGGACAGTCAGTTCCTGAGCGAGTCTTTAAACATGACAGCGGAGGCTAAAGATCAGCACCAGAAAAGAATTATAATACCAAGGTTCCTGGACACCCCTGTCAAAGATACCACCCAGCAGAATCTAAGCGTGTTCCTGAATTCGGGGAGGGAGGGGAGAGACATAAAGAAGTACACAGAAGAAAGGGGAGAGAGAACAGCCTTGGATGTTATAGATAAAGAAGATTTTCATTCCGGGGTTGAATATCGAGACTCCCCTCCGCGGACAGTTCCTAGCTTCGACCTACCTATTGAAATGTCCAGTAACATAGGCATGCATAATGGCGTGGAATCCCCCGACGTGGAGTCATTATCGACCCAGGAAGACAGGTCCGCGTACCAAGCCTTACTGGACCCCTACACCGGGTCAGTGGCCTTGAGACACACCACGCACAGGAGCAACCCGCCGAGAAGAAAAGTACAATTGCCGCCGGAGGATGACGAGTGTAGCCTGGACAGCGTCAGCGGCGGTTCCCTGGAGTCCGAGGACGAGCCGCCGCCCGTAGAGAGTGCGCCCGACCCTCACAGCGAGGACGACACCAAGAGGTCCAAGTCCACGAACACGGTGAGCGAGTGCAGCGACCCCATACCTGAATACTCAGCGGCCGAGGAGTTCCGCGAGGAGCGCTCCTGGCTCAGCGTCACACACGGGGGAGGCCGCGCCGTCTGTGATATGAAGGTCATAGAGCCGTTCAAGCGCGTGGTGTCCCACGGCGGATACGAGGAGGGAGGCGCCGCCCTCATCGTGTTCAGCGCTTGCCACCTCCCGGACACCAGGCGCCCCGACTACAGATACGTCATGGACAACCTGTTCTTGTATGTGATGTGGAGCCTGGAGCGGTTGGTGACGGACGAGTACGTGCTAGTGTACCTGCACGGGAGCGCCGGCAGACGGAGGATGCCCACCTTCGCCTGGCTGCACGAGTGTTACAAGCTGGTGGACAGACGGTTGAGGAAGAGTCTGAAGCGCCTGTACCTGGTGCACCCCACGTTCTGGTTGAAGTCGTTCGTCGTCATCACCAAGCCTTTCGTCAGTTACAAGTTCTTCCGGAAGCTGTCCTACGTGGAGAGTCTGAAGGAGCTGTTCCGCCTGGTGCCGGTGGAGCCCAACGCGATACCCGACCTCGTGAAGGAATCTGAGAGGGACGAAGACGACACTAAAAGACGACACTTCCGCAAACGCCGCGGGAGTGATGCCAGCGAAACCGGACTGGTGAAGGTTGCGCCGAAGGCTACCAAGCGTGGTAGAGGCCGACCGCCAACCACAGGAGAGTACGTCGGCCTACATCAGGCCAAGAAGGCAGTTTTGGAAGTGGAGCGGGAACTCCAACGTCTGGAACAACAGAAGGAGGAGGCGGATATTGCGAGGAGGATACTACCGCCACGCATGTCGCGACTGTTGGAGACCCCTTCTAACTCAGA
GTGGAGCCTCAACACTGAGGACCAGGAGACGGCAAGCGCGATAGGTGCCACCATATCGAAAAGCCTTGAGGCTATAGCGAAGGTGGCAACAAAATCCAAGAACCTGAGTGGACCCTTTGTCAAAATCCTAAAAGAGTCCACAAAATCCATACAGGAGGCGTGCGCCACGCTACTCAACAGAACAAAGACGAAAGAGACGAGGGTGTTGGAGGTGACAAACGCACGTCTAAAGAGAGAGCTGGCGGATATGAGGGCCGAACTAGCGGACATGAGACGCGAACTCGACAATGCGCGTCTGCAGAAGCCCAAATCACGGACGGTCTCTTCCAACGGAGGTCTCAACGTGGAAGAGCTCCTGCAGCGGGCCGTTCGGGAGGCCGTTTCATTAATGAGTGCCCGTATGGACGCCCGACTCGAGAGTCTGAGTGAACGGCTCTTACCGGAACCTCGAGTGCGACTATCGCTTGCCTCGGACAAGCGGAGAGACAATGCGGTCCCAACAATAACATCCGCGGGGCCATCGCGAGAGAAGGAAAGGCGTAACGAGTCAACCCCTGCGGTTACGAAGCTCATGCCACCGGCAAATCCGGGCCCAGGGGTGAATTTTCTTATGAAAAAGAAACGCCAGACTGCTGCTGCTGCTGAAGCGGCGAGTAAAAGACGCGATGCACCGCAAGTCACAGCGGCATCGGATCAACCTCCTAACGAGAGTTGGGCAACGGTGGTCAAAAGGAAGACCAAGAAGGGGCCTCCAAAAGGAAATAGTGGGCAGGCCTCAAAGCAGAAGGAAAAGGGGAAACGAAAGCTCCGTTCCCCCAAGTCCGAGGCCGTAGTCCTTACCCTACAGCCAGGGGCAGAAGAGCGCGGAATAACGTACAAGTCTGTCTTAGAGGAAGCAAAACGGCGAGTTGACCTGGCTGCGCTGGATATTCCGGCTGTCAAGTTCAGGCTGGCTGTCACCGGAGCTCGCATTCTAGAGATCTCGGGTGACGCCAGCAAACAGAAAGCGGACGCCATGGCCAATAGGCTAAAGGAAGTGTTAGAGGGGGAGAACGTGCGCATCTCTCGCCCACAGAAATGTGTTGAGTTGCGGGTGAGCGGTCTCGACGACTCCGCCACTGCCGAAGAAATCGCGGAGGCCGTTGCGAAGCCAGGCAATTGCTTGCCCGGCGATATCAGGGTGGGAGAGGACGCGGAACTGCCTGGATAA

Protein sequence:

>DPOGS203592-PA
MSCLQSPRLSDDDLLSSPSSDEKMDTASESFRSTQHTPNSIPDVDHENDVLRLRNPDIIDTDQILDNKVTKLKSELNDSFISRFTTLTLSSPDNKARNIGISTPKYSSTLTLTTNHPLMGLASPDESCPVTKKTHRTEEEKDLSLSSIEDDRGTTNRYEFYTPQRNTSKNNLYFSENYVTADSQFLSESLNMTAEAKDQHQKRIIIPRFLDTPVKDTTQQNLSVFLNSGREGRDIKKYTEERGERTALDVIDKEDFHSGVEYRDSPPRTVPSFDLPIEMSSNIGMHNGVESPDVESLSTQEDRSAYQALLDPYTGSVALRHTTHRSNPPRRKVQLPPEDDECSLDSVSGGSLESEDEPPPVESAPDPHSEDDTKRSKSTNTVSECSDPIPEYSAAEEFREERSWLSVTHGGGRAVCDMKVIEPFKRVVSHGGYEEGGAALIVFSACHLPDTRRPDYRYVMDNLFLYVMWSLERLVTDEYVLVYLHGSAGRRRMPTFAWLHECYKLVDRRLRKSLKRLYLVHPTFWLKSFVVITKPFVSYKFFRKLSYVESLKELFRLVPVEPNAIPDLVKESERDEDDTKRRHFRKRRGSDASETGLVKVAPKATKRGRGRPPTTGEYVGLHQAKKAVLEVERELQRLEQQKEEADIARRILPPRMSRLLETPSNSEWSLNTEDQETASAIGATISKSLEAIAKVATKSKNLSGPFVKILKESTKSIQEACATLLNRTKTKETRVLEVTNARLKRELADMRAELADMRRELDNARLQKPKSRTVSSNGGLNVEELLQRAVREAVSLMSARMDARLESLSERLLPEPRVRLSLASDKRRDNAVPTITSAGPSREKERRNESTPAVTKLMPPANPGPGVNFLMKKKRQTAAAAEAASKRRDAPQVTAASDQPPNESWATVVKRKTKKGPPKGNSGQASKQKEKGKRKLRSPKSEAVVLTLQPGAEERGITYKSVLEEAKRRVDLAALDIPAVKFRLAVTGARILEISGDASKQKADAMANRLKEVLEGENVRISRPQKCVELRVSGLDDSATAEEIAEAVAKPGNCLPGDIRVGEDAELPG-
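
The transcript and protein records above can be checked against each other. The sketch below is a minimal illustration that assumes the standard genetic code (NCBI translation table 1) applies to this transcript and that the trailing "-" in the protein record marks the stop codon. The function name `translate` is hypothetical, and only the first ten codons and the last five codons of DPOGS203592-TA are translated here.

```python
# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence, writing stops as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 30 nt and last 15 nt of the transcript shown above.
print(translate("ATGAGCTGCCTCCAGTCACCGAGGCTGAGT"))  # MSCLQSPRLS
print(translate("GAACTGCCTGGATAA"))                 # ELPG-

# Length check: 1069 residues plus one stop codon, three bases each.
print(3 * (1069 + 1))                               # 3210
```

Both fragments match the start and end of the protein record, and the length check matches the 3210 bp reported for the transcript.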