Monarch geneset OGS2.0

DPOGS210642
Transcript: DPOGS210642-TA (3309 bp)
Protein: DPOGS210642-PA (1102 aa)
Genomic position: DPSCF300401 - 160284-171231
RNAseq coverage: 566x (Rank: top 22%)
Annotation
Heliconius: HMEL010789  1e-146  31.21%
Bombyx: BGIBMGA001804-TA  4e-74  32.14%
Drosophila: CG31224-PA  1e-19  26.46%
EBI UniRef50: UniRef50_F7AF16  9e-22  32.64%  Uncharacterized protein (Fragment) n=1 Tax=Xenopus (Silurana) tropicalis RepID=F7AF16_XENTR
NCBI RefSeq: XP_001947082.1  1e-21  25.66%  PREDICTED: similar to mCG121035 [Acyrthosiphon pisum]
NCBI nr blastp: gi|260795637  2e-23  29.13%  hypothetical protein BRAFLDRAFT_275668 [Branchiostoma floridae]
NCBI nr blastx: gi|260811041  4e-30  21.59%  hypothetical protein BRAFLDRAFT_66735 [Branchiostoma floridae]
Group
Gene Ontology: GO:0003676  1.8e-06  nucleic acid binding
  GO:0008270  3.6e-05  zinc ion binding
  GO:0005622  3.6e-05  intracellular
KEGG pathway:
InterPro domain: [935-970]  IPR013087  1.8e-06  Zinc finger, C2H2-type/integrase, DNA-binding
Orthology group: MCL16717  Insect specific

Nucleotide sequence:

>DPOGS210642-TA
ATGGAGGATAATTTTAAGAAAAGGTGTCGTAAGTCGAGGTTGGTGCCACGGGAATGCTTGGATACCGATGACGAATATATAGCAGTTATGCAGACTGATCATACAAATAAGATGGCTCCCAAAATCAAACAATCCCGAGAGGAAGTATTGAAGAAGAAGAGAGAGGCAGAGCGTGTGAGGTACTACAGAAGAAAAAATGACCCGCAGAAACGAGAGGAGATGAGGGTGAAACAAAATGTGAACTATGAAAGGAGGAAGCAGACAGGTTTGAGGAAGCTGGTCAAGGACATGACTCCCAACGAGCACAGCACAGCTCTACAGAAGTGGAAGGAATACTGCAGAGTGTACCGAGAAAGGAAACGGATCCAAGAGAGCCACGTTACTGATATTGATGAAATAGAGGAGACCATAGCTCGGATGACGTCAAATTCCATCCTCAAGAAGCTGAATCAGAGAAGTGAAGACAATCTGAAGTTTGATGTCAAAGTTGAAGTTGAGGAGGTGGAGGAACACCAGACGAACGATTTGAACATAACCATCGGTTCTGTCGTTAGTCTGAAGGGCACAGCCGATGATGCACTGGGCGCAGCTGATGAAGAACAGAATCTACATGACATAGCCGACACGAAACTGCTGGTCCGTGAATACGACTTCAAAGCTACAACTGTTATAGATGAGTTGAACAAAATGAAAAACATGACTAATATAGCCGATGTGCTCATAGATGTGAGCGTCAGGAACTATTTTGGACGATATTTCTTTAAAGTCGATGCCGAGGTTAAGGAAGCCAAGTTGCTGGTGGTCTTCGCCAAGTGGCAGACCTGGTGTAGGATGAACCCTGACGATAGAGACGCTCCTCTCCTGTACAAGTGCTATATATGCCGTCGATCTTGGTGGCATTACCATGAATTCAGAGAGCATTTCCACTGTCACGACGATTTCAACTTGGATATAGATCAGTTCGGCCAGGAATGTATCGTATTCGCCTACAGCAAGGAAATACAACAGAACGACATCAGCGTCACCGGAAACTGCTGGCGTTGCGGCAACGACTTCCTCTTCCACCAGAATAAGAGGAGATACAAGAAGCTATACGACTGCTCCGGGTGCCTGGCTAAGTTCACGACCTGTTTGAAATTATCCAATCATGTGGGGGGCTGTAACTACTATAAGAGATCTCTTCAGGCTTTGGGCAAAAACGTCTCTCAGATACACCCATGCGATGTTTGCCCAGTGAAGTGCTTCACTCAGACCGATTTGGCCGATCATATAAAAGACAGGCACTCGGTGAGATCGGATCTGCCGATAGTGACGAATCATCAAAAGTGCAGTCACTGCTCCCAGCCATGTGACGCCCTGACGCACGAATGCAAGAACCAACCGTATATAAAATGCTGTGAGCTGTGCGACAGGAAGTTCCACAGGCCTATAAACTATCAGATTCACGTGAAAAACAACAGAAACCAGTACAAATGCAAGGTCTGCGATGAACAGCTGCCTGGACAGTGCATGGAGGTGAAGCATCTGATGAAGCACACCAACAACTTTGTATATCTATACAGGTGTCTGCTCTGTCCCTCGCCGGTGTACTTCAGCAAGAAGAATATGCTGGAGGAGCACAACGACGCGCACCACCAGGAAAATAGTACGAAATATTTCTTTGACGAGGTCGTTGTACCAAAGTCGCTGATCAAAACGAAGATATTAATCAGCAGACCAAGGATAAGACGGAAAGACCAACTCAAGCCTATTAAACAAGTACAGAATAATGAGACGAAACCTCTCAGCATTCCATGGATGCACCGTGCGGGGGCCGGGTCCATGGGAGATGGAGGGGATACGGGTCAAACCCAGCAGAACAACACCGAACACCTGGAAACTAACGAGGCTATGTACAACCTGGAGCCAGTGATCACAATAAAGAAGGAATTGAACGAAGATGACTTGATGAGAGAAATAAAACAGGAAGCCGAGGAGCTGATGATCTATGATGGGATCGG
TGATGAGATGGTGATCAAACAGGAGATAGTGGAACATGTGATACAAGTGTGTGAGTACGATGACATTAATGTGGGGATTAAAACCGAGCCGGCTGATGATGATGAAGTTACCAACGACGAATATCTCTTGGACATAAGAAACCTGGCTTACAACTGTACCAAGTTGTACAGCTGTAAGAAGTGCCTCTTCCAGGGTGTGCACCGGGAGTATATGGAACACCTCAAGAGCAAGTGCCTTCACCGCACAAAGTACTACTGCAGTAAATGCAAAACCACGTACCTGACGATGAAGAGGTACCTGGTTCACTTCAGGAGGCACGGCTACGAGGAGAACACCTGCCCCAAATGCATCAGGACAGTGGAGATGAGCCAGCTGATAGCGCACGTGTACCAGCACGTGAAGAACACCTTCATCGGCTGCCACTACATCAACGACAAGACCTTCAACAAGTGCTACCAGTGCAGGGAGTGTAGAGAGGTGGTGCAGTTCTGTGACTTCTTCAAGCACTGGGAGCTGCATCTGGAACTGAAGACTGAAGACAGCGCTGGCAGGAACGACCTGGTGGAGAACAAACCGCTGCTCAAAGAACTCATAGCGCTTCTCCTCGGCGACACCATGGACGTGTCCAAGGAGTTGCATCCGAAGCAGTGCATCATGTGCCTGAAGCTGTTCTCCAGGAAGAACGACCTGAAGCGCCACCTCATAGAACACCTGCTGAACGACGCGTACAGGAACAGGCAGAAGTATGAGTGTCTCCGCTGCCAGATATGCAGCGTCGGCTTCAACAAGACGGACATTTACAAGCGTCACATGAGGGACCACGGCTCGCTGCCTCTCTACAAGTGCGAGATCTGCGACAAGACCTTCAGTGACTCGAGTAACTTCTCCAAGCACAAGAAAGTACACAACATGTCCGTGGTCATCTGTGACATCTGCAAGAAGAAGTTCACCTGCAAGGCCATACTCGTCAAGCACATGGAGTTACACAAGATCCTAAAGCCCATATCGTGTGAGTGCTGCTCGCGGGTGTTCCACTCTCCGTCGCTGTACAGGAAGCATCGCCTGGGGAAGAACAGGTTCAAATGCCCCGCCTGCAAGGTGTTGTTCAACAAGCTCAAGGACAAGTGGGATCACATGTGGCTGGAGCATAAGGAGAGGAAGTACATAGCTGATTGTCCGATCTGCAAGAAATCCTTCAGGAAGTACCAGGACGTGAAGGTCCACATCAGGAGAGAACACGACGCCAAATACGTGTACAGGCCCGTGTTCCACAGAGTAAATGAAGAGGAGATTATAGTGTGCGACTAG

Protein sequence:

>DPOGS210642-PA
MEDNFKKRCRKSRLVPRECLDTDDEYIAVMQTDHTNKMAPKIKQSREEVLKKKREAERVRYYRRKNDPQKREEMRVKQNVNYERRKQTGLRKLVKDMTPNEHSTALQKWKEYCRVYRERKRIQESHVTDIDEIEETIARMTSNSILKKLNQRSEDNLKFDVKVEVEEVEEHQTNDLNITIGSVVSLKGTADDALGAADEEQNLHDIADTKLLVREYDFKATTVIDELNKMKNMTNIADVLIDVSVRNYFGRYFFKVDAEVKEAKLLVVFAKWQTWCRMNPDDRDAPLLYKCYICRRSWWHYHEFREHFHCHDDFNLDIDQFGQECIVFAYSKEIQQNDISVTGNCWRCGNDFLFHQNKRRYKKLYDCSGCLAKFTTCLKLSNHVGGCNYYKRSLQALGKNVSQIHPCDVCPVKCFTQTDLADHIKDRHSVRSDLPIVTNHQKCSHCSQPCDALTHECKNQPYIKCCELCDRKFHRPINYQIHVKNNRNQYKCKVCDEQLPGQCMEVKHLMKHTNNFVYLYRCLLCPSPVYFSKKNMLEEHNDAHHQENSTKYFFDEVVVPKSLIKTKILISRPRIRRKDQLKPIKQVQNNETKPLSIPWMHRAGAGSMGDGGDTGQTQQNNTEHLETNEAMYNLEPVITIKKELNEDDLMREIKQEAEELMIYDGIGDEMVIKQEIVEHVIQVCEYDDINVGIKTEPADDDEVTNDEYLLDIRNLAYNCTKLYSCKKCLFQGVHREYMEHLKSKCLHRTKYYCSKCKTTYLTMKRYLVHFRRHGYEENTCPKCIRTVEMSQLIAHVYQHVKNTFIGCHYINDKTFNKCYQCRECREVVQFCDFFKHWELHLELKTEDSAGRNDLVENKPLLKELIALLLGDTMDVSKELHPKQCIMCLKLFSRKNDLKRHLIEHLLNDAYRNRQKYECLRCQICSVGFNKTDIYKRHMRDHGSLPLYKCEICDKTFSDSSNFSKHKKVHNMSVVICDICKKKFTCKAILVKHMELHKILKPISCECCSRVFHSPSLYRKHRLGKNRFKCPACKVLFNKLKDKWDHMWLEHKERKYIADCPICKKSFRKYQDVKVHIRREHDAKYVYRPVFHRVNEEEIIVCD-
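
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, assuming the transcript is a complete coding sequence read in frame from its first base and translated with the standard genetic code (NCBI translation table 1); the record itself does not state which table was used. The function names `translate` and `lengths_consistent` are illustrative and not part of any database tooling. Note that the protein record marks the stop codon with `-`, whereas this sketch uses `*`.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1, an assumption here),
# with codons enumerated in T, C, A, G order for each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def lengths_consistent(transcript_bp: int, protein_aa: int) -> bool:
    """True if the transcript length equals 3 x (residues + one stop codon)."""
    return transcript_bp == 3 * (protein_aa + 1)


# First eight codons of DPOGS210642-TA, then its last three codons.
print(translate("ATGGAGGATAATTTTAAGAAAAGG"))  # MEDNFKKR
print(translate("TGCGACTAG"))                 # CD*
print(lengths_consistent(3309, 1102))         # True
```

The two translated fragments match the start (`MEDNFKKR`) and end (`CD` followed by the stop marker) of DPOGS210642-PA, and 3309 bp corresponds to 1102 residues plus one stop codon.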