Monarch geneset OGS2.0

DPOGS212037
Transcript          DPOGS212037-TA    3504 bp
Protein             DPOGS212037-PA    1167 aa
Genomic position    DPSCF300054 - 15889-19392
RNAseq coverage     427x (Rank: top 29%)
Annotation
Heliconius        HMEL018065          0.0       81.55%
Bombyx            BGIBMGA010005-TA    0.0       70.42%
Drosophila        bip2-PA             5e-36     37.13%
EBI UniRef50      UniRef50_E2BU54     4e-55     48.94%    Transcription initiation factor TFIID subunit 3 n=6 Tax=Formicidae RepID=E2BU54_HARSA
NCBI RefSeq       XP_001814148.1      7e-52     49.36%    PREDICTED: similar to bip2 CG2009-PA [Tribolium castaneum]
NCBI nr blastp    gi|350410088        1e-58     52.12%    PREDICTED: hypothetical protein LOC100743667 [Bombus impatiens]
NCBI nr blastx    gi|350410088        4e-137    31.55%    PREDICTED: hypothetical protein LOC100743667 [Bombus impatiens]
Group
Gene Ontology      GO:0005515     2.6e-12    protein binding
                   GO:0008270     2.6e-12    zinc ion binding
KEGG pathway
InterPro domain    [1013-1145]    IPR011011    2.1e-17    Zinc finger, FYVE/PHD-type
                   [4-78]         IPR006565    3.7e-15    Bromodomain transcription factor
                   [1084-1140]    IPR013083    5.2e-14    Zinc finger, RING/FYVE/PHD-type
                   [1091-1137]    IPR001965    2.6e-12    Zinc finger, PHD-type
                   [1091-1138]    IPR019787    1.1e-11    Zinc finger, PHD-finger
Orthology group    MCL25677       Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS212037-TA
ATGTCAGAGGCGTACGCCCGAGAGATATTACGGAGGAATGTTGCCCAAGTATGCCAAACTATAGGATGGAATGGGATAAACTCTACGCCACTCGACATTTTAGTGCATGTTTTGGAAAAGTACATTTGTACTTTGGGCACTCAAGCTAACCGATACGCCGAACAATTTAATAGGACTGAACCAAACTTGAATGACCTAGGGTTAGTGTTTCGTGACCTTCACATCCAATTGCCAGAATTAGGAGAGTATACTAGATCTGTGCCTCCCGTTCCACCTCCTGTTAAAACAGAAAGATTCCCAAAACCTAAAGAATCTAATTTAAATTTCCTTAAGCCTGGCAGTTATGAGGTAGTTACAAGACCTATGCATGTGCATGAGCACTTGCCTCCTATGTACCCAGAGAAAGAAAGAGATACACCTGTTGTTGCAGGAACAGTTGAAATTCGGCAAAATGGTATTGATAATGTTGATGCTAATGTGTCCTGTACAAGTCCTGAAATATCTGTCACAGACAGTCCAGAAAAACCTAAAGATATATTTAAGAGGCCTATTGATCCAGTTTCATTACCAAATAGTAAAAGACCAAGGTTACGACTGGATGAAGAGGAAAGGACAAGGGAAATTAGCAGTGTTATGATGACTATGTCAGGTTTTCTTTCACCAGCTAGAGAAGGTAAATTACCAGAAGCTAAGCCTCCTACTATTATTTCTGAAAGACATCATGACAAGCATAAAGTGAATTCACACCATTCAAATGCAATTAAAGTACCAATGTTAGATAAAATTGATAAGAAATCGAAGAAGAGTAAGTTAATTAATGGAAAAATTATGAAAAGTAAAAGAAAAGATAAGAGTCATAAAGGTGAGGGTAGTAAATCTAAAGATAGTAGTAAATTGGAGAGGTATCCTCCGGGATATCCAATGAAAAGTAAAGACGTTCATCCAACGCATCATAATCACGTGACAATGCCTGCGCCAAGGCCTTTACAAGCACCTGTAAGACCAACAATGCCACCACCACCTATACCTTTACCAACACCGCCACCTGTTACAATACAAGAACCTATCAGGCCAGTTATAAAACAGGAACCAATAGATCCACCCCCACCTGTCACAACTCGGCCATCTCTACCTGCTGAAGATTCAATTCCGGTACCTAAAAAAATACCAATTTCAAAATCTCCGCTCGTTTCCAATTCTTTAACTACTCACAAACATCAATCTTCCACAATTTCTCTTATTCCGGATGTTCAAATCAAAAAAGAGGTTATAGATGAAGAAGAAAAGTTAGCATCTCAGCCAGATAGATCAAAGATTAATATATTTAAAAGAATATCGAACAAATCTAAAGAAGAAAAGCATACACCAGAAGTTGTTCCTGAAAAATTATTTTCACAGCCAGATACAACAATTTCAAGGTTGCAGAATTCATCACATGAAATAAGTGAAAACCGAGTTAAAAGTGCCGAATATGTGAACAATAATAACAGTCCAGTTGATTTGTCGCGAGAAATAGATATTAGGTCTCACGAAATTATCAACATTGATGATGATTCATTGGATGCTCAACCAGTGCCTCATTCTAGAAATACATCCCCTGAGCCAAAATCAGTTGGCCTTTCAATAAATAAATCGTTACAAGTACCTTTTCCCAAAGATATTGCTAGTGTCAGTCCAAAATTGAAGAAGGAAAAGAAACATAAAGATAAAAAAGATAAAGCAGCGAAATTAGAAGCAAAACTCAAAAAACAGCATCAACAGTTGGCCTTTGAAATGATGCCAATGGTAGAAAAGAAAAAGTCTAAAATTAAAAGTGAAAAGTCAGTTAAGTCTAATAGGCTTAAAAATGATTTAAAATTACCACAAATGCCACCTGGTTTTCCATTCTTTCCTAACATGCCACCAGGACGGGGAATGATGCCAGGTCCGGGATTAATACCTAGTCATGGCTTAATACCTGGTGGGGACTTTTTAGCTGGTTTGACTAACAACCCTGCACT
AAGAGGTTTACAACCCCCAAATATCCTTGGTAATCCTTTTGCTGTAGGGGCAGGCGGACCAGGGCTTATACCAGGATCAAGTTTTCTACCAGGTGGTCTCGGCCCCGGTATCCCCAATCATCTTATGCCGATGGGTAACTTTCCTCATTCTTCGCGATCGTCTCCTGTAAAAATACCGCCAATGCTTAGACGTCCAAGTTTAGAGGTTATACCTGTTGAAAACGAAGAAGACAGGATGATGCATAAATCAGCAATGACGGGGCGCGACAAAGATCGTCATGATAAGCACAAATCTCCAACCATTCCAAATATTTTACAAAAACAGAAATCTAAATCAAATAAGGATCATAAGTCAAGTATATATAAAATGCCACCTGTACAGCCGGATATAACGATTGAATTGAATCCTCCTAAAGAGCCAGTTAGACCTGAACCACCACGAGAAAATCCAACGCCACTTCGAATACCCACACCAGAACCACAAGCCGTTGTTTCTAAACCTGAACCGCTCCCAATTCCTGAACCAACGCCAGTCAAACATTCTGCTCCAGAAATTTCTCAAGACCCGGATAATATAGAGAAGAAGAAAGACAAGTCTCATAAAAAGGAAAAACGAGATAAGGATGGCATTAAAATAAAAAAGAAGAAGGATAAAAAAGACAAAAATAAAGATAGGTCTGAAAAGAAAAAGGATAAGGAAGAGAGACAGGAAATTAAAGATAGAATAAAGAAAGAAAAGAAAGAGAAAAAGAAAGAAAAGTCGGCAGATGGTCTCGTGCCTAAACTTACCCTTAAACTAGCTTCTTCCAACTCAAATTCACCGATGCCACCCAGCTCTCCAGATGTATATAAACTAAATATAAAGCCTGTTGTAAAGAAAGAAGAGGAAGAGACATCTCCTATTAAAGAGGAATCCGTATCACGAGAGCACAGTCGGTCCCCAGAATTAGCCCAAATATCTGCCTTAGTAACGAGGCCACCAAAGCAGAGACATTCTAAACATAATCATGTTTCAGAGCCGTTAGAATCACAATCGCCTCCGCCTATACCTGGTTCGCCGCAAAGAAAGAATCGTCCACCCTCGAGTCATTCTAAATATAAAAGAATCTTGATAAAACCTTTGTCGAAGAAAGGTAATAACGAAGATTTTGAAGACGAGCCAGCTACGATATCAGATGAACCGCAAGCGCCGGCACCAGTTTCTGTAGAAAAACCAACTGGACCACTTCCAACACCATATTATGTGGACGAACAAGGAAACAAAATATGGGTATGTCCCGCTTGTGGACGGCCAGACAATGGCTCGCCGATGATAGGTTGTGACGGATGCGATGGGTGGTACCATTGGATCTGTGTTGGAATCACGGAGGATCCGGGGGCCACGGAAGACTGGTTTTGTAAATCTTGCGTTGCTAAAAGGGCTGCGATGGTTCTCGCCGGCGTCACTTCCGGCAAAAAGAGGGGGCGGAAACCAAAAGGAGAAAAAATCAGAGACTGTCATTGA

Protein sequence:

>DPOGS212037-PA
MSEAYAREILRRNVAQVCQTIGWNGINSTPLDILVHVLEKYICTLGTQANRYAEQFNRTEPNLNDLGLVFRDLHIQLPELGEYTRSVPPVPPPVKTERFPKPKESNLNFLKPGSYEVVTRPMHVHEHLPPMYPEKERDTPVVAGTVEIRQNGIDNVDANVSCTSPEISVTDSPEKPKDIFKRPIDPVSLPNSKRPRLRLDEEERTREISSVMMTMSGFLSPAREGKLPEAKPPTIISERHHDKHKVNSHHSNAIKVPMLDKIDKKSKKSKLINGKIMKSKRKDKSHKGEGSKSKDSSKLERYPPGYPMKSKDVHPTHHNHVTMPAPRPLQAPVRPTMPPPPIPLPTPPPVTIQEPIRPVIKQEPIDPPPPVTTRPSLPAEDSIPVPKKIPISKSPLVSNSLTTHKHQSSTISLIPDVQIKKEVIDEEEKLASQPDRSKINIFKRISNKSKEEKHTPEVVPEKLFSQPDTTISRLQNSSHEISENRVKSAEYVNNNNSPVDLSREIDIRSHEIINIDDDSLDAQPVPHSRNTSPEPKSVGLSINKSLQVPFPKDIASVSPKLKKEKKHKDKKDKAAKLEAKLKKQHQQLAFEMMPMVEKKKSKIKSEKSVKSNRLKNDLKLPQMPPGFPFFPNMPPGRGMMPGPGLIPSHGLIPGGDFLAGLTNNPALRGLQPPNILGNPFAVGAGGPGLIPGSSFLPGGLGPGIPNHLMPMGNFPHSSRSSPVKIPPMLRRPSLEVIPVENEEDRMMHKSAMTGRDKDRHDKHKSPTIPNILQKQKSKSNKDHKSSIYKMPPVQPDITIELNPPKEPVRPEPPRENPTPLRIPTPEPQAVVSKPEPLPIPEPTPVKHSAPEISQDPDNIEKKKDKSHKKEKRDKDGIKIKKKKDKKDKNKDRSEKKKDKEERQEIKDRIKKEKKEKKKEKSADGLVPKLTLKLASSNSNSPMPPSSPDVYKLNIKPVVKKEEEETSPIKEESVSREHSRSPELAQISALVTRPPKQRHSKHNHVSEPLESQSPPPIPGSPQRKNRPPSSHSKYKRILIKPLSKKGNNEDFEDEPATISDEPQAPAPVSVEKPTGPLPTPYYVDEQGNKIWVCPACGRPDNGSPMIGCDGCDGWYHWICVGITEDPGATEDWFCKSCVAKRAAMVLAGVTSGKKRGRKPKGEKIRDCH-
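
The lengths reported in this record can be cross-checked against each other. The sketch below is illustrative only and not part of the database: it assumes the genomic coordinates are 1-based and inclusive, and that the transcript consists entirely of coding sequence ending in a single stop codon (the nucleotide sequence above begins with ATG and ends with TGA). The function names are hypothetical.

```python
# Values copied from the record above.
GENOMIC_START = 15889
GENOMIC_END = 19392
TRANSCRIPT_BP = 3504
PROTEIN_AA = 1167


def span_length(start, end):
    """Length of an interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


def coding_codons(transcript_bp):
    """Amino-acid count, assuming the whole transcript is CDS plus one stop codon."""
    if transcript_bp % 3 != 0:
        raise ValueError("transcript length is not a multiple of 3")
    return transcript_bp // 3 - 1


genomic_bp = span_length(GENOMIC_START, GENOMIC_END)
print("genomic span matches transcript length:", genomic_bp == TRANSCRIPT_BP)
print("transcript length matches protein length:",
      coding_codons(TRANSCRIPT_BP) == PROTEIN_AA)
```

Under these assumptions both checks hold, which is consistent with a gene model whose genomic span equals its transcript length, that is, one without introns.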