Monarch geneset OGS2.0

DPOGS211954
Transcript        DPOGS211954-TA  1227 bp
Protein           DPOGS211954-PA  408 aa
Genomic position  DPSCF300011 + 1020334-1024026
RNAseq coverage   34x (Rank: top 74%)
Annotation
Heliconius      HMEL002947        6e-138  94.68%
Bombyx          BGIBMGA000905-TA  3e-119  91.07%
Drosophila      Kr-PA             1e-65   70.37%
EBI UniRef50    UniRef50_Q17D31   6e-67   59.63%  Zinc finger protein n=2 Tax=Culicinae RepID=Q17D31_AEDAE
NCBI RefSeq     XP_001648943.1    1e-67   59.63%  zinc finger protein [Aedes aegypti]
NCBI nr blastp  gi|297522154      2e-67   60.62%  kruppel protein [Clogmia albipunctata]
NCBI nr blastx  gi|221139854      4e-71   51.36%  protein krueppel [Tribolium castaneum]
Group
Gene Ontology    GO:0003676  2.1e-14  nucleic acid binding
                 GO:0008270  0.0001   zinc ion binding
                 GO:0005622  0.0001   intracellular
KEGG pathway
InterPro domain  [269-298] IPR013087  2.1e-14  Zinc finger, C2H2-type/integrase, DNA-binding
Orthology group  MCL16090  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS211954-TA
ATGAGTGTGGTCGGAGTTGAGGGCGCGGGCCGGGGAGCGTTATGTGGTACACGGGAGGAAATAGTGCATCGCGTGCATACGGCTGCAGTGAAGACCGGCGGGGAGGCCGGGGCGGCGGCGGGGCGAGCGGGGACCCGGGCTCGGGGGGCGCGGGGTCCGAGGAGACCCTGGTGGGGCGTGGGGGTGTGTTTACAGCGAGGGGCGGCCGGACAGCGGCGCGCCACAGTCAGTGTTGTGGTAGGTCCCACAGAGGAAGCTCGCCTCGGCACGACGCACTCGCTCTCGAGACTTCGCGACGGAACATGCTCACAGTGGTGTGAGATGGCCCTGTCGCTGCAGTCCAACCAGACCTCGCGAGGAGTTTCTGTAAGATCACTATCAAGTATGAGGGAGGAATCGTTGCCGATGTTGGCTGAGCGTATACTAGCAAGCCGTGCTGCTGCTTTAGTGGCCGGTCTACCTGCGGAACTATATTCTGGAGCATTACTAGCAGCTTGGCCACCGTCACCGCCCGCCCCCTTATTGCCTCCACCAGCTCCGGAAATAGAAAGAGCAAGGAAACGACGACGAACGCCGAAACTAGACGATACAACGTCCCCTGCCCCACCATCACCACCGTCCTCTGGTTCATCACCAGGAGCTACCGACCCACCGCGAGATAAATTGTTCACTTGCAAAGTTTGTTCGAGGTCCTTCGGATACAAACACGTCCTCCAAAACCACGAGCGAACACACACTGGTGAAAAACCATTCGAGTGCGGAGAATGCCACAAACGGTTTACTCGTGATCACCATTTGAAGACACACCTTCGCCTGCACACTGGCGAGAAGCCGTACAGTTGTCCGCATTGTCCACGGCACTTCGTACAAGTCGCTAATCTGAGAAGACATCTTCGGGTTCACACAGGAGAGAGACCCTACGCCTGCGCTCGCTGTCCAGCTCGATTCTCAGACTCGAATCAGCTCAAAGCTCATGCTCTGGTACACGAAGGTGACGCCCCGTTCGCTTGCCGATGTGGAGCGAGATTCAGGAGACGACAGGCTGCAGCTTTACACCGCTGTCCGAGCGGTGGCTGCGAGCCCGGCACGCCAAGCCCTCCAGCCTCGATCGCCCCGGATTGGCGCTGGGACGACTGGCCGGAGCAAACCGAGCCAGAAGATTTATCTCTGCCCCGAAGACCGGCGACTCCGGATTCGCCGACTGACTTGCGCGTGCACGCGGCGTAG

Protein sequence:

>DPOGS211954-PA
MSVVGVEGAGRGALCGTREEIVHRVHTAAVKTGGEAGAAAGRAGTRARGARGPRRPWWGVGVCLQRGAAGQRRATVSVVVGPTEEARLGTTHSLSRLRDGTCSQWCEMALSLQSNQTSRGVSVRSLSSMREESLPMLAERILASRAAALVAGLPAELYSGALLAAWPPSPPAPLLPPPAPEIERARKRRRTPKLDDTTSPAPPSPPSSGSSPGATDPPRDKLFTCKVCSRSFGYKHVLQNHERTHTGEKPFECGECHKRFTRDHHLKTHLRLHTGEKPYSCPHCPRHFVQVANLRRHLRVHTGERPYACARCPARFSDSNQLKAHALVHEGDAPFACRCGARFRRRQAAALHRCPSGGCEPGTPSPPASIAPDWRWDDWPEQTEPEDLSLPRRPATPDSPTDLRVHAA-
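
The reported lengths (1227 bp transcript, 408 aa protein) are consistent with a complete coding sequence that ends in a single stop codon, since 3 x (408 + 1) = 1227. The sketch below shows one way to check this kind of record: it translates a coding sequence with the standard genetic code and writes the stop codon as "-", as the protein sequence above does. The function names `translate` and `expected_cds_length` are hypothetical helpers, not part of the database, and the sketch assumes the transcript is in frame from its first base and uses the standard (not mitochondrial) code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence (hypothetical helper).

    Stop codons are written as `stop_symbol`; codons containing
    characters other than A, C, G, T are written as "X".
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


def expected_cds_length(protein_aa):
    """Length in bp of a CDS encoding `protein_aa` residues plus one stop codon."""
    return 3 * (protein_aa + 1)
```

Applied to the first 18 bases of DPOGS211954-TA (`ATGAGTGTGGTCGGAGTT`), `translate` gives `MSVVGV`, which matches the start of DPOGS211954-PA. The last nine bases (`GCGGCGTAG`) give `AA-`, which matches its end.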