Monarch geneset OGS2.0

DPOGS205904
Transcript: DPOGS205904-TA (3240 bp)
Protein: DPOGS205904-PA (1079 aa)
Genomic position: DPSCF300089 - 212220-224514
RNAseq coverage: 590x (Rank: top 22%)
Annotation
Heliconius      HMEL013008        0.0  65.61%
Bombyx          BGIBMGA007089-TA  0.0  79.32%
Drosophila      crol-PE           0.0  69.16%
EBI UniRef50    UniRef50_E0VFL0   0.0  65.94%  Zinc finger protein, putative n=6 Tax=Arthropoda RepID=E0VFL0_PEDHC
NCBI RefSeq     XP_002424904.1    0.0  65.94%  zinc finger protein, putative [Pediculus humanus corporis]
NCBI nr blastp  gi|242008211      0.0  65.94%  zinc finger protein, putative [Pediculus humanus corporis]
NCBI nr blastx  gi|242008211      0.0  52.56%  zinc finger protein, putative [Pediculus humanus corporis]
Group
Gene Ontology    GO:0003676  7.9e-13  nucleic acid binding
                 GO:0008270  5.3e-06  zinc ion binding
                 GO:0005622  5.3e-06  intracellular
KEGG pathway
InterPro domain  [662-689]  IPR013087  7.9e-13  Zinc finger, C2H2-type/integrase, DNA-binding
                 [671-693]  IPR007087  5.3e-06  Zinc finger, C2H2
Orthology group  MCL10000  Multiple-copy universal gene
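
The C2H2 zinc finger annotation can be spot-checked with a simple pattern search. The sketch below is an illustration only. It assumes the PROSITE-style C2H2 consensus C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H, which is not necessarily the method behind the InterPro hits listed here, and it scans a short fragment copied from DPOGS205904-PA rather than the full protein. The names C2H2_PATTERN and find_c2h2 are hypothetical.

```python
import re

# Assumed PROSITE-style C2H2 consensus, written as a regular expression.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")

def find_c2h2(protein):
    """Return (1-based start, matched text) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in C2H2_PATTERN.finditer(protein)]

# Fragment copied from the DPOGS205904-PA sequence in this record.
fragment = "KPYACSTCHKTFSRKEHLDNHVRSHTGE"
hits = find_c2h2(fragment)
for start, motif in hits:
    print(start, motif)
```

Start positions are relative to the string passed in, so scanning the full protein sequence instead of the fragment would report positions in protein coordinates.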

Nucleotide sequence:

>DPOGS205904-TA
ATGAATCAGGATCATCACAATATAAACACGGGTGGTGGCCAGCCACCAGGCAATTCTGAGCCTCAAACTCAGAGAGTTCAATCATCGCAGCAGCAACAGCAGCAGCAAACACAAAATAACAGTTTAACTCCCACAACATCTGCCACTGATTTACGGGTCAATTCGGCGGCAGTAAACGTTGCTTTGTCAAGCGTAGCTAAATATTGGGTGTTTACTAATTTGTTTCCGGGTCCAATACCTCAGGTGTCAGTGTATGGCGTGCCCACTGGTGCAAGAATTGAAAATGGAAAACCTGTACAGGATCTAGGTCAAGCGCATGCTAGTATCTTGAATGGTGATCCTAGTATTATACTAGGTCATGCAGGACAACCCCAGGTCACAGTATCAGCTGCGGGACAACAGATTCCTGTCTCCCAGATAATCGCAACTCAATCAACACAAGGACATGAGTCGCTGGTCGGTCACGGGTCGGGTGAATCGGGCGGTTCGCTGGCAACACCCTCGCAGGTACCCAATCGGGTCGAGTTTGTACAACATAACGTTGACATGGGTCACCACTCGCAGCAGCATCTGATGCAACAACAGCTAGCCATGGCAAGACCCGACCACGCCAATCAACAGATCCAGCTCACTGTGAGCGAGGATGGGATAGTAACAGTGGTGGAGCCGGGCGGAAACAAGTTGGTTGACAAAGAAGAACTACACGAGACTATAAAGATGCCCACAGATCATACACTCACTGTACACCAACTGCAACAGATTGTTGGACATCATCAGGTGTTAGACAGCGTGGTTCGCATCGAGCAGGCCACGGGCGAGCCCGCGAACATACTTGTGACTCAAAATCCCGACGGTACCACCTCCATAGAAACTAGCGCCGCAGACCCGTTGACGGTCAAGGATGAAAAAGGAACTAAAATGGAAACCGCACAGTTCGCCATACCAGCCGAAATCAAAGACATCAAGGGCATTGACCTTAAGAGTGCGATGGGTATGGAAGGCGCTGTGGTAAAGATATCAACTGGCTCCGAACAACACGATCTACACACCATGTATAAAGTTAATGTCGAAGATTTATCACAGTTGTTAGCCTATCATGAGGTCTTTGGAAAATTAAACACTGAGGGACAACCCCAAGCTAAGGTGATAAACGAGGTGGAAGTTGAGGCCGGAACTAGCGCTGCCATGTCGGAAGCGGAGTCATCTCCCGGCCATCATTCTTGTGATATATGCGGGAAAATATTCCAGTTCAGATATCAACTTATAGTACACAGACGGTACCATGGCGAAAACAAACCACATACATGTCAAGTGTGCGGGTCCGCTTTCGCTAATCCAGTTGAACTGTCCAAACATGGGAAATGTCATTTGGCGGGAGATCCGACCGAACGCCAGGCCAAACGTCTTGCTCAGGACAAACCGTATGCTTGTTCTACTTGCCACAAGACGTTCTCGCGTAAGGAACACCTCGATAACCACGTGCGAAGTCACACTGGAGAAACACCCTATAGATGTCAGTTCTGCGCCAAGACGTTCACTCGCAAGGAGCACATGGTGAACCACGTCCGTAAACACACGGGCGAGACTCCGCACCGCTGTGAGATCTGCAAGAAGAGCTTCACGAGGAAGGAGCACTTCATGAACCACGTCATGTGGCATACCGGTGAAACACCGCACCATTGTCAAATATGCGGCAAGAAGTATACTAGGAAGGAGCATTTAGTGAACCATATGAGATCCCATACAAACGATACTCCCTTCAGATGCGATCTGTGCGGCAAATCATTCACCAGAAAGGAACACTTCACCAATCATATATTGTGGCACACTGGTGAGACTCCCCACCGCTGCGACTTCTGTTCGAAGACTTTTACCCGTAAGGAGCATCTTTTAAACCACGTGCGACAACACACGGGCGAGTCTCCGCACCGCTGTAACTACTGCGCCAAGTCATTCACACGGCGAGAACATCTCGTGAACCACGTGAGGCAACATACTGG
CGAGACGCCATTCCAGTGTGGATACTGCCCTAAGGCTTTCACTAGAAAGGATCATCTTGTAAACCACGTCCGGCAGCATACTGGGGAGTCTCCACACAAATGTTCTTTCTGCACGAAGTCTTTTACTCGCAAAGAACATTTGACCAACCACGTGCGTCAACACACAGGAGAATCTCCGCATCGGTGTATTTATTGCTCCAAATCTTTCACTAGAAAAGAACATTTAACTAATCACATTAGACAACATACGGGCGAGACTCCTCACAAGTGCACGTACTGTCCGCGTGCGTTCGCGAGGAAGGAACACCTCAACCAGCACGTGAGGCAGCACGTGGGCGACTCCCCGCACACCTGCTCCTACTGCCAGAAGACCTTCTCCAGGAAGGAACATCTAGTAACTCACGTCCGACAACACACGGGTGAGACTCCATTCAAATGCACCTTCTGCGCCAAATCGTTCAGTCGAAAAGAGCATCTAACGAATCACGTTCACCTTCATACCGGCGAAACGCCGCACAAATGCCCCTTCTGTACCAAGACGTTCTCGAGGAAGGAACACTTGACTAATCATGTTAGGATCCACACAGGAGAATCTCCACATCGATGTGAATTCTGTCAGAAGACATTTACCCGTAAGGAGCATTTGACGAATCATCTAAAGCAACACACCGGCGACACGCCGCACGCCTGCAAAGTGTGCTCCAAACCATTCACTAGAAAAGAACATCTCATTACTCACATGAGGTCCCACAGCTGTGGCGAGCGACCATATAGTTGTGGCGAATGCGGGAAATCCTTCCCTCTGAAGGGCAACCTATTATTCCATGAGCGATCTCACAACAAAAACAACGCAGCTAACAAGCCGTTCCGATGTGATGTGTGCTCCAAAGAGTTTATGTGCAAAGGTCATCTAGTAACACATAAGAGAACTCACACGGACACTGAAACACCGGCCGCTGAAACGGCTCCAGAAGATGATTGCGGAGATTTCACTAAATGCGAAAAAGACGCTGATAGACCTGAACGAAAGCACGATATTAGGACAACAACAGAAAATAGACCAGCGGAAACGAATGTCACAAGCAATCAGCCAACAAATACAGCAGTGATGCAAATAACTAGCCAGGAAGTTAGAACGTGCCCCACAACAAGCACGCCATCTGTTGCCGGTACATACACACATACAAATACCCATCACAGTGGAACGATAACACACCATCCAGTGTCCGTGAATTACTAG

Protein sequence:

>DPOGS205904-PA
MNQDHHNINTGGGQPPGNSEPQTQRVQSSQQQQQQQTQNNSLTPTTSATDLRVNSAAVNVALSSVAKYWVFTNLFPGPIPQVSVYGVPTGARIENGKPVQDLGQAHASILNGDPSIILGHAGQPQVTVSAAGQQIPVSQIIATQSTQGHESLVGHGSGESGGSLATPSQVPNRVEFVQHNVDMGHHSQQHLMQQQLAMARPDHANQQIQLTVSEDGIVTVVEPGGNKLVDKEELHETIKMPTDHTLTVHQLQQIVGHHQVLDSVVRIEQATGEPANILVTQNPDGTTSIETSAADPLTVKDEKGTKMETAQFAIPAEIKDIKGIDLKSAMGMEGAVVKISTGSEQHDLHTMYKVNVEDLSQLLAYHEVFGKLNTEGQPQAKVINEVEVEAGTSAAMSEAESSPGHHSCDICGKIFQFRYQLIVHRRYHGENKPHTCQVCGSAFANPVELSKHGKCHLAGDPTERQAKRLAQDKPYACSTCHKTFSRKEHLDNHVRSHTGETPYRCQFCAKTFTRKEHMVNHVRKHTGETPHRCEICKKSFTRKEHFMNHVMWHTGETPHHCQICGKKYTRKEHLVNHMRSHTNDTPFRCDLCGKSFTRKEHFTNHILWHTGETPHRCDFCSKTFTRKEHLLNHVRQHTGESPHRCNYCAKSFTRREHLVNHVRQHTGETPFQCGYCPKAFTRKDHLVNHVRQHTGESPHKCSFCTKSFTRKEHLTNHVRQHTGESPHRCIYCSKSFTRKEHLTNHIRQHTGETPHKCTYCPRAFARKEHLNQHVRQHVGDSPHTCSYCQKTFSRKEHLVTHVRQHTGETPFKCTFCAKSFSRKEHLTNHVHLHTGETPHKCPFCTKTFSRKEHLTNHVRIHTGESPHRCEFCQKTFTRKEHLTNHLKQHTGDTPHACKVCSKPFTRKEHLITHMRSHSCGERPYSCGECGKSFPLKGNLLFHERSHNKNNAANKPFRCDVCSKEFMCKGHLVTHKRTHTDTETPAAETAPEDDCGDFTKCEKDADRPERKHDIRTTTENRPAETNVTSNQPTNTAVMQITSQEVRTCPTTSTPSVAGTYTHTNTHHSGTITHHPVSVNY-
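
The lengths stated in this record are consistent with each other: a 3240 bp coding sequence holds 1080 codons, and the final codon (TAG) is a stop, which leaves 1079 residues. A minimal sketch of that check follows, assuming the transcript is entirely coding sequence with no untranslated regions. The helper name expected_protein_length is hypothetical.

```python
def expected_protein_length(cds_bp):
    """Residue count for a CDS whose last codon is a stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_bp // 3 - 1  # subtract the terminal stop codon

transcript_bp = 3240  # length stated for DPOGS205904-TA
protein_aa = 1079     # length stated for DPOGS205904-PA
print(expected_protein_length(transcript_bp) == protein_aa)
```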