Monarch geneset OGS2.0

DPOGS214691
Transcript: DPOGS214691-TA; 1974 bp
Protein: DPOGS214691-PA; 657 aa
Genomic position: DPSCF300022 - 1551915-1563730
RNAseq coverage: 303x (Rank: top 37%)
Annotation
Heliconius: HMEL010830; 6e-148; 68.77%
Bombyx: BGIBMGA005595-TA; 1e-85; 62.69%
Drosophila: GATAd-PA; 3e-29; 35.62%
EBI UniRef50: UniRef50_D6WX57; 4e-42; 36.93%; GATAd n=2 Tax=Tribolium castaneum RepID=D6WX57_TRICA
NCBI RefSeq: XP_001812551.1; 4e-43; 36.66%; PREDICTED: similar to GATAd CG5034-PA [Tribolium castaneum]
NCBI nr blastp: gi|189240277; 8e-42; 36.66%; PREDICTED: similar to GATAd CG5034-PA [Tribolium castaneum]
NCBI nr blastx: gi|270012803; 3e-44; 37.10%; GATAd [Tribolium castaneum]
Group
Gene Ontology:
GO:0006355; 8.1e-23; regulation of transcription, DNA-dependent
GO:0008270; 8.1e-23; zinc ion binding
GO:0003700; 8.1e-23; sequence-specific DNA binding transcription factor activity
GO:0043565; 4.9e-21; sequence-specific DNA binding
GO:0005634; 3.5e-14; nucleus
KEGG pathway 
InterPro domain:
[494-549] IPR013088; 8.1e-23; Zinc finger, NHR/GATA-type
[492-542] IPR000679; 4.9e-21; Zinc finger, GATA-type
[11-89] IPR012934; 3.5e-14; Zinc finger, AD-type
Orthology group: MCL22098; Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214691-TA
ATGTGTAATTTGCCGCCGAAGTTCCATTCAGTGTGTCGCCTTTGTTTATCATTCTGTGGCGATAATTGCAGTGATGTTAAACTCCCAATATTCGATCGTGATAAGGATAAATCCCGGCTTTCCGAGATGATAATGACATATTTGTCCATAATGGTATCATCAGAGGACATGCTGCCGCAGGTGGTATGTGGGAGCTGTGCACACAAACTTGATGAGTTCCACACATTCAGAGAACTGACACACAAGTCTGAGAGACTTTTGGAACAATTTGTACAATACGCTAATTCACTGTCAGGTCCAAAAGAGGATATCCTGAATGTAACCGCCGACAAGTTGGAAGAAATTATTAAGTCCTTAAACGAGAATGATTACGACGATCCCATAAAGAGCAAGTACGATGAGATCGGCTCCCCGGACTCTACCGAGGAGATGAAGAACCTGGAGAGTCGGCAGGCGGCTGTGACACTGCTGCAAATAAAAAACTACGATCCTACTAAATACGCCGTCAAGACTGAGGAAAGTCCTCACATAATGTTCAATAGCGTTCCGAGTTTACCGCCCGCCGACAGAGCGAGAGAGGTTATGCATTGTAATGCCGTCATAGATATAATAAGCAAGGCTGTCGCGGTGGCACAGCGCGAGAACGTTGAATCTCAGAACTACCAGCCGAATTACACGGGCGTCATAGACAGGACGTCCTCTACGGCCGCTGAGCTAGCGTACACGCAGGACTACCCTGAGGAGCAGTACGCTGGGTACAATGCCGCTCATGACGTCACCAGCCCCGGCAGCGATGACGACAACAAGGAAATGGACCTTGAACAGAACGAGCAGTGCGAGCGTGATACGAGCGGCTTCTTACAGAGGAACCACGCCAGTAACAAATCCAGCTTCGTCGAGGAATACAAGCAGCACGTGTTCGGGCAAGCTAAGAAGCATAATGACAGCTCGCCCGTGTACGAGGAGTGCAGTCAGAGCAGCAGCGGCTCAGACCCTGATAGACTACAGATGGATATCTCTGAAGTATCGCAGGACGACCCCGAAGAGACGCAATCGGTGCCATCAGCTCAGTCTTCCCCCAAGCCGCCTCACGACAACGATACGGACAAGGAGTCCCTGTGGCAGGCGCTCCACAAACAGAACGGTCGTGGCGGCGAGGCGACTCACCTGCTGCGGAGGCTCATCAACAGCAAACACCTGGGCATGACGGTGTCCCCGCTCCGGGCCGCGCCCTCACCACTACCACAGACACACCCGCACACACACAACGGCACCGTGTCACCGAACGGTGAGTGGTCGAGTCCCACTCGCGGCGGGTCGGGGGCGGGCGCCGGCACAGCACGCAGGAAGCAGAGCTGCCCGGCTCGAGCACAACCAGCCCTGGACACCACCGGCTGGACCAGCGACCAGCAGGAGAGTCCAGAGAGCGCGTCTAATACAACATCAGGCGTGGTGTCGGGTGGGGCAAGGGGTCCTCGTGTGGAACTGTCCTGCAGTAACTGCGGCACTCACACCACCACCATCTGGCGGAGGGACGCCCGCGGGGAGATGGTGTGCAACGCGTGCGGTCTGTACTACAAGCTGCACGGTGTACCGCGGCCCAGCGCCATGAGGAGGGACACGATACACACACGGCGCAGGCGGCCCAGACACGACGGGAAACATACTAGGAACACCTCGCCAGGCGGCGGTGAGGGAGGGGGGACAGTGGTCAGTACTGAGGGGGAGGTGTCACGCGGTGGAGGCTCTGGAGGGGGAGGGGGAGCCGCCGCAGGGGGGCCGTCTGACGGGGCCGAGGAGGCCGTGCTCGCAGCGCTCAGGAGACAGCTACAACCTCACTTGCTGGCAGCACTACACGCACACACACCCAGGGAACACACGCACACACGTACACAGGGCCGCAGCGTGTCGGAGTACGATGAGGCGCCCCTGAACCTGGTGGCGAGTCACGTGGCCGCCGAGGAGACGCGCTGA

Protein sequence:

>DPOGS214691-PA
MCNLPPKFHSVCRLCLSFCGDNCSDVKLPIFDRDKDKSRLSEMIMTYLSIMVSSEDMLPQVVCGSCAHKLDEFHTFRELTHKSERLLEQFVQYANSLSGPKEDILNVTADKLEEIIKSLNENDYDDPIKSKYDEIGSPDSTEEMKNLESRQAAVTLLQIKNYDPTKYAVKTEESPHIMFNSVPSLPPADRAREVMHCNAVIDIISKAVAVAQRENVESQNYQPNYTGVIDRTSSTAAELAYTQDYPEEQYAGYNAAHDVTSPGSDDDNKEMDLEQNEQCERDTSGFLQRNHASNKSSFVEEYKQHVFGQAKKHNDSSPVYEECSQSSSGSDPDRLQMDISEVSQDDPEETQSVPSAQSSPKPPHDNDTDKESLWQALHKQNGRGGEATHLLRRLINSKHLGMTVSPLRAAPSPLPQTHPHTHNGTVSPNGEWSSPTRGGSGAGAGTARRKQSCPARAQPALDTTGWTSDQQESPESASNTTSGVVSGGARGPRVELSCSNCGTHTTTIWRRDARGEMVCNACGLYYKLHGVPRPSAMRRDTIHTRRRRPRHDGKHTRNTSPGGGEGGGTVVSTEGEVSRGGGSGGGGGAAAGGPSDGAEEAVLAALRRQLQPHLLAALHAHTPREHTHTRTQGRSVSEYDEAPLNLVASHVAAEETR-
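
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, not part of the original record, assuming the coding sequence uses the standard genetic code (NCBI translation table 1) and that the trailing "-" in the protein sequence marks the stop codon. The names `translate` and `lengths_consistent` are hypothetical helpers introduced here for illustration.

```python
# Standard genetic code (NCBI translation table 1), with bases ordered T, C, A, G.
# Assumption: this record's coding sequence follows the standard code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop_symbol`; "-" is assumed here because
    the protein sequence in this record ends with "-".
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        protein.append(stop_symbol if amino_acid == "*" else amino_acid)
    return "".join(protein)


def lengths_consistent(transcript_bp, protein_aa):
    """Check that the transcript encodes the protein plus one stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)
```

For example, `translate("ATGTGTAATTTGCCGCCGAAG")` on the first seven codons of DPOGS214691-TA gives `"MCNLPPK"`, matching the start of DPOGS214691-PA, and `lengths_consistent(1974, 657)` holds because 3 x (657 + 1) = 1974.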