Monarch geneset OGS2.0

DPOGS213022
Transcript: DPOGS213022-TA, 1011 bp
Protein: DPOGS213022-PA, 336 aa
Genomic position: DPSCF300024 + 277159-278332
RNAseq coverage: 19x (Rank: top 80%)
Annotation
Heliconius: HMEL021154 (2e-143, 78.48%)
Bombyx: BGIBMGA006942-TA (2e-110, 67.14%)
Drosophila: Sp1-PD (3e-50, 73.81%)
EBI UniRef50: UniRef50_B0XGA4 (6e-56, 66.43%) Transcription factor btd n=2 Tax=Culicinae RepID=B0XGA4_CULQU
NCBI RefSeq: XP_001868676.1 (1e-56, 66.43%) transcription factor btd [Culex quinquefasciatus]
NCBI nr blastp: gi|383857465 (4e-56, 67.07%) PREDICTED: transcription factor Sp3-like [Megachile rotundata]
NCBI nr blastx: gi|383857465 (4e-58, 67.07%) PREDICTED: transcription factor Sp3-like [Megachile rotundata]
Group
Gene Ontology:
  GO:0003676 (1.3e-19) nucleic acid binding
  GO:0008270 (3.8e-06) zinc ion binding
  GO:0005622 (3.8e-06) intracellular
KEGG pathway 
InterPro domain:
  [129-160] IPR013087 (1.3e-19) Zinc finger, C2H2-type/integrase, DNA-binding
  [138-162] IPR007087 (3.8e-06) Zinc finger, C2H2
Orthology group: MCL25291 Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS213022-TA
ATGCAATACGTGGATCCGCCCGCGCAACAGGTGATGAATGTGATGAGTGGTTACGGTTGTGGTTACGGGCGTAGTGATGCTCTGTCGCCAGCCTCGGATTTATCATCGTGCAGCAGTGCCTCGTCCGGTGCTTGGTGTGGCACCAGCTGGAGAGAGCTGCCACCATACCCTCAGTATCCCGCATATTGGCCACCAACTACTTCAACCACCGAAGACGCAAGAGAAATTCAACGGCGTTGCGCAAAATGCCGTTGCCCTAACTGCCTTACAGAAGCAGCTGGATTTGGGCCAAACTACGGCAAGGATGGCGCGAAAAGGGAGCATGTGTGTCACGTTCCCGGATGCGGGAAAGTTTATGGAAAAACATCTCATCTTAAAGCCCATCTCCGCTGGCACACAGGGGAGAGACCTTTCGTGTGCAACTGGCTATTTTGTGGAAAGAGATTTACCCGTTCCGATGAACTACAGCGTCATCTTCGCACTCATACTGGTGAGAAAAGATTTGCTTGCCAGTTGTGTACGAAACGTTTCATGCGCTCCGATCACCTCGCGAAACATGTAAAGACACACGCGAATGTGTCGAGGAAGGCGAAAAAGGCTAAAGACGATGAAAAGGAATCGATTGTTAAGTCTAATGATGAGAAAGCAGAACAAAGACAAATAACTGAAATAACACCACCTCCAGCTGTAGCTCCAATCGGTAGTTCGAACTATGGAACAGTGCCCACATTAAACGAGGGGCCGAAGCAGATTTTAAATTACGCAACAGTAGCCCCTCACGTAATGTCAGGTGCTCCAGCGAGTTATAATAGTAGTGGTGTGTTTGCTAACGGAAATGTCCCTTACAGCAGTGGTTGGTACCCGGAGGTTCGTCAGGACTCATTGTACTCCAGAGACCCACGTTTCTACCAACAATATCCCTCACACCTAACATACCAGTGCGCCTCCAAGGATAACTATGTTTTCCAAGGACATTATAATTTCCAACCACCCGTGGCCATAGGACAGTGA

Protein sequence:

>DPOGS213022-PA
MQYVDPPAQQVMNVMSGYGCGYGRSDALSPASDLSSCSSASSGAWCGTSWRELPPYPQYPAYWPPTTSTTEDAREIQRRCAKCRCPNCLTEAAGFGPNYGKDGAKREHVCHVPGCGKVYGKTSHLKAHLRWHTGERPFVCNWLFCGKRFTRSDELQRHLRTHTGEKRFACQLCTKRFMRSDHLAKHVKTHANVSRKAKKAKDDEKESIVKSNDEKAEQRQITEITPPPAVAPIGSSNYGTVPTLNEGPKQILNYATVAPHVMSGAPASYNSSGVFANGNVPYSSGWYPEVRQDSLYSRDPRFYQQYPSHLTYQCASKDNYVFQGHYNFQPPVAIGQ-