Monarch geneset OGS2.0

DPOGS202118
Transcript: DPOGS202118-TA | 3027 bp
Protein: DPOGS202118-PA | 1008 aa
Genomic position: DPSCF300150 + 308626-317515
RNAseq coverage: 65x (Rank: top 67%)
Annotation
Heliconius: HMEL002385 | 0.0 | 83.12%
Bombyx: BGIBMGA006961-TA | 0.0 | 67.84%
Drosophila: CG43284-PB | 2e-77 | 49.31%
EBI UniRef50: UniRef50_E1ZWS3 | 1e-157 | 48.59% | Myelin transcription factor 1 n=2 Tax=Formicidae RepID=E1ZWS3_CAMFO
NCBI RefSeq: XP_001606059.1 | 1e-159 | 52.70% | PREDICTED: similar to CG32778-PA [Nasonia vitripennis]
NCBI nr blastp: gi|345494043 | 2e-158 | 52.70% | PREDICTED: hypothetical protein LOC100122454 [Nasonia vitripennis]
NCBI nr blastx: gi|332022634 | 0.0 | 44.50% | Myelin transcription factor 1-like protein [Acromyrmex echinatior]
Group
Gene Ontology: GO:0005634 | 5.4e-17 | nucleus
               GO:0006355 | 5.4e-17 | regulation of transcription, DNA-dependent
               GO:0008270 | 5.4e-17 | zinc ion binding
               GO:0003700 | 5.4e-17 | sequence-specific DNA binding transcription factor activity
KEGG pathway:
InterPro domain: [684-714] | IPR002515 | 5.4e-17 | Zinc finger, C2HC-type
Orthology group: MCL10814 | Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202118-TA
ATGTCAGACAGAAACGAAGACAAATACGAAGAGGACAGAGAAGCGAAACGTAGGCAGGCTGAAATAATCTCAGTTAACAAGCAATCTCTTTACAACGCTTACAAAGCTGCCACAGCTTCATTGCCGTATTCGACCCAATCCGCCTTCAAGCCACCAGCGGAAGTGAAACACAAGATCCACGGCTCCAGTTTTCCCAGCGAGCCATTCGGTGGCTATTCTAATGATCACGAGAAGAGTGCGATTCATAAGGGTTCGAAGCAGTACACGGTGCTGCAGCCAGCCGCTGCTGGGTCAAGGGCTGCTACAGCGTTGCAGGAGGCTCGCACTGTACCATCAGCTCAACCGGCGCGAGACTTGCGACCCATCAACCCTCTCTCCCCACCAGCCGTCCGAGGTACAATGATCCTTGGCTCGTCTCGTCCCCAACTTCTTAACGACACCAAACTTCGCTTTATGAATGACTTAAAAGTTTGTTTCAAGAAGGCAACAAGTGTCCGACTCCTGGTTGCAATGGGCAGGGCCACGTTACAGGCCTCTATACGCATCATCGAAGTATTGGCGTTGCACGAGACAATCTTAAAGTGTCCGACGCCGGGGTGTAACGGACGAGGTCACGTCAGTTCTAACCGCAGCACTCACCGCTCTCTCTCTGGCTGTCCGACGGCCGCTGCACGCAAAGCAGCAGCTCGCTCCCAACGAGGCCGACCACCGGTTCAATTGCCCGCACCGGCTCCTGTGACTCAAGTGTCAGCCAGCACGTCAGATAGCCACACTTCAGACGGTTCCCGCGATCGGGTTCCAGCGGCGTCGCCGCAAACTCCGGCCGTGAAGCGCGAAGCCCCCGAACTGCTGGTGCCGAAGCGTGAGGCCGCCGAGCCCGAGCGCGACTCCCCCGGCATGGAGACGCGCCACGCCGGCTACGGCGCGCCGCCCGATCAGCGCTCACCATACGAACGACCGCCTGACGACCACGTACGGTCATACAGTCAAATGAACGAAGCTCGGTACGGATACGAAGCTAGGTGCTATGAGGGCGCCCCAGCTTTTGAGAGATATGACCCAGCTCAATGCCCTCAGAGGCCTTACGGTTGGGAAGAAGAACGATACCATGACCCTCATTTGCCAACGCCAATGAAAACGGACCAATCAGAACAGGAAACTAATTCTGGACCTATATATCCTAGACCAATGTACCATTACGAAGCTGGCGGTGTAGGCGCCGTGGGCGGGGTGGGCGCTATGGGCCCAGGCGTTCCCCCCGGCTTCTCCGCTATCAATCTCTCAGTGAAGATAGCCGCAGCTCAGGCTCAACGTCCTCGAAGTCCCACACCCAGGGATCCTCGTGATCCGCGTCCGGCTATAGATCTATCCACATCTAGTGGCAGTCCACAGGGTCCATATGCGTCACCGGTATACACGAGCGCCGGTGGTGGCAGTGGGGGCGGTGCACGGGGAAGCCCGCAGCCGGGCGCTTCGCCCCAACTTACGGCAAGTCCCCAAGTTCCCAGTCCACAAGGCCAGACCCTCGACCTTAGTGTGTCCCGTTTACCACATAGTAGAAGTTTTCCGGGTGGTGTTTCATACAGTCGAGAATCAACGCCGGATAGCGGTGGAAGCCATCCATATCTTGAAGCATACCATCGCGACACAGCCGCAGGGTACGGTGGTGTAAGCCCTCACCCGGTAGCCGGATACGGTCTTGCGCAGCCGGATTACGCAGCTGCTGCTGCCGCTGCCGGATACGGTGGCTATCAGTACCAATGCGGGGCATACCCACCCCCGCCCGCGTACCCCCCGCACGCGCCGCCGTATTCACCACCGTGCTATATGCCGCCGCCGCACGCACCGCACGACAAGCCCAAGGATAGCTATCACCGCGACGACTTTTACGGGAAACATGGTTACCGTATTCGGGAGTCGAAAGAACTGATCCACTGTCCCGTCCGAAGCTGCGACGGATCCGGACACGTGTCTGGCAACTTTGCAACTCACCGCAGTCT
GTCCGGGTGTCCTCGTGCTGATCGCTCTCAACTGCAGCCACATTCTCAAGAACTGAAGTGTCCCACACCAGGTTGCGACGGCTCCGGGCACGTTACCGGGAACTACTCCTCCCATCGATCACTATCAGGTTGCCCCAGGGCTAATAAACCGAAAAGCAAGCCCAGGGATGGCCAAGATTCTGAACCGCTCAGATGCCCTATACCGGGCTGTGATGGATCTGGGCATGCCACAGGAAAATTCTTATCACACAGAAGCGCGTCGGGCTGCCCTATTGCAAATCGGAACAAAATGCGGGTTCTAGAAAGCGGCGGCACAGTTGAGCAGCACAAAGCGGCAGTGGCGGCAGCGGCATCCGCTATTAAATTCGATGGCGTGAACTGTCCTACCCCGGGATGTGATGGATCGGGACATATAAACGGTTCGTTTCTAACCCATCGTTCGCTATCCGGCTGTCCCGTAGCCGGTGCAACCACACCGACGCCTCAACCAAAGAAACCGAAATATCCTGATGATATCACTCCGCTATACCCCAAGCCCTATTCAGGTATGGATATTAACATGCAGACAGGAAACGGCGAAGATTTAATGACACTGGAGCAAGAAATTACTGAACTCCAGCGTGAAAATGCAAGAGTGGAATCACAGATGATGCGTCTGAAATCGGACATAAACGCGATGGAGTCACACTTGAGCCATGGAGAAAGGGAGAATCAGCTCATCATTCATCGCAACAGCAATCTGAATGAATATTACGAAAGCCTTCGGAACAATGTGATCACGTTGCTGGAGCACGTTAAGATACCAGGAGGAGGTACGGTGCCCGTATCAACTGCCCCCGGAACTCCCGGAGCTGCACCCCCGACTGGCCCTGGTGATAAACCCGCCCACGATAACTTCGACTCTTATCTCACCAAGCTGCAGACCCTATGCTCCCCGGAAGGATACTGCCCCGATGAGAATCGACCGATCTATGAGACCGTTAAAAACGCGCTCCAAGACTTCACAGTGCTACCAACGCCGATATAA

Protein sequence:

>DPOGS202118-PA
MSDRNEDKYEEDREAKRRQAEIISVNKQSLYNAYKAATASLPYSTQSAFKPPAEVKHKIHGSSFPSEPFGGYSNDHEKSAIHKGSKQYTVLQPAAAGSRAATALQEARTVPSAQPARDLRPINPLSPPAVRGTMILGSSRPQLLNDTKLRFMNDLKVCFKKATSVRLLVAMGRATLQASIRIIEVLALHETILKCPTPGCNGRGHVSSNRSTHRSLSGCPTAAARKAAARSQRGRPPVQLPAPAPVTQVSASTSDSHTSDGSRDRVPAASPQTPAVKREAPELLVPKREAAEPERDSPGMETRHAGYGAPPDQRSPYERPPDDHVRSYSQMNEARYGYEARCYEGAPAFERYDPAQCPQRPYGWEEERYHDPHLPTPMKTDQSEQETNSGPIYPRPMYHYEAGGVGAVGGVGAMGPGVPPGFSAINLSVKIAAAQAQRPRSPTPRDPRDPRPAIDLSTSSGSPQGPYASPVYTSAGGGSGGGARGSPQPGASPQLTASPQVPSPQGQTLDLSVSRLPHSRSFPGGVSYSRESTPDSGGSHPYLEAYHRDTAAGYGGVSPHPVAGYGLAQPDYAAAAAAAGYGGYQYQCGAYPPPPAYPPHAPPYSPPCYMPPPHAPHDKPKDSYHRDDFYGKHGYRIRESKELIHCPVRSCDGSGHVSGNFATHRSLSGCPRADRSQLQPHSQELKCPTPGCDGSGHVTGNYSSHRSLSGCPRANKPKSKPRDGQDSEPLRCPIPGCDGSGHATGKFLSHRSASGCPIANRNKMRVLESGGTVEQHKAAVAAAASAIKFDGVNCPTPGCDGSGHINGSFLTHRSLSGCPVAGATTPTPQPKKPKYPDDITPLYPKPYSGMDINMQTGNGEDLMTLEQEITELQRENARVESQMMRLKSDINAMESHLSHGERENQLIIHRNSNLNEYYESLRNNVITLLEHVKIPGGGTVPVSTAPGTPGAAPPTGPGDKPAHDNFDSYLTKLQTLCSPEGYCPDENRPIYETVKNALQDFTVLPTPI-
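
Consistency check (sketch):

The transcript and protein lengths listed for this record agree if the nucleotide sequence is read as coding sequence only: 3027 bp = 3 x (1008 aa + 1 stop codon). The Python sketch below shows one way to run that check on FASTA text. It assumes the transcript carries no UTRs and that a trailing "-" (or "*") in the protein marks the stop codon. The function names and the toy records are hypothetical and are not part of the OGS2.0 data.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into {record_id: sequence}.

    Sequence lines belonging to one record are concatenated, so a
    sequence wrapped over several lines is read as a single string.
    """
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current_id = line[1:].split()[0]
            records[current_id] = ""
        elif current_id is not None:
            records[current_id] += line
    return records


def lengths_consistent(transcript, protein):
    """Return True if len(transcript) == 3 * (residues + 1 stop codon).

    Assumption: the transcript is coding sequence only, and a trailing
    '-' or '*' on the protein is a stop marker, not a residue.
    """
    residues = protein.rstrip("-*")
    return len(transcript) == 3 * (len(residues) + 1)


# Toy records for illustration only (not the real DPOGS202118 sequences).
toy = """>toy-TA
ATGTCA
GACTAA
>toy-PA
MSD-
"""

records = read_fasta(toy)
print(lengths_consistent(records["toy-TA"], records["toy-PA"]))
```

To apply it to this record, paste the two FASTA blocks above into one string and look up the keys DPOGS202118-TA and DPOGS202118-PA.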