Monarch geneset OGS2.0

DPOGS210685
Transcript        DPOGS210685-TA    1245 bp
Protein           DPOGS210685-PA    414 aa
Genomic position  DPSCF300013 - 885685-890946
RNAseq coverage   192x (Rank: top 48%)
Annotation
Heliconius      HMEL021516        3e-104  56.22%
Bombyx          BGIBMGA006301-TA  1e-135  63.85%
Drosophila      YL-1-PA           1e-45   49.57%
EBI UniRef50    UniRef50_Q8T5I1   3e-74   43.10%  AGAP001565-PA n=7 Tax=Neoptera RepID=Q8T5I1_ANOGA
NCBI RefSeq     XP_321546.3       5e-75   43.10%  AGAP001565-PA [Anopheles gambiae str. PEST]
NCBI nr blastp  gi|118794508      1e-73   43.10%  AGAP001565-PA [Anopheles gambiae str. PEST]
NCBI nr blastx  gi|170055629      4e-82   45.70%  transcription factor [Culex quinquefasciatus]
Group
Gene Ontology    GO:0005634  1.9e-47  nucleus
                 GO:0003677  1.9e-47  DNA binding
                 GO:0006355  1.9e-47  regulation of transcription, DNA-dependent
                 GO:0003700  1.9e-47  sequence-specific DNA binding transcription factor activity
KEGG pathway
InterPro domain  [1-217]    IPR008895  1.9e-47  YL1 nuclear
                 [346-374]  IPR013272  1.6e-11  YL1 nuclear, C-terminal
Orthology group  MCL13634   Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210685-TA
ATGGCGCAACGAGAACGTCGGTCCAATGCAGGAAATCGTATGGCAAAATTACTTGATGAGGAGGAAGAAGATGATTTCTATAAAACCACTTACGGCGGCTTTACAGAAGCCGATGAAGACAATGATTATATCCAAGAGAAAGAGGTTGACGATGAAGTGGATTCTGATTTTGATATTGATGAGACTGATCAGCCAATATCTGACACGGAAACCGAAGAAAAATCTAAACAAAAAGTTAGCACTAAGGGTTATAAGGATCCAAATAGAAAAAGAAAAGGAGAAAAGAATAAAAAGGTAATAGTGAAAAAGCCTAAAGAGCCAAAAGCCAAAGAGGAATCCGAAAAAGTCGAGCTCAATGAAATATCAGTAATGGACACATCAATTGAAAGAAAATCTATAAGACAAAGCACAGCCGTCAAATCCCTCGAAACACAGCAGCGGATTAAAATAAGGTCGGAACTTAAAAAGAAAAAACCAAAGAAGGTTGAAGAAAGAGTTCTCACTCAAGAGGAACTTCTTGAAGAAGCTCTTATAACTGAAAAAGAGAACTTAAAGAGTTTGGAGAGATTTGAACAGAGTGAATTAGAAAAGAAGAAAATAAGACCAACAAAGAAAACTATAACGGGTCCAGTCATAAGGTACCATTCTTTTGCGGTCCCGCTAATAACGGAGGTGACGCCGGAGGATAAGATTAGTTTAAACACACCAACAGCGGAACAACCAGAGTTGAATTTAAGTGATTTGATAAAAAGTGAAGATGGTTTAATGCTAGATGATTCAATTGAAAATCAACAGTCAATGCAAACAATGAATGAAAATGGTGATCAAAAAGCTGATAAAAAAGATATACAAATCTTAGAGCCAGATGAGAGCTTTAATATAGAAGACGGAACTATGAAAGAGGAAAAGCCGCAAACAGTTAATGCGAATACAAAATATCATGAGAGGACATTATTATACTTTGAGAATGACATAAAGGATGAGGCGTTCAATGCTTGCTTTCCTCAGAGGAAACCAAAGAAGAAACGAGATCTGCTATGTGCTGTTACAAAACGTCCAGCCAGATATATAGATCCAATAACAAAACTTCCCTATCGTAGCGTGGACGCATTCCGAATTATACGTGAGGCTTACTATCAGCAATTAGAAGCGAGGGGGGACAAAAACGATCCTCAAGTGGCCGCCTGGTTGAAGTCACGGAGACCTAATACAACATCGTCTTACGTACAAATACATATTAAATGA

Protein sequence:

>DPOGS210685-PA
MAQRERRSNAGNRMAKLLDEEEEDDFYKTTYGGFTEADEDNDYIQEKEVDDEVDSDFDIDETDQPISDTETEEKSKQKVSTKGYKDPNRKRKGEKNKKVIVKKPKEPKAKEESEKVELNEISVMDTSIERKSIRQSTAVKSLETQQRIKIRSELKKKKPKKVEERVLTQEELLEEALITEKENLKSLERFEQSELEKKKIRPTKKTITGPVIRYHSFAVPLITEVTPEDKISLNTPTAEQPELNLSDLIKSEDGLMLDDSIENQQSMQTMNENGDQKADKKDIQILEPDESFNIEDGTMKEEKPQTVNANTKYHERTLLYFENDIKDEAFNACFPQRKPKKKRDLLCAVTKRPARYIDPITKLPYRSVDAFRIIREAYYQQLEARGDKNDPQVAAWLKSRRPNTTSSYVQIHIK-
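
The transcript and protein records above can be cross-checked against each other: 1245 bp corresponds to 414 codons plus one stop codon. The sketch below is only an illustration, not part of the database record. It assumes the standard genetic code and uses a hypothetical helper named `translate`, which writes the stop codon as `-` to match the protein record. Only the first and last few codons of DPOGS210685-TA are used here; the full sequence can be passed in the same way.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence (hypothetical helper).

    Stop codons are written as stop_symbol, mirroring the trailing '-'
    in the protein record.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for pos in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3]]
        residues.append(stop_symbol if amino_acid == "*" else amino_acid)
    return "".join(residues)


# First 8 codons and last 5 codons of DPOGS210685-TA, copied from the record.
first_codons = "ATGGCGCAACGAGAACGTCGGTCC"
last_codons = "ATACATATTAAATGA"

print(translate(first_codons))  # MAQRERRS
print(translate(last_codons))   # IHIK-

# Length check: 414 residues plus one stop codon account for 1245 bp.
print(3 * (414 + 1) == 1245)    # True
```

The two translated fragments match the start (`MAQRERRS`) and end (`IHIK-`) of DPOGS210685-PA, which suggests the transcript is reported in frame from the start codon through the TGA stop.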