Monarch geneset OGS2.0

DPOGS208361
Transcript: DPOGS208361-TA (1986 bp)
Protein: DPOGS208361-PA (661 aa)
Genomic position: DPSCF300251 + 429355-432182
RNAseq coverage: 833x (Rank: top 15%)
Annotation
Heliconius: HMEL002758  7e-98  77.57%
Bombyx: BGIBMGA009570-TA  2e-128  66.92%
Drosophila: CG9915-PB  2e-99  57.27%
EBI UniRef50: UniRef50_Q7QFC0  3e-102  63.27%  AGAP000400-PA n=3 Tax=Anopheles gambiae RepID=Q7QFC0_ANOGA
NCBI RefSeq: XP_002430170.1  2e-116  51.02%  conserved hypothetical protein [Pediculus humanus corporis]
NCBI nr blastp: gi|242019442  4e-115  51.02%  conserved hypothetical protein [Pediculus humanus corporis]
NCBI nr blastx: gi|189235381  1e-167  54.53%  PREDICTED: similar to CG9915 CG9915-PB [Tribolium castaneum]
Group:
Gene Ontology:
  GO:0005634  1.8e-17  nucleus
  GO:0003677  1.8e-17  DNA binding
  GO:0006351  1.8e-17  transcription, DNA-dependent
KEGG pathway:
InterPro domain: [486-540]  IPR017923  1.8e-17  Transcription factor IIS, N-terminal
Orthology group: MCL11525  Multiple-copy universal gene

Nucleotide sequence:

>DPOGS208361-TA
ATGTCTCGCGATGGTTCAGTTCGCTCCAAGTCAGCGTCTCGTTCTCGTTCTCGCTCCAAGTCGTCTTCCCGCTCTCGCTCCAGATCCCACTCGTCCAGGTCAAGGTCGGGCTCCAGATCTCCCAGCGGCTCGCGGAAGTCCCGGTCTCGCAGTAGTTCCCCAAAATCTCCCCGCTCCAGAAGCGGATCGGCTAATTCTAATCGATCTCTGAGCGGATCGAATAAATCCAGATCTAGGAGTGGGTCTCCTAGGAAATCCAAATCGAGGAGTCCTTCGGTCGCGTCCAAGGCTGAATCTAGATCGAGGAGTGGATCGGCACACTCGCGCTCCCGGTCTAGGAGCGGCTCTCCCGCTAAGTCTCGTTCTCGTACCGGTTCCCCTGCAAAGTCTCGGTCGCGCAGTGTGTCGCGTGCCAAGTCGCGATCCCGAAGTCGGTCTGGGAGCGGATCTCCGAGAAAATCGAGATTCAGGAGCGGGACACCCAGGAATTCTAAATCCAAAAGTAGATCGAGGTCAAGGAGTTTATCGAAACGTTCTCGATCGAACAGCGTGTCTCCAGAAAAAGCCAGATCGAGAAGCGGATCCGTCAAATCAGACACAGCTCGAAAATCAAGATCTAAGAGTCCTTCACCGAGTACAGAGTCCAAGAAGAAATCGAGATCTCGCAGCCTTTCACCTCAGAAGGCATCCAGTAAGTCCCCAGACAGTAAACAAAGGAATGAATCTAAACAAATGGAAACCGAGGAAAAGGAGCTTGACAGGCCGGGGTCGGCCGCTGATGTGAGAGCCAGTCGGTCTCGCTCCAGGTCAGTCGGTCGTAAGTCTGGGTCTCGTTCTCGTTCCCGGTCCGGTTCCCGGTCTCGCTCTCACTCCAGATCAAAGTCTCGATCACGCTCACGATCCGGCTCAGCGAAGTCAAGGTCCCGTTCTCGTTCTGGGTCCCGCTCGGGGTCCGGCTCACCGTCTCGTAAAGAACACAAAAAACGACGCACGGTCCGCCTGGCCTCAGACGACGAGAACGAGGGCGTGGCCGAGGGGAGGGAGGAAGAAGAGGTCGGAGAGGGAGTCGTTGAGGAGGAGGAGGGGGAGGACGAGGAGGGAGGGGGAGGTGGCAGAGAACAACACGGACTGTCCGACTTCGAGGCTATGATGCAGAGGAAGAGAGAGGAGCGGCGAGGAAGACGCAGGAGGAGAGACATCGAGATGATCAACGACAACGACGACCTCATAGCGGCGCTCCTCGCGGACATGCGGCGGGCGGCGGACGAGGACCGCGAGCTGAACCGAAGGAATCAGCCCGCCGTGAGGAAGGTGTCCATGCTGAAGAGAGCCGTGTCGCAGCTCATCAAGAGAGACCTGCAGCTGGCTTTCCTGGAGGCCAACGTGCTCAACGTGCTGTGCGACTGGCTGGCGCCGATGCCCAATAGAGCGCTGCCCTGTCTGCTCATCAGGGAGAGCGTGCTGAAGCTGCTCATGGATTTCCCAGCCATCGACAAGTCTCTTCTCAAGCAGTCGGGGATCGGCAAAGCGGTGATGTACCTCTACAAGCATCCCAAGGAAACGAAAGCTAACAAAGAGCGTGCCGGCCGCCTCATATCCGAGTGGGCCCGACCGATATTCAACTTGTCCACAGACTTCAGAGCTATGACACGAGAGGAGCGACAGGCGCGAGACGAGGCCATGTCGGGGAATAGGAGGAGGGAGGAAGCCCCGCCCAGCAAGAGAACCCGCACAGAGGAACCGGAGAGAGCTGTCCGTCCCGGTGAGCCGGGCTGGGTGTCCCGGGCGAGGGTTCCCGCGCCCTCCAACAAGGACTACGTGGTGAGGCCCAAGTCTACCTGCGACCTGGACATGTCCCGGGTCAGCAAGAAGAAGATGACGCGCTACGAGAAGCAGATGAAGAAGTTCCTCGACCAGAAGAGAATGAAGGGAGGGACCAAGAGAGCCGTCGAGATCTCCATAGAGGGGAGGAAGATGGCGCTGTAG

Protein sequence:

>DPOGS208361-PA
MSRDGSVRSKSASRSRSRSKSSSRSRSRSHSSRSRSGSRSPSGSRKSRSRSSSPKSPRSRSGSANSNRSLSGSNKSRSRSGSPRKSKSRSPSVASKAESRSRSGSAHSRSRSRSGSPAKSRSRTGSPAKSRSRSVSRAKSRSRSRSGSGSPRKSRFRSGTPRNSKSKSRSRSRSLSKRSRSNSVSPEKARSRSGSVKSDTARKSRSKSPSPSTESKKKSRSRSLSPQKASSKSPDSKQRNESKQMETEEKELDRPGSAADVRASRSRSRSVGRKSGSRSRSRSGSRSRSHSRSKSRSRSRSGSAKSRSRSRSGSRSGSGSPSRKEHKKRRTVRLASDDENEGVAEGREEEEVGEGVVEEEEGEDEEGGGGGREQHGLSDFEAMMQRKREERRGRRRRRDIEMINDNDDLIAALLADMRRAADEDRELNRRNQPAVRKVSMLKRAVSQLIKRDLQLAFLEANVLNVLCDWLAPMPNRALPCLLIRESVLKLLMDFPAIDKSLLKQSGIGKAVMYLYKHPKETKANKERAGRLISEWARPIFNLSTDFRAMTREERQARDEAMSGNRRREEAPPSKRTRTEEPERAVRPGEPGWVSRARVPAPSNKDYVVRPKSTCDLDMSRVSKKKMTRYEKQMKKFLDQKRMKGGTKRAVEISIEGRKMAL-
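
The transcript and protein records above can be cross-checked with a few lines of code. The following is a minimal sketch, not part of the original record: it assumes the standard genetic code (NCBI translation table 1) applies to this nuclear gene, and the names `translate`, `first_codons`, and `last_codons` are hypothetical helpers introduced for illustration. It checks that the 1986 bp transcript length equals 661 codons plus one stop codon, and that the first ten and last four codons of DPOGS208361-TA translate to the start and end of DPOGS208361-PA, where the trailing `-` marks the stop codon.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
# Assumption: this table applies to the gene; the record itself does not say so.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol (the record uses '-');
    codons containing unknown bases are written as 'X'.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


# Lengths as listed in the record header.
transcript_bp = 1986
protein_aa = 661
lengths_consistent = transcript_bp == 3 * (protein_aa + 1)  # +1 for the stop codon

# First 30 and last 12 nucleotides copied from the DPOGS208361-TA sequence.
first_codons = "ATGTCTCGCGATGGTTCAGTTCGCTCCAAG"
last_codons = "ATGGCGCTGTAG"

print(lengths_consistent)        # True
print(translate(first_codons))   # MSRDGSVRSK
print(translate(last_codons))    # MAL-
```

To check the whole record, pass the full nucleotide sequence to `translate` and compare the result with the protein sequence as a string.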