Monarch geneset OGS2.0

DPOGS216059
Transcript: DPOGS216059-TA, 1380 bp
Protein: DPOGS216059-PA, 459 aa
Genomic position: DPSCF300067 + 228706-236505
RNAseq coverage: 25x (Rank: top 77%)
Annotation
Heliconius: HMEL015047 0.0 87.39%
Bombyx: BGIBMGA009025-TA 6e-105 78.06%
Drosophila: gsb-n-PA 3e-109 84.30%
EBI UniRef50: UniRef50_UPI00021A7D29 5e-116 70.32% UPI00021A7D29 related cluster n=3 Tax=unknown RepID=UPI00021A7D29
NCBI RefSeq: XP_974185.1 2e-117 66.46% PREDICTED: similar to gooseberry-neuro CG2692-PA [Tribolium castaneum]
NCBI nr blastp: gi|91089659 3e-116 66.46% PREDICTED: similar to gooseberry-neuro CG2692-PA [Tribolium castaneum]
NCBI nr blastx: gi|91089659 6e-113 65.24% PREDICTED: similar to gooseberry-neuro CG2692-PA [Tribolium castaneum]
Group
Gene Ontology: GO:0003677 1.8e-78 DNA binding
GO:0006355 1.8e-78 regulation of transcription, DNA-dependent
GO:0005515 7.1e-43 protein binding
GO:0043565 4.3e-28 sequence-specific DNA binding
GO:0003700 4.3e-28 sequence-specific DNA binding transcription factor activity
KEGG pathway 
InterPro domain: [13-134] IPR001523 1.8e-78 Paired box protein, N-terminal
[14-135] IPR009057 7.1e-43 Homeodomain-like
[16-81] IPR011991 8.7e-37 Winged helix-turn-helix transcription repressor DNA-binding
[174-236] IPR001356 4.3e-28 Homeobox
[170-236] IPR012287 5.7e-27 Homeodomain-related
Orthology group: MCL18400 Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS216059-TA
ATGGAACGCCACCAAAATGGTATGGATGTGTGGGTAGGTCAGGGCAGGGTTAACCAATTGGGTGGTTTGTTCATCAATGGACGACCTCTGCCGAACCACATCCGCCTGAAGATCGTTGAGATGGCAGCGGCGGGAGTACGGCCTTGCGTCATCTCAAGACAGTTGAGGGTCTCTCACGGCTGTGTCTCTAAAATACTCAATAGATACCAGGAGACTGGTTCCATCAGACCCGGTGTAATTGGCGGGTCAAAACCAAGAGTTGCGACCCCTGAAGTTGAAGCTAGAATTGAAGAACTCAAAAGACAAAATCCTGGCATATTCTCCTGGGAAATCCGTGAAAAACTTATTAAGGAGGGGGTATCCGATCCGCCCAGTATCTCATCCATATCTCGTTTACTGCGAGGCGGCTCTAGAGATCCAGATGGGAAGAAAGATTACAGCATCGATGGCATATTAGGAGGCCGTGGATCAGACTCTTCAGATATAGAATCGGAGCCAGGCCTGACCCTTAAAAGAAAACAGCGTAGATCTCGCACCACGTTTACAGGAGAACAGCTTGACGCGCTTGAGAGAGCTTTTCATAGAACACAATATCCTGATGTATACACTAGAGAAGAACTTGCTTTGCAGACTGGCCTTACCGAGGCCAGAATACAAGTTTGGTTCTCCAATAGAAGAGCTCGACTACGAAAGCATACTGGTTCAAATCCGACTCCTTCACTCGCTAGTTATTCGACGATACCAATGCCGCAGATACCGTGCCCGTATCCTGCCGGAGAAATACCTTCACTATCTCAACATCACCCGCAACATCCGGATGCCTGGCATCATCAAAAGTATGCCAATTATAACCAGCTAATGGCTCAGTCTCAACATCTTAACCAAGCTTTTCAAACTGCAGCCTTCCCCAGCACCTCTGGGACTACTTTCAGCCATTTAGTGACCGGTGCTAGCGCACCAACTCACAGTCAGCTTCTTGATAGCACTCCAAGAACTGATTATCCTCGATATCCCACTGATGTCTACAACAAACCCATCAGTTATATGCCTAAAGATACGGAAGCGGAAGATAAGGGAGTGGGAGAAGAAATTATAGAGCAACGTGAAGAAGCTTACATAAAAACAGGTGGAAATGAATACAAAGAATTAGCGACCAGTGATTATCCTAAAGTTCCTACTGATTATTCTAAGCTTTCTGTTGATCCTTCCTCCACCAACTGGACTGCATCTAATAACTCCTTGAATATGAGTCTATCTGGATTATCTAGTGACTATAAATATATGAGTGACCCTTATGCTTTTCCTGCTATCGCGTCGGATACCCTAAATCAACATACCTACACCAATCCAGGAAATGCAGCCAATAAATACTGGATTTGA

Protein sequence:

>DPOGS216059-PA
MERHQNGMDVWVGQGRVNQLGGLFINGRPLPNHIRLKIVEMAAAGVRPCVISRQLRVSHGCVSKILNRYQETGSIRPGVIGGSKPRVATPEVEARIEELKRQNPGIFSWEIREKLIKEGVSDPPSISSISRLLRGGSRDPDGKKDYSIDGILGGRGSDSSDIESEPGLTLKRKQRRSRTTFTGEQLDALERAFHRTQYPDVYTREELALQTGLTEARIQVWFSNRRARLRKHTGSNPTPSLASYSTIPMPQIPCPYPAGEIPSLSQHHPQHPDAWHHQKYANYNQLMAQSQHLNQAFQTAAFPSTSGTTFSHLVTGASAPTHSQLLDSTPRTDYPRYPTDVYNKPISYMPKDTEAEDKGVGEEIIEQREEAYIKTGGNEYKELATSDYPKVPTDYSKLSVDPSSTNWTASNNSLNMSLSGLSSDYKYMSDPYAFPAIASDTLNQHTYTNPGNAANKYWI-
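
Consistency check (illustrative):

The transcript and protein lengths listed in this record can be cross-checked against each other. A coding sequence for a protein of N residues, including its stop codon, should be 3 x (N + 1) nucleotides long. The following is a minimal sketch of that check in Python. The function names `expected_cds_length` and `looks_like_complete_cds` are hypothetical helpers introduced here for illustration and are not part of any database tool. The sketch assumes the transcript is a complete coding sequence with no untranslated regions and that the trailing "-" in the protein sequence marks the stop codon.

```python
# Lengths as listed in the record (1380 bp transcript, 459 aa protein).
TRANSCRIPT_BP = 1380
PROTEIN_AA = 459

# Stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def expected_cds_length(protein_aa: int, has_stop: bool = True) -> int:
    """Nucleotides needed to encode protein_aa residues, plus an optional stop codon."""
    return 3 * (protein_aa + (1 if has_stop else 0))


def looks_like_complete_cds(cds: str) -> bool:
    """True if cds starts with ATG, ends with a stop codon, and is a whole number of codons."""
    cds = cds.upper()
    return (
        len(cds) % 3 == 0
        and cds.startswith("ATG")
        and cds[-3:] in STOP_CODONS
    )


if __name__ == "__main__":
    # Compare the listed transcript length with the length implied by the protein.
    print(expected_cds_length(PROTEIN_AA) == TRANSCRIPT_BP)
```

To apply the second check to this record, pass the full DPOGS216059-TA sequence (which begins with ATG and ends with TGA) to `looks_like_complete_cds`.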