Monarch geneset OGS2.0

DPOGS202866
Transcript: DPOGS202866-TA, 1311 bp
Protein: DPOGS202866-PA, 436 aa
Genomic position: DPSCF300018 + 1334546-1337605
RNAseq coverage: 98x (Rank: top 61%)
Annotation
Heliconius       HMEL008572        2e-67  96.73%
Bombyx           BGIBMGA010868-TA  6e-60  57.60%
Drosophila       nub-PB            6e-71  77.02%
EBI UniRef50     UniRef50_D2A2H9   3e-75  72.49%  Nubbin n=2 Tax=Tribolium castaneum RepID=D2A2H9_TRICA
NCBI RefSeq      XP_002414009.1    5e-77  75.14%  pou2, putative [Ixodes scapularis]
NCBI nr blastp   gi|23893421       1e-78  64.02%  nubbin [Cupiennius salei]
NCBI nr blastx   gi|170032369      2e-83  56.56%  nubbin [Culex quinquefasciatus]
Group
Gene Ontology    GO:0006355  4.3e-53  regulation of transcription, DNA-dependent
                 GO:0003700  4.3e-53  sequence-specific DNA binding transcription factor activity
                 GO:0005515  4.2e-41  protein binding
                 GO:0003677  7.9e-40  DNA binding
                 GO:0043565  1.5e-17  sequence-specific DNA binding
KEGG pathway
InterPro domain  [206-280]  IPR000327  4.3e-53  POU-specific
                 [225-242]  IPR013847  4.2e-41  POU
                 [210-280]  IPR010982  7.9e-40  Lambda repressor-like, DNA-binding
                 [298-360]  IPR012287  1.5e-22  Homeodomain-related
                 [302-376]  IPR009057  4.7e-18  Homeodomain-like
                 [304-360]  IPR001356  1.5e-17  Homeobox
Orthology group  MCL34465  Lepidoptera specific
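
The bracketed ranges in the InterPro rows read naturally as 1-based, inclusive residue positions on DPOGS202866-PA. The record does not state its coordinate convention, so that reading is an assumption. Under it, a minimal Python sketch for computing domain lengths and cutting a domain out of a protein string could look like the following. The names `DOMAINS`, `domain_length`, and `domain_slice` are illustrative and not part of the database.

```python
# InterPro ranges copied from this record.
# Assumption: coordinates are 1-based and inclusive on the protein sequence.
DOMAINS = {
    "IPR000327": (206, 280),  # POU-specific
    "IPR013847": (225, 242),  # POU
    "IPR010982": (210, 280),  # Lambda repressor-like, DNA-binding
    "IPR012287": (298, 360),  # Homeodomain-related
    "IPR009057": (302, 376),  # Homeodomain-like
    "IPR001356": (304, 360),  # Homeobox
}


def domain_length(start, end):
    """Number of residues covered by a 1-based inclusive range."""
    return end - start + 1


def domain_slice(protein, start, end):
    """Return the residues of `protein` in the 1-based inclusive range."""
    if not 1 <= start <= end <= len(protein):
        raise ValueError(f"range {start}-{end} is outside the sequence")
    # Convert to Python's 0-based, end-exclusive slicing.
    return protein[start - 1:end]


if __name__ == "__main__":
    for accession, (start, end) in DOMAINS.items():
        print(accession, start, end, domain_length(start, end))
```

To use it on this record, pass the protein sequence with the trailing `-` removed, for example `domain_slice(protein.rstrip("-"), *DOMAINS["IPR001356"])`.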
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202866-TA
ATGGCTGAGCGGGTCGGGGGTGACGGTGGACAATTAGATGGGTCCGCGAGCGGGGAAGGCGAGGGCGAGGGCGAGGGTAGCGGAGACGAGAGCGTCCCAGCTGGCGGAGGAGCCCGCGCCCCGCACGCTCCTTCCTCCGCCGCCTCCGCTGCCTCTGCCGCCACGGCGGCCGTGCTGGTGAGTCTGCTTCGAGACCACCACTACCACCTCTCCCCGAGTGCCGCTGGTGCCGGCAATGCTCTAGCCCAACTGCAACATTTGTTGCTTACACAGCATGGTGCTCATTCTCTACTTTTACATACGCAAGTGCAGCAGGCCGTTGCCCAGGCAGCAGCACAACAGCTTCAGCAACTTCAAGCGAGAGCCAACGCTGGAGCACCAACTGTTGACACTCGTTCTCCGTCTCCGCGCGGCTCTCCTCCTGGCGCAGCCGCCTTCTTGACTCCGCTCACGCCAGGTCCGGGACGGGCACGCTCCCCCCATGCGCATCACGCGCACGCACATGCGCACGCGCACTCGCTGCATGCGCACTCGCCAGTTGCGCACGCTCAGTCCCAGGCAGGACACTCGCCGCTGCACGCACACGCGCACAAACCTCGCGCGCTCGACCCAGCCGACGACACAGCCGACCTGGAAGAGCTCGAACACTTCGCCAAAACATTCAAACAACGAAGAATTAAACTCGGTTTCACTCAGGGCGATGTGGGGCTCGCCATGGGCAAATTGTACGGAAATGATTTCTCCCAGACGACAATATCGCGCTTCGAAGCGTTGAATCTTAGTTTCAAGAACATGTGCAAACTGAAGCCACTACTACAGAAGTGGTTGGAGGACGCGGACTCCTCGCTGAGCGGCAGCGGTGGCGGAGCGTCTCTAGGCGCCGGCCTGGCTGAGGCTGTGGGGAGACGCCGCAAGAAGCGCACGTCCATAGAATCAGGAGTCAGGGTAGCACTCGAGAAGGCTTTTCTCCACAACCCGAAACCGACCAGCGAGGAAATATCGGCGTTAGCCGACAGTCTCTGTATGGAGAAGGAAGTAGTACGCGTTTGGTTTTGCAATCGCAGACAGAAGGAGAAGCGTATAAACCCACCCGCAGGCGAGGCGGGCGGAGCGTCATCTCCGGGTGGCGGTGGGTCGTTGCTGCCCCTGTCTCACCCCCTAGCACATGCTTTGCCTGCGCATGCTCACGGACACGCCCACGGACATGGTCTGCAGCACGCGCACGCACACCCCGCTGCCGCAGCGCACGCCGCCCTGCAGGCTGCGGCGCTGCAGCCGCTGGCTCTCCTCGCGCGCCCCCCACGCGACTGA

Protein sequence:

>DPOGS202866-PA
MAERVGGDGGQLDGSASGEGEGEGEGSGDESVPAGGGARAPHAPSSAASAASAATAAVLVSLLRDHHYHLSPSAAGAGNALAQLQHLLLTQHGAHSLLLHTQVQQAVAQAAAQQLQQLQARANAGAPTVDTRSPSPRGSPPGAAAFLTPLTPGPGRARSPHAHHAHAHAHAHSLHAHSPVAHAQSQAGHSPLHAHAHKPRALDPADDTADLEELEHFAKTFKQRRIKLGFTQGDVGLAMGKLYGNDFSQTTISRFEALNLSFKNMCKLKPLLQKWLEDADSSLSGSGGGASLGAGLAEAVGRRRKKRTSIESGVRVALEKAFLHNPKPTSEEISALADSLCMEKEVVRVWFCNRRQKEKRINPPAGEAGGASSPGGGGSLLPLSHPLAHALPAHAHGHAHGHGLQHAHAHPAAAAHAALQAAALQPLALLARPPRD-
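
The two FASTA records above agree in length: 436 codons plus one stop codon give 437 × 3 = 1311 nt, which matches the transcript length listed in the record. A minimal Python sketch for checking that the nucleotide record translates to the protein record follows. It assumes the standard genetic code and that the trailing `-` in the protein marks the stop codon; the record states neither explicitly. The helper name `translate` is illustrative.

```python
# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop_symbol` to match the protein record.
    Raises ValueError if the length is not a multiple of three and
    KeyError on any codon containing characters other than A, C, G, T.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = (CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    return "".join(residues).replace("*", stop_symbol)


if __name__ == "__main__":
    # First six and last five codons of DPOGS202866-TA, copied from above.
    print(translate("ATGGCTGAGCGGGTCGGG"))  # MAERVG
    print(translate("CCCCCACGCGACTGA"))     # PPRD-
```

With the full sequences loaded into strings, the comparison is `translate(nucleotide_seq) == protein_seq`.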