Monarch geneset OGS2.0

DPOGS211117
Transcript: DPOGS211117-TA (1473 bp)
Protein: DPOGS211117-PA (490 aa)
Genomic position: DPSCF300007 - 552675-576620
RNAseq coverage: 80x (Rank: top 64%)
Annotation
Heliconius: HMEL012427 (E-value 2e-165, 90.09%)
Bombyx: BGIBMGA002994-TA (E-value 2e-83, 98.62%)
Drosophila: mid-PA (E-value 3e-93, 48.65%)
EBI UniRef50: UniRef50_UPI00022CA4A6 (E-value 5e-100, 78.34%) UPI00022CA4A6 related cluster n=1 Tax=unknown RepID=UPI00022CA4A6
NCBI RefSeq: XP_972626.1 (E-value 9e-158, 63.14%) PREDICTED: similar to T-box transcription factor TBX20 [Tribolium castaneum]
NCBI nr blastp: gi|91082917 (E-value 2e-156, 63.14%) PREDICTED: similar to T-box transcription factor TBX20 [Tribolium castaneum]
NCBI nr blastx: gi|91082917 (E-value 2e-154, 63.14%) PREDICTED: similar to T-box transcription factor TBX20 [Tribolium castaneum]
Group:
Gene Ontology:
  GO:0005634 nucleus (E-value 1.5e-157)
  GO:0006355 regulation of transcription, DNA-dependent (E-value 1.5e-157)
  GO:0003700 sequence-specific DNA binding transcription factor activity (E-value 1.5e-157)
KEGG pathway:
InterPro domain:
  [39-488] IPR001699 Transcription factor, T-box (E-value 1.5e-157)
  [124-315] IPR008967 p53-like transcription factor, DNA-binding (E-value 4.9e-69)
Orthology group: MCL10344 (Multiple-copy universal gene)

Nucleotide sequence:

>DPOGS211117-TA
ATGCTGCTGCAGGGCCGGGAAGGCTTCGAAGGCCGGGGGCCTGCCATGGAGGATGAGTGCTCGGCGAGGCCCTGCGCCACTGACTTCTCCATCGCCGCCATAATGGCCAGGGACCGGCAGGAGCGGAGGGAGCGCCGAAGACACAGAGACCCCAGAGAAGATACACTTACACCATTAGAAAAGTTTGTGGACGCTACTGCATCCGCTGCGGAGTCTCCGTCTCCACCTCTGGAATATGAACGGGATTCCCCCGTGGACGTCTCCTCAACCTCAGAGGCTGGTTCAGCTGGAGCGCCCAGCGGTTCCAGAGCTCTCTCCCCCCCAAGAAACCCTCAGCTGGCTGAACGATGGTCCAGCGAGGAAATGAGGCACATACAGTGCCATCTCGAAACTAAAGAACTGTGGGACAAATTCAACGAACTAGGAACGGAAATGATCATCACAAAAACAGGAAGACGAATGTTCCCGACGGTGCGAGTGTCTTTCGCGGGATGCCGAGCCGAGGCTCGTTACGCCGTATTGCTGGACGTGGTGCCGGTGGACGGCAAGCGTTACCGATACGCTTACCACCGCTCTTCCTGGCTGGTGGCCGGCAAGGCAGACCCTCCCGCGCCAGCGAGGCTCTACCCTCACCCTGACTCACCCTTCTCCGGGGACCAACTTCGCAAGCAGGTCGTCTCCTTTGAGAAGGTCAAACTCACGAACAACGAAATGGACAAAAATGGACAGCTGGTCCTTAATTCAATGCATAAGTATCAACCCCGAATCCATCTGGTATTGCGAAGAGAAGGAGCTATCAACGCACCGATCACAGACCTTGAGCAAGAGGAGTTTAAGACGTTCATATTTCCTGAATGCGTCTTCACCGCAGTCACGGCCTACCAGAACCAACTTATAACCAAACTTAAAATCGACAGTAATCCGTTCGCAAAGGGATTCCGCGATTCTTCACGTCTGACGGAATTTGAGAGATTCTATATCACGGGGGAGCACGAAAGAACATCAGTCTTCCCCGATGACGCGCGCCTCGGTGCCGCCCATCCTAGAGAGACGATGGAGTCGATGCTGGCTGAGCAGCATTATTTACGGTCACCTCTTAGACCGTTCGATCTGGATCAGCACAACAACAATCTGACGCTGGAAGAGAAAGCGATTTTGGCGGCCAGGTCACAGTTGTTCTTGCGAGCAGCGTATCCTCTGTACGGTGTACCAGCAGCAGCGTTGTGGGGTCAATGGGCGTGTCTGGCGCCACAATTACTGGCACAACAGCATCTAGCTTCAGGGTCTGGGCTACAGTTGCCTCGGCCAGTATACCCGGGTGGTGTGCCAGCATCACTCTCGCAGCATCGCTTCTCCCCCTACCCCGCCCGCCGTTCCTCACCGGGTTCCTCACCAGACTCCCTCCGCGCGAGTCCCCACTCCTTGCCCCCGCCCGCACCGCACACACCTCACGCACCCCACAGCCCGACCTAG

Protein sequence:

>DPOGS211117-PA
MLLQGREGFEGRGPAMEDECSARPCATDFSIAAIMARDRQERRERRRHRDPREDTLTPLEKFVDATASAAESPSPPLEYERDSPVDVSSTSEAGSAGAPSGSRALSPPRNPQLAERWSSEEMRHIQCHLETKELWDKFNELGTEMIITKTGRRMFPTVRVSFAGCRAEARYAVLLDVVPVDGKRYRYAYHRSSWLVAGKADPPAPARLYPHPDSPFSGDQLRKQVVSFEKVKLTNNEMDKNGQLVLNSMHKYQPRIHLVLRREGAINAPITDLEQEEFKTFIFPECVFTAVTAYQNQLITKLKIDSNPFAKGFRDSSRLTEFERFYITGEHERTSVFPDDARLGAAHPRETMESMLAEQHYLRSPLRPFDLDQHNNNLTLEEKAILAARSQLFLRAAYPLYGVPAAALWGQWACLAPQLLAQQHLASGSGLQLPRPVYPGGVPASLSQHRFSPYPARRSSPGSSPDSLRASPHSLPPPAPHTPHAPHSPT-
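
The transcript and protein entries above can be cross-checked against each other. The following is a minimal sketch, assuming the transcript is a complete coding sequence read in frame from its first base and translated with the standard genetic code; the function name `translate` and its `stop_symbol` parameter are illustrative and not part of the record. It translates the first 24 and last 15 nucleotides of DPOGS211117-TA and confirms that the reported lengths agree (490 codons plus one stop codon).

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence with the standard genetic code.

    Stop codons are written as `stop_symbol`; the default "-" matches the
    terminal character used in the protein entry of this record (assumption:
    that character denotes the stop codon).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 24 nt and last 15 nt of DPOGS211117-TA, copied from the record above.
print(translate("ATGCTGCTGCAGGGCCGGGAAGGC"))  # MLLQGREG
print(translate("CACAGCCCGACCTAG"))           # HSPT-

# 490 residues plus one stop codon, three bases each, gives the transcript length.
print((490 + 1) * 3)                          # 1473
```

Both fragments match the start and end of DPOGS211117-PA, and the length arithmetic matches the 1473 bp reported for the transcript.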