Monarch geneset OGS2.0

DPOGS212692
Transcript: DPOGS212692-TA (2931 bp)
Protein: DPOGS212692-PA (976 aa)
Genomic position: DPSCF300012 - 966156-970029
RNAseq coverage: 161x (Rank: top 52%)
Annotation
Heliconius       HMEL014164        0.0     86.24%
Bombyx           BGIBMGA013199-TA  0.0     89.74%
Drosophila       ct-PC             1e-141  56.55%
EBI UniRef50     UniRef50_D2A6H5   0.0     64.66%  Putative uncharacterized protein GLEAN_15699 n=3 Tax=Endopterygota RepID=D2A6H5_TRICA
NCBI RefSeq      XP_970668.2       0.0     66.39%  PREDICTED: similar to Homeobox protein cut [Tribolium castaneum]
NCBI nr blastp   gi|189238507      0.0     66.39%  PREDICTED: similar to Homeobox protein cut [Tribolium castaneum]
NCBI nr blastx   gi|189238507      0.0     66.81%  PREDICTED: similar to Homeobox protein cut [Tribolium castaneum]
Group
Gene Ontology
GO:0003677  1e-30    DNA binding
GO:0006355  6.6e-19  regulation of transcription, DNA-dependent
GO:0043565  6.6e-19  sequence-specific DNA binding
GO:0003700  6.6e-19  sequence-specific DNA binding transcription factor activity
GO:0005515  2.3e-18  protein binding
KEGG pathway 
InterPro domain
[545-653]  IPR010982  1e-30    Lambda repressor-like, DNA-binding
[364-442]  IPR003350  2.4e-28  Homeodomain protein CUT
[693-755]  IPR001356  6.6e-19  Homeobox
[677-752]  IPR009057  2.3e-18  Homeodomain-like
[670-751]  IPR012287  1.2e-14  Homeodomain-related
Orthology group: MCL16011 Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS212692-TA
ATGTTTGAAAAGTGTGCCGGCAGATGTCGTACGTCAAACAATGTTTTAGGAAAGAGTGAAAGCGAATTACGGCTCGTTTCCGTCACACGCGCGATATTCCTATACGCCGCCGGAACATATTGCTTCTATACAGACCTCTCGTTGGTCACAGTTCTGAATGTTGGAACAAAAGACGGTACAACCGGCCCCGGGTTCGGGAGGTCAGATGGTGACGGCGAGGAACGCCTGGCTCACATGCTCAATGAAGCCTCACATATCATGAAGACACCGACGGGACAAGCCAACAACGATGACTCCAGGAGCAACGAAGACTCCAGCTCACCGAGGACCCAGTGCCCGTCACCGTTTTCTAATAAGGATTCGAGTCAAAACAGACGGCTTAAGAAATACGAAAACGATGACATTCCTCAAGAAAAAGTAGTGCGTATATACCAAGAAGAGCTGGCGAAGATAATGACGAGACGCGTGGAAGACATGCGCCATAACAGAGACGGCTTCCCTGGCAGCGGCATGGCCCCGCACATGGAACGTCCTCCGGAAGACATTAGGATGGCTCTGGAAGCGTATCACAGGGAACTAGCCAAAATACAACCGGGCGGAAACATTCCGACCCTGCACAACTTGCCAGGGATGCCACCCTTCCCCAACCTGCTGGCCCTTCAGCAGCAAGCCATGCAAGCACAAAGCCAGCACATCAACGGCTCCGGGGCAATCCAAGATCTCTCTCTGCCCAAAGAGAAAAATACCAAAATTAATGGAATGACTGATAGTGATAAGGAAAGGTCTATGGACGCTGAAGAGGCCATCAGACACGCGGGAAGCGCTTTCTCGCTAGTTAGACCGAAATTAGAACCGGGACAGCAATCCACCGGCTCCTCGGCATCCAGCCCGCTGGGAAATGCTATTCTACCTCCCGCCATTACGCCGAATGAAGACTTCAGTAACTCGGCCGCAGCGAGTCCATTACAAAGAATGGCTTCCATAACGAATAGTTTGATATCCCAGCCCCCGAATCCGCCACACCACGCGCCACCGCAGAGATCGATGAAGGCAGTCCTGCCACCGATAACTCAGCAACAGTTCGATTTGTTCAACAATTTGAACACGGAGGAAATCGTGAAGAGAGTCAAAGAGGCTCTCAGCCAGTATTCCATAAGCCAGAGATTGTTCGGCGAATCCGTGCTCGGCCTGTCTCAAGGATCCGTCAGCGATCTGCTAGCGAGACCGAAGCCATGGCACATGTTGACACAAAAGGGAAGAGAGCCGTTCATTCGTATGAAAATGTTCTTGGAGGATGAAAACGCAGTGCACAAATTGGTTGCGTCCCAATACAAAATCGCACCGGAGAAGCTGATGAGAACAGGAAACTATAGCGGAGCACCTTCATGTCCGCCAAATATGAACAAGCCGATGCCACCAACACAGAAGATGATCTCAGATGCCACGGTGCTCCTTAGCAAGATGCAACAGGAACAACTTCTAGGATCTGGACACTTAGGACATTTGGGACAACCGACCCCTCTCCTGTTGACTCCGCCTGGCTTCCCACCACATCACGCCGTGACGCTGCCGCCTCAGCATCACGACAACAACAACAAGGAGAGGAAACCACCACCGCCTCCACAACCCCATCACCAGCCGCCCGTGATGCGAGGCCTTCACCAGCACATGTCACCCAGCGTCTACGAGATGGCAGCTCTGACGCAAGACCTCGACACTCAGACGATCACGACCAAAATAAAGGAAGCGCTCCTCGCCAATAACATCGGACAGAAAATATTCGGCGAGGCCGTGTTGGGACTCTCCCAGGGATCGGTCAGTGAACTTCTATCGAAACCGAAACCCTGGCACATGTTGAGTATCAAAGGACGAGAGCCCTTCATCAGAATGCAGCTCTGGCTCAGCGATGCGCATAATATAGATCGTCTCCAAGCGTTGAAGAATGAGAGACGCGAAGCTAACAAGAGACGGCGGTCGAGCGGACCCGGTCAGGACAACTC
CTCGGACACCTCATCGAATGATACGTCGGAGTTCTACCACTCCAGCTCGCCTGGACCGATACCCGGCGCGCCGTCCGCCAAGAAGCAGCGCGTGCTGTTCTCGGAGGAACAGAAGGAAGCGCTGAGACTAGCCTTCGCTTTGGATCCCTACCCGAACATGCCGACGATAGAATTCCTCGCTGCCGAGCTGGGCCTGTCCACCAGAACGATCACCAACTGGTTCCACAACCATCGCATGCGGCTAAAGCAACAGGCGCCGCACGGCCTGCCCGCGGAACCTCCAGCACGAGATCAGGCCTCCGCTCCCTTCGATCCCGTACAGTTCCGTCTCCTGCTCAATCAGAGGCTTCTGGAGCTGCAGAAGGAGAGGATGGGCCTGGCGGGGGTTCCTCTGCCGTACCCGCCCTACTTCGCCGCCAACTCCAACTTCGCCGCCCTCATCGGTCGCGGCCTGCTGCCCACCGACGAGCGCGTCAAGGACCCTGCCGCCGGACTCGACCTCTCGATGCCGCTGAAGCGTGACCCTGACGGAGACGACTTCGAGGAGGACGACGTCGAGAGCAACCTCGGCTCCGAGGACTCCCTCGACGATGACTCCAAGACTGAGCCCAAGGCGGCCTCCACCCCCGCTGGTCGGTCCAGCCGCCGCAAGCCCGCGGCGCCGCAGTGGGTCAACCCCGACTGGCAGGACGAGAAGCCGCGCAACCCCGACGAGGTCATCATCAACGGCGTCTGCGTGATGCGCGCCGACGACTACCGTCGCGAGGCCACGGAGACCGTGAGGGTGGAGCCATCCCCCGCCCCCCGCGAGAGCTCCCCCGCCCCCCAGGACACGCCGCGCGCGCCTCGCACCCCCCGCACGCCGTCCCCGGACGTCCTGCCCGAGGACAAGATCAAGACGGAGGCGGAAGACGACCGGTGGGAGTATTAA

Protein sequence:

>DPOGS212692-PA
MFEKCAGRCRTSNNVLGKSESELRLVSVTRAIFLYAAGTYCFYTDLSLVTVLNVGTKDGTTGPGFGRSDGDGEERLAHMLNEASHIMKTPTGQANNDDSRSNEDSSSPRTQCPSPFSNKDSSQNRRLKKYENDDIPQEKVVRIYQEELAKIMTRRVEDMRHNRDGFPGSGMAPHMERPPEDIRMALEAYHRELAKIQPGGNIPTLHNLPGMPPFPNLLALQQQAMQAQSQHINGSGAIQDLSLPKEKNTKINGMTDSDKERSMDAEEAIRHAGSAFSLVRPKLEPGQQSTGSSASSPLGNAILPPAITPNEDFSNSAAASPLQRMASITNSLISQPPNPPHHAPPQRSMKAVLPPITQQQFDLFNNLNTEEIVKRVKEALSQYSISQRLFGESVLGLSQGSVSDLLARPKPWHMLTQKGREPFIRMKMFLEDENAVHKLVASQYKIAPEKLMRTGNYSGAPSCPPNMNKPMPPTQKMISDATVLLSKMQQEQLLGSGHLGHLGQPTPLLLTPPGFPPHHAVTLPPQHHDNNNKERKPPPPPQPHHQPPVMRGLHQHMSPSVYEMAALTQDLDTQTITTKIKEALLANNIGQKIFGEAVLGLSQGSVSELLSKPKPWHMLSIKGREPFIRMQLWLSDAHNIDRLQALKNERREANKRRRSSGPGQDNSSDTSSNDTSEFYHSSSPGPIPGAPSAKKQRVLFSEEQKEALRLAFALDPYPNMPTIEFLAAELGLSTRTITNWFHNHRMRLKQQAPHGLPAEPPARDQASAPFDPVQFRLLLNQRLLELQKERMGLAGVPLPYPPYFAANSNFAALIGRGLLPTDERVKDPAAGLDLSMPLKRDPDGDDFEEDDVESNLGSEDSLDDDSKTEPKAASTPAGRSSRRKPAAPQWVNPDWQDEKPRNPDEVIINGVCVMRADDYRREATETVRVEPSPAPRESSPAPQDTPRAPRTPRTPSPDVLPEDKIKTEAEDDRWEY-
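
Consistency check (illustrative sketch, not part of the database record):

The transcript and protein lengths listed above agree if the 2931 bp transcript is pure coding sequence ending in a single stop codon: 2931 / 3 = 977 codons, i.e. 976 residues plus the stop. The Python sketch below shows one way to verify this and to spot-check the first and last codons of DPOGS212692-TA against DPOGS212692-PA. It assumes the standard genetic code (NCBI translation table 1) and follows the record's convention of writing the stop codon as "-". The function names `translate` and `expected_protein_length` are hypothetical helpers introduced here for illustration.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
# Assumption: the transcript uses the standard nuclear code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol (the record uses "-");
    codons containing non-ACGT characters become "X".
    A trailing partial codon is ignored.
    """
    cds = cds.upper()
    protein = []
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        residue = CODON_TABLE.get(cds[pos:pos + 3], "X")
        protein.append(stop_symbol if residue == "*" else residue)
    return "".join(protein)


def expected_protein_length(transcript_bp):
    """Residue count for a transcript that is all CDS with one terminal stop codon."""
    return transcript_bp // 3 - 1


# First seven codons and last three codons of DPOGS212692-TA, copied from the record.
print(translate("ATGTTTGAAAAGTGTGCCGGC"))  # start of the protein
print(translate("GAGTATTAA"))              # end of the protein, including the stop
print(expected_protein_length(2931))       # compare with the listed 976 aa
```

To check the whole entry, join the wrapped nucleotide lines into one string before passing it to `translate`, then compare the result with the protein sequence above.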