Monarch geneset OGS2.0

DPOGS206374
Transcript        DPOGS206374-TA  1617 bp
Protein           DPOGS206374-PA  538 aa
Genomic position  DPSCF300192 - 196299-203335
RNAseq coverage   239x (Rank: top 43%)
Annotation
Heliconius      HMEL009016        1e-157  91.50%
Bombyx          BGIBMGA005775-TA  0.0     80.98%
Drosophila      ewg-PB            2e-106  67.43%
EBI UniRef50    UniRef50_E0VWB8   2e-175  65.19%  DNA-binding protein Ewg, putative n=3 Tax=Pediculus humanus corporis RepID=E0VWB8_PEDHC
NCBI RefSeq     XP_002430412.1    4e-176  65.19%  DNA-binding protein Ewg, putative [Pediculus humanus corporis]
NCBI nr blastp  gi|242019932      7e-175  65.19%  DNA-binding protein Ewg, putative [Pediculus humanus corporis]
NCBI nr blastx  gi|380025841      1e-173  65.40%  PREDICTED: uncharacterized protein LOC100864370 [Apis florea]
Group
KEGG pathway     phu:Phum_PHUM477360  1e-175
                 K11831 (NRF1) maps-> Huntington's disease
InterPro domain  [79-300]   IPR019525  4.8e-122  Nuclear respiratory factor 1, NLS/DNA-binding, dimerisation domain
                 [449-516]  IPR019526  1.5e-09   Nuclear respiratory factor-1, activation binding domain
Orthology group  MCL13294  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS206374-TA
ATGGCTCTGGACAGTCAGGGGGACTCCGACTACGGCAACATGAACTCAGCAGTGAGCACGGAGTCCATGGACATGGCAGAGGAGGAGATGTCACAGGTGGAGGGTTGTGGTCTGAGCGCCTCCGAGGATGAGGACGAGTGTGCCTCGTCTCCGGCCGGCTCAGCCTACGACGACGCCGGTGACGTCATCAAGAGCGCCATGAGCGACGAGGTCACCAAGCAGTTAGCGGCCGCAGGCCCAGTAGGCATGGCGGCGGCGGCTGCGATCGCCTCCTCTAAGAAACGCAAGAGACCGCATTCCTTCGAAACAAACCCCTCCGTGAGGAAGAGACACCAGAACAGACTGCTGAGAAAGCTTAGACAAACGATCGAGGAGTTCGCGACCCGCGTGGGTCAGCAGGCGGTGGTGCTGGTGGCGACCCCCGGCAAGCCGAACACCTCCTACAGGGTGTTCGGAGCCAAGCCGCTGGAGGACGTTGTGCGGAACCTGCGCTGTATGATCATGGAGGAACTCGAGAACGCGCTGGCGCAGCAGTTCGGGCTGGGCGGGTGCGCGCAGGCGCCGCCGCCGCCGCAGGACGACCCGTCGCTGTTCGAGCTGCCGCCTCTCATCATAGACGGCATCCCCACGCCCGTGGAGAAGATGACGCAGGCGCAGCTGCGGGCCTTCATACCGCTCATGCTCAAGTACTCCATGGTACGCGGTAAGCCGGGTTGGGGCCGGGAGTCGACGCGGCCGCCCTGGTGGCCCAAAGACCTGCCCTGGGCCAACGTGAGGATGGACGCGCGCTCTGAGGACGAGAAACAGAAGATGTCGTGGACGCACGCCCTGCGGCAGATCGTGATCAACTGTTACAAGTATCACGGGCGAGAGGACCTGCTGCCCGCCTTCACCGAGGACGAGGACGACAAGCAGGCGCCGCAGACTTCGTCGTCGTGTGCTAGCGGCAGCACGAGTCGCTCGCAGCCGGCCGTGCTCGCCTCGCAGCAAGTCTGCATCGACCAGATGACGCTGGCCGACGTCGATGATGTTGTCGTTATACGTTATCGGTCTGACCACCAGATGTCGCAGTACGCGCCGGCCATGCTGCAGACGATCACCAACCCTGACGGAACGGTGTCTCTCATACAGGTGGATCCCAACAGTCCCATCATCACCTTACCTGATGGTACCACCGCACAAGTGATCCACAGCGGTTCTGAGGGAGCTGCGAGTGTGGTGCAGGCTCTGGAGGGCGAAGGCGCGGTCGCCGTAGACCTCAATGCTGTGGCAGAGGCCACGCTCAACCACGACGGACAGATCATACTCACCGGGGAGGACGGACACGGCTACCCGGTGTCGGTGTCGGGTGTGATCACCGTGCCCGTGTCAGCATCAGTGTACCAGTCTATGGTGGCCTCTATGCAGCAGCAGGACGGCGTCTGCGTCACTCCACTAGTACAGGTGGAGCAGGGCGGCGAGACGCTGGAGGCGCTCTCGATGGGTGGAGGGGTAGCCCAGGTGGTACTACAGGGGGGGGAACAGGTGTTGCAGGTGTTGAGCCTCAAGGACGCCTCCGTACTCACCAAGGCCATGCAAGTGAAGTCCGAACGTGACGCGGTGGTGGCGGACTCCTAG

Protein sequence:

>DPOGS206374-PA
MALDSQGDSDYGNMNSAVSTESMDMAEEEMSQVEGCGLSASEDEDECASSPAGSAYDDAGDVIKSAMSDEVTKQLAAAGPVGMAAAAAIASSKKRKRPHSFETNPSVRKRHQNRLLRKLRQTIEEFATRVGQQAVVLVATPGKPNTSYRVFGAKPLEDVVRNLRCMIMEELENALAQQFGLGGCAQAPPPPQDDPSLFELPPLIIDGIPTPVEKMTQAQLRAFIPLMLKYSMVRGKPGWGRESTRPPWWPKDLPWANVRMDARSEDEKQKMSWTHALRQIVINCYKYHGREDLLPAFTEDEDDKQAPQTSSSCASGSTSRSQPAVLASQQVCIDQMTLADVDDVVVIRYRSDHQMSQYAPAMLQTITNPDGTVSLIQVDPNSPIITLPDGTTAQVIHSGSEGAASVVQALEGEGAVAVDLNAVAEATLNHDGQIILTGEDGHGYPVSVSGVITVPVSASVYQSMVASMQQQDGVCVTPLVQVEQGGETLEALSMGGGVAQVVLQGGEQVLQVLSLKDASVLTKAMQVKSERDAVVADS-
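
The transcript and protein lengths listed above are consistent with each other: 1617 bp is 539 codons, that is 538 amino acids plus one stop codon. The sketch below is not part of the original record. It assumes the standard genetic code (NCBI translation table 1) and uses a hypothetical helper named translate to show how the first and last codons of DPOGS206374-TA map onto the ends of DPOGS206374-PA, with the stop codon written as "-" as in the protein record.

```python
# Standard genetic code (assumed: NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence; hypothetical helper for illustration."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))
    # The record writes the stop codon as "-", so "*" is replaced accordingly.
    return "".join(CODON_TABLE[c].replace("*", stop_symbol) for c in codons)


# First six and last four codons copied from the nucleotide record above.
print(translate("ATGGCTCTGGACAGTCAG"))  # MALDSQ
print(translate("GCGGACTCCTAG"))        # ADS-

# Length check: 1617 bp = 3 * (538 aa + 1 stop codon).
assert 1617 == 3 * (538 + 1)
```

Passing the full nucleotide sequence to translate should, under the same assumption, give a 539-character string ending in "-".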