Monarch geneset OGS2.0

DPOGS209788
Transcript: DPOGS209788-TA; 1419 bp
Protein: DPOGS209788-PA; 472 aa
Genomic position: DPSCF300117 - 981622-983040
RNAseq coverage: 382x (Rank: top 31%)
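
The transcript length, protein length, and genomic coordinates listed above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the transcript ends in a single stop codon; the function names are hypothetical and not part of the database.

```python
# Values copied from the record above.
START, END = 981622, 983040
TRANSCRIPT_BP = 1419
PROTEIN_AA = 472


def span_length(start: int, end: int) -> int:
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


def expected_protein_length(cds_bp: int) -> int:
    """Residue count for a coding sequence, assuming one terminal stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return cds_bp // 3 - 1


if __name__ == "__main__":
    print(span_length(START, END))                 # 1419
    print(expected_protein_length(TRANSCRIPT_BP))  # 472
```

Under these assumptions the genomic span equals the transcript length, which would be consistent with a gene model without introns, and 473 codons minus the stop codon give the 472 residues reported.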
Annotation
Heliconius: HMEL009001; 0.0; 74.21%
Bombyx: BGIBMGA008014-TA; 2e-178; 64.94%
Drosophila: CG5694-PB; 2e-44; 31.27%
EBI UniRef50: UniRef50_UPI000206459B; 3e-60; 34.79%; UPI000206459B related cluster n=3 Tax=unknown RepID=UPI000206459B
NCBI RefSeq: XP_393488.3; 5e-61; 34.79%; PREDICTED: similar to CG5694-PA, isoform A [Apis mellifera]
NCBI nr blastp: gi|340710196; 6e-62; 38.08%; PREDICTED: hypothetical protein LOC100646816 [Bombus terrestris]
NCBI nr blastx: gi|340710196; 2e-61; 37.09%; PREDICTED: hypothetical protein LOC100646816 [Bombus terrestris]
Group
KEGG pathway: hmg:100199903; 3e-08
 K11831 (NRF1) maps-> Huntington's disease
InterPro domain: [2-92] IPR019525; 4e-08; Nuclear respiratory factor 1, NLS/DNA-binding, dimerisation domain
Orthology group: MCL16973; Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209788-TA
ATGATTACCAATTTGCCGCTTCTTATTTGCAATGGTAATCCGACAGCTCTGGCGAAGCTGAATGCCGCCGAATTGGAAAAGTTTATAACGTTCATGGTGACATGTTCCTGGGGTCATGATACTGCTAAGGATATCCGGCAACCACCATGGTGGCCCAAGGATGTAAACTTTTCTCATCCCTTTGTTAAACCACCAGTTGTTCCCGACGATTGGGAGGCGAGACTGAAGAGATTGATCAAGAGATGTTATGAATATCATAAGAGTGCGTTTTTATTAGTGTTCTCAGCACAGCTTGCGCGATATCCGCCTCGACGACTCCGGTATGTTGACAATCGTGATCACACCACTTCCCTCTACTACAGACCCAGTGGTAGACTACTGGTGACGTTCCGAAATGAAAATCTGTGTTATGATAGAGACACTGTGGAGGAGACGAGCTATCACCTGAAATCCACTGATATTTATCTTTGTGATAATTGTGATAGTCATTTTGATAATTTGGAAGTTCTCAAAGCTCACGAGAGACTGTGTAACAATGAAGTAGTGGCAACTAGTTCGTGTAGTAGTGGTTTTTCAGATTTTCTATCAGCTCTGAAGTTGCAATCTATATCGGATGTTTCAGATAACAAACATCCACTATGTGTTGAAGTTGACTCGCGGCCACGGAATGCTAGAGGTGCATCTTATCTGGATAGAGGTCCTCCCTACCCATTCTCATCCCTTGCATATATGAAAAATGCAAAGATAAATGTACAAAGGGATACCACCTATTCTAGAGAAAGAATAGAGAGATATTGTTGTCCTACAACAATTATTAGTAAAAATGTAGGTAGCAAAAGTAAAAATCATCAATTTCCAGTAAGATATAGACGACCAATAGATTACTGGCACAGGAAGCATGTGTTCCCCAATCAAAGATACAAGAAAATACTTGATCTCAAAAGCCAGTTGTTGCTTTTAAAATGCAGGCCCGTTACTGTGAATGTTGAGAGAATGACAATGGAAAAAGTAGATGAATATATCGAAAACCTGCATAAGGAGTCCGAGAAACATAGCTTAGTGGACAAAGACATTGTGTTTGTTGATGGATTAGACTCTGAACAAATGGATGTAGACTGTAAGGTTGAGACTAAAACTAGTGATCCTCTCAAAAAGGTAGACTGTGACTGTGAAGTGATTGATCTGTGTTCGGACGATGAAACTTCCAGTACTAATGAGAACTGTGACCCTCGAGCTGGAGTGACTTGTGTGATGAGAGGTGGTGCGGTACTCAGGCGTACTGCCGCGACGCCTCATTCATTGCCCGCAGAGCCCTGCGGCGCTCGCCAGCGCCCTCTACCGTCTCTCATCCTACAGCCCCATCCAGTTATTTTAATAACTCACACTCTAAACAATTTACAGACTATAGCATTAGATTAA

Protein sequence:

>DPOGS209788-PA
MITNLPLLICNGNPTALAKLNAAELEKFITFMVTCSWGHDTAKDIRQPPWWPKDVNFSHPFVKPPVVPDDWEARLKRLIKRCYEYHKSAFLLVFSAQLARYPPRRLRYVDNRDHTTSLYYRPSGRLLVTFRNENLCYDRDTVEETSYHLKSTDIYLCDNCDSHFDNLEVLKAHERLCNNEVVATSSCSSGFSDFLSALKLQSISDVSDNKHPLCVEVDSRPRNARGASYLDRGPPYPFSSLAYMKNAKINVQRDTTYSRERIERYCCPTTIISKNVGSKSKNHQFPVRYRRPIDYWHRKHVFPNQRYKKILDLKSQLLLLKCRPVTVNVERMTMEKVDEYIENLHKESEKHSLVDKDIVFVDGLDSEQMDVDCKVETKTSDPLKKVDCDCEVIDLCSDDETSSTNENCDPRAGVTCVMRGGAVLRRTAATPHSLPAEPCGARQRPLPSLILQPHPVILITHTLNNLQTIALD-
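
The protein sequence above corresponds to a frame-1 translation of the nucleotide sequence. The following is a minimal sketch of that translation, assuming the standard genetic code (NCBI translation table 1) and rendering a stop codon as "-", as the record does. The `translate` function and the short test fragments are illustrative only; the fragments are the first 30 and last 15 nucleotides of DPOGS209788-TA.

```python
from itertools import product

# Standard genetic code (NCBI table 1), with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, stop_symbol: str = "-") -> str:
    """Translate a DNA string in frame 1; incomplete trailing codons are ignored."""
    dna = dna.upper()
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # "X" for ambiguous codons
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


if __name__ == "__main__":
    # First 30 nt of the transcript.
    print(translate("ATGATTACCAATTTGCCGCTTCTTATTTGC"))  # MITNLPLLIC
    # Last 15 nt of the transcript, including the TAA stop codon.
    print(translate("ATAGCATTAGATTAA"))                 # IALD-
```

Passing the full sequence from the `>DPOGS209788-TA` entry to `translate` should give a string that can be compared directly with the `>DPOGS209788-PA` entry.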