Monarch geneset OGS2.0

DPOGS214973
Transcript: DPOGS214973-TA, 1842 bp
Protein: DPOGS214973-PA, 613 aa
Genomic position: DPSCF300256 - 381795-386441
RNAseq coverage: 192x (Rank: top 48%)
Annotation
Heliconius: HMEL002976, 0.0, 63.29%
Bombyx: BGIBMGA012145-TA, 0.0, 61.67%
Drosophila: gkt-PA, 5e-142, 45.83%
EBI UniRef50: UniRef50_Q9VQM4, 7e-140, 45.83%, Probable tyrosyl-DNA phosphodiesterase n=14 Tax=Diptera RepID=TYDP1_DROME
NCBI RefSeq: XP_001968528.1, 1e-143, 46.67%, GG24923 [Drosophila erecta]
NCBI nr blastp: gi|194855370, 3e-142, 46.67%, GG24923 [Drosophila erecta]
NCBI nr blastx: gi|195470993, 1e-142, 46.67%, GE18215 [Drosophila yakuba]
Group
Gene Ontology: GO:0005634, 3.9e-142, nucleus
GO:0008081, 3.9e-142, phosphoric diester hydrolase activity
GO:0006281, 3.9e-142, DNA repair
KEGG pathway 
InterPro domain: [178-613] IPR010347, 3.9e-142, Tyrosyl-DNA phosphodiesterase
[24-46] IPR019406, 6.1e-11, Zinc finger, C2H2, APLF-like
Orthology group: MCL11926 Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214973-TA
ATGAGTCTTTCAAACAAACGTATGAATCGGGACAAAGTTGATCAACATAATAAAAGAATAAAGCTAGCCTGTAGCTATGGCGAGTCGTGTTATAGAATGAATCCGGCACACTTCAGAGAGTTCAGTCATCCGCATTTGGAGAGTATCCTGGATAATTACAGTGGCTCGGGTGATTACGACATACCCGAGAGGTTCAGCTTACAACGAAACATGATCAGGGAACAGCTCGACATGATGATACAAAACAAGCTCTATGAAGGACAACGGCGTGGAGATCTAGCGAGGAAAGACAGTGAAGATAGTAAGAAAGACGGAGGTGAGCCTAAAACAGACAAGGCCGGGACGAGTAGGAGTAGTGATGGAAAGAAAAAGGCACAAACGACCGATGAAGAACAGATGTCAAGTAGAACACACGAACAGACAGACACGCGATCAAGCAAGACATGTGGTGACAAAAAAATACTCGACTCCACGTACCGGCCGATGGTGCCGCCGACCAGGGACCCTCGCTCGTTCCTGGACGTGGTGGTGAGTCCCGGTGGTATGTTGTCCAAACACGCGGCCGCCGCGCCCTACCACGTGTTTTACACCACCATCAAGGATAGCAAGAAGACTCACAACCAGAAGTACTCCATCACACTGCTGGAGATCCTCGACAGTAGTTTGGGCGAGCTGAAGTGCTCCCTCCAGATAAACTTCATGGTGGACGCGGGCTGGCTCCTGGCGCACTATTACTTCGCGGGTTACAGCGCAAAGAAGCTAACGATCCTGTACGGGGAGGAGAGCGCGGAGCTGAGGAACATCAGTGCCAAGAAGCCCAACGTGGAGGCGCACCAGGTCAAGATGGCGACGCCCTTCGGCAAACATCACACGAAGATGATGTTGCTGTGCTACGAGGACGGCTCCCTGAGGGTGGTGGTGTCCACCGCCAACCTGTACATGGACGACTGGGAGAACAGGACGCAGGGCCTCTGGCTGAGTCCGTCCTGCCCGCAGCTGCCGGCGGAGAGTCCGAGTCACTCGGGCGAGAGTCCCACGGGCTTCAAGCGGAGTCTCCTGGACTACCTGCATCACTACCGCCTGCCGCAGCTGGCGGTCTACGTGCACCGGGTCCAGCGCTGCGACTTCAGTCACATCAACGTGTTCCTCGTCTGCTCGGTCCCCGGCACTCATTACTCCGCGTCGTGGGGTTTCCTGCGTGTGGGTGCTCTGCTGCGTGCTCACTGCGCCGTCCCGCCCCAGGAGACTCGCTCATGGCCGCTGATCGCTCAGGCCAGTAGCCTCGGCAGCTACGGGAAGGACCCCGGGTCGTGGCTGACGGGCGACTTCCTGCATCACTTCACCAAGATAAAGGACCAGCCGCAGACCCTCACCCCGCCGCCCGACCTCAAACTCATCTACCCGTCGCTGGAGAACGTGAAGTCCTCCCACGACGGTCTGCTCGGCGGCGGCTGCCTGCCTTACTCCGCGGCCGTCCACGTCAAGCAGCCCTGGCTCAAGGACTTCTTATACCAGTGGCGGGCGCTGCACTCGGAGCGGGACCGCGCGATGCCTCACATCAAGAGCTACACGCGCGTGTCCCCCGACAACTCGCGCGCCGCCTTCTATCTGCTGACTTCCGGCAACGTGAGCAAGGCGGCCTGGGGCGTCCGCAACAAGGACGGCGGACTCCGCCTCATGAGCTACGAGGCCGGAGTCCTATTCCTGCCGCGGTTTGTGATAAACTCGGACTTCTTCCCCCTGTGCCCCTCCTCCGCCCTCCGCCTGCCGGTGCCGTACGACCTCCCCCCCCAGAGGTACTCCCCGGACATGTCACCCTGGGTCTCCGACTACTTGTACTGA

Protein sequence:

>DPOGS214973-PA
MSLSNKRMNRDKVDQHNKRIKLACSYGESCYRMNPAHFREFSHPHLESILDNYSGSGDYDIPERFSLQRNMIREQLDMMIQNKLYEGQRRGDLARKDSEDSKKDGGEPKTDKAGTSRSSDGKKKAQTTDEEQMSSRTHEQTDTRSSKTCGDKKILDSTYRPMVPPTRDPRSFLDVVVSPGGMLSKHAAAAPYHVFYTTIKDSKKTHNQKYSITLLEILDSSLGELKCSLQINFMVDAGWLLAHYYFAGYSAKKLTILYGEESAELRNISAKKPNVEAHQVKMATPFGKHHTKMMLLCYEDGSLRVVVSTANLYMDDWENRTQGLWLSPSCPQLPAESPSHSGESPTGFKRSLLDYLHHYRLPQLAVYVHRVQRCDFSHINVFLVCSVPGTHYSASWGFLRVGALLRAHCAVPPQETRSWPLIAQASSLGSYGKDPGSWLTGDFLHHFTKIKDQPQTLTPPPDLKLIYPSLENVKSSHDGLLGGGCLPYSAAVHVKQPWLKDFLYQWRALHSERDRAMPHIKSYTRVSPDNSRAAFYLLTSGNVSKAAWGVRNKDGGLRLMSYEAGVLFLPRFVINSDFFPLCPSSALRLPVPYDLPPQRYSPDMSPWVSDYLY-
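
Consistency check (illustrative):

The transcript and protein lengths listed above should agree if the 1842 bp transcript is pure coding sequence: 1842 / 3 = 614 codons, that is 613 residues plus one stop codon. The Python sketch below shows one way this could be checked, assuming the standard genetic code. The function names `translate` and `expected_protein_length` are hypothetical helpers written for this illustration and are not part of any database tooling. The sketch writes the stop codon as `*`, whereas the protein record above ends with `-`.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds):
    """Translate a DNA coding sequence in frame 0; stop codons become '*'."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def expected_protein_length(cds_length_bp):
    """Residue count for a CDS that ends with exactly one stop codon."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


# First 30 nt and last 15 nt of DPOGS214973-TA, copied from the record above.
first_codons = "ATGAGTCTTTCAAACAAACGTATGAATCGG"
last_codons = "GACTACTTGTACTGA"

print(translate(first_codons))        # MSLSNKRMNR
print(translate(last_codons))         # DYLY*
print(expected_protein_length(1842))  # 613
```

Both translated fragments match the start (`MSLSNKRMNR`) and end (`DYLY`) of DPOGS214973-PA, and the expected length matches the listed 613 aa.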