Monarch geneset OGS2.0

DPOGS200103
Transcript: DPOGS200103-TA (3828 bp)
Protein: DPOGS200103-PA (1275 aa)
Genomic position: DPSCF300044 + 387051-404670
RNAseq coverage: 1542x (Rank: top 8%)
Annotation
Heliconius: HMEL004317 | 0.0 | 66.34%
Bombyx: BGIBMGA004554-TA | 5e-150 | 82.97%
Drosophila: Thd1-PA | 2e-106 | 52.72%
EBI UniRef50: UniRef50_D7EL51 | 4e-113 | 58.18% | Putative uncharacterized protein n=1 Tax=Tribolium castaneum RepID=D7EL51_TRICA
NCBI RefSeq: XP_970220.1 | 7e-114 | 58.18% | PREDICTED: similar to Thd1 [Tribolium castaneum]
NCBI nr blastp: gi|91095019 | 1e-112 | 58.18% | PREDICTED: similar to Thd1 [Tribolium castaneum]
NCBI nr blastx: gi|328700792 | 8e-126 | 42.16% | PREDICTED: hypothetical protein LOC100164619 [Acyrthosiphon pisum]
Group:
Gene Ontology: GO:0016799 | 4.9e-88 | hydrolase activity, hydrolyzing N-glycosyl compounds
    GO:0006281 | 4.9e-88 | DNA repair
KEGG pathway: tca:658765 | 2e-113
    K03649 (MUG, TDG) maps-> Base excision repair
InterPro domain: [273-488] IPR015637 | 4.9e-88 | DNA glycosylase, G/T mismatch
    [310-477] IPR005122 | 1e-48 | Uracil-DNA glycosylase-like
Orthology group: MCL15492 | Patchy
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200103-TA
ATGGAACCGCAAAATAAGCGAGACATTGGTAACCGAACGCACCTGTGCATTGAGTTGACGGAGGAGGGCGAGGAGAAGCCGAAGGCGGCGGAGAGTCAGGACGATGGGTACGAGACGAGCAACAGCGGAGAGGCGGCGCGCCGGAAACCCTCCAGCGCCGAGGGTTCACCCGCGCCGCCTCCGCCTCGCTCGCCGCCCCCCGACGTCGAGCCCGACCAGGAACCCGAGCCCGAGCCTGAGCCTGAGCCGGAGCCCGAGCATGAGCACGAGCATGAGCACGAGCACGAAACAGCATACGAGCCGCTCCTACAGCGACCCTTCCCATCCCCATACAGACACTTCGAGCCGCCGCCGGCTCATCCCCAGCAACACCAGCCCCCGCCGGCTCACGACCCGGAGCCGTACCAACACTTCCACTTCCCCAAATACCCTCAGGATCCGTACGCTTTCAAACGGGAGACGGACGCGCCATACGACATGACGCAGCATCACCAACACCCGCATCAACAACTACATCATAACCAATACGCTAAGAGAGACGATGAGATATATAACGGTGTGAAACGGGAGTGCGAGGATCCGTACTCGTTCGTCGAGGAGGAGGCGATGTGCGCGATGCTCGGGCAACAGCACCACCTGCAGCACCTGCAGCATCACGACCAGCACCACCACCAGCACATGCAGATCCACCCGCAGCAGATGATGCTCAACCAGCCCAAGAAAAGGGGCCGCAAGAAGAAAATAAAAGATGAAAACGGGTTGGAACTTAAAGTGGACGGAGTTTTAGACAGCGTGTCGGGTATCGTGCGTCCGGTGAAGGAGCGGAAGAAACATGACCGGTTCAACGGGATGAGTGAGGAGGAGGTGTCGCGACGCACACTGCCCGACCACCTCGCGGAGAACCTCGACATCATCATTATAGCACAGCCCGACCACATCGCAGAGGACCTTGTCATTATGACTATCGGTATCAACCCGGGTTTGTTCGCCGCCTACAAGGGTCATCACTACGCTGGTCCCGGGAACCACTTCTGGAAATGTCTCTACCTATCCGGACTCACGCGGGAACAGATGAGCGCTGACGAGGATTACAAGCTTCTAAACTTTGGCATCGGCTTCACGAACATGGTGTCCCGTCCTACCAAAGGCTCAGCGGATCTCACGAGGAGGGAGATCAAGGAAGGATCCGCCATTTTGTTGGAGAAGCTTCAGACCTTCCGACCCAAGGTGGCCGTGTTCAACGGGAAATTGATCTACGAAGTGTTCTCCGGGAAGAAGGACTTCTGTTTCGGGAAACAACCCGACTGTATCGCCGGGACTAACACTTACATGTGGGTGATGCCGTCGTCGTCGGCTCGTTGCGCCCAGTTGCCGCGGGCCGCCGACAAGGTCCCGTTCTACGCGGCCTTGAAGAAGTTCAGGGACTACCTGAACGGTCTCCTGCCGCGGCTGGACGACGCCGAGCTGGTCTTCCCCGACACCACCTCCAGGCGGCCGCACGAGGAGATGGAGATACGCAGACTGACGATGGAGCCCGAGCCGGGCGACACCATTATACTAGAGGACGGCACGGAAGTCCCGCTCAAGAAGAAACGCGGCCGACCCAAGAAGGTGAAGTTGGAGAACGGGGAGACGGTGCCCGCGGTCCCCCGGGCGCCGCGGCAGCCCCGGCCTCCGCCCTCCATGGAGACCGGGGACCAGCCCGCTAAGAAGAAGAGAGGCCGCCCCAAGAAAATAAGACCCGAGGAGCAGCAGTTTCTCCTGCAGCAACAGCAACAACAACAGCAGCAGCAGCAACAGCAACAACAACAACAGCAACAGCAAAACAATTCGATGCTGCAGCCGCAGCTATCGTCCATGACACAGCTGCCGCACGAGCAGTTCCTCCACAACTCTAGCGGGGACTTCCAGCAGATGTCGTCGCCGTTGGGAGTGGGCGGTGTGGGCGTGGGCGTGGGCAACGTGGGCGTCGGCAGTGTGGGGGGAGTGAGCAACGTCAG
CAGTGGTGTCATGTACGGTGTGCAGCATCAGCACCAGCAGCAGATGTCCGACTCGTCGCCGTACTACCAACCTAACAATAGCGGCGGTATGGATTCTCCGTTAGACGTGGGCGGAGGTCTGGGCATGTCCCGTGGTTACGGTTCACCTGGCGGCGTGGGCGTGGGCGGCGTGGGCTTCGCGGCGTCACCACGGCACGCGCACTCGTACGCGTCACCGCGCTCACAGCCATACTCACCTGGACCACAGAGACTCGCTGCTACGCCGCAACCACAGCAACAGTTTCCATCGAGCCCGGCGGCATTTTCGGCGCCATCGCCGCCACGTGCTGGGTACGGTTCACCCGGAGTGGGCGGTGTGGGCGTCGTGGGCGGTGTGGGGGGTGTGGGCGGATCGCGCGGGTTCGCGGCTCGTTCCCCACTGTATGCGAGCTCCCCGGCCGCTTACCGCCAGCAGCCGAGCCCAGCCGCCCAGCCCCAGCCAAGGTTTACACACGATGGGATGCCTTTCGCTAGAGACACACAGAGCGGTTCAGTATCAACTTCCGGCGGGGGTTTCTCGTGTTCCCCCGGCGTGGCGGGCGTGGTGGGGGTGGGGGTGGTGGGGGGTAGTACCCCGTTCCCGGCGGCGTCCCCCGCCGCTCACTCGTACACCCCGTCTCCAGCCCACACGCCGTACTCGCACCACTCGTCCCCAGCGCCGGCCCCAGCCCCTCACACGCCGTATGATTCACATCATTTTGCCAATCAGGGAAGTGGATCGAGCGGCTCGGGCGGCGGCTCGGGCTCGGGCTACGGTGCAGAGTTGTCTAGCGACATCGGTGCGGCGATATCGTCCCCGGCGCCCGTGTCGCCAGCCTGCGCCACCCTGGACTTCGAGCCGCCCCGTGATGACTCGCCAATGGGCAGCACAGACATGCATCCGGGCAGCAACAGCAACTCCTCGCTGTCCGACTACAATAAGCAGAGTAATCCGGGCGGAGGCGAGATGTCGCCGGCCGGTGTGGGCGCGGGCTCGTTCGGGGGCGCTCTGTACGACGACACGAGACTAGCGTACAGCGACAAACCCGACTATCATTACCAGGAACAAGGCAATGGTGTGGGAGACAGCCCTCGGTTAGATCAATTACATCAACATCCCTCCATGTATCCTAGCAATTTTAACAGGTCAACGCCTACCGGAGACAGCGACTCGGGCTTCGGCCGCGGCTCGTTCCGGGCGCCCGAGTTACACCACCCTCCGCCTGCAGGTTCCCCCAGTGAGTACAGCGGCGGAGGCGAGTCTGGTAACGGAACTCCTAAAAGCAAATCACAGGACGTGGCTTCCAAGTCGCTATCGGGACTGGAGTCGCTCGTTGACCAGATACCTTCCATAGCGGACGGCCCGGCGGGCGTGGGCGGTGTGGGCGGCGTGGGCGCGGCCGTCGGCAGCGTGGGCGAACAGGGCAGCGCCCCACCAGTGCCCTCGCTGCCAGAGTACACGCCAGCGTTATACCCGCCATACCCGGCGTACGGCGCACCGGCATACGGAAATAACAGCTACGGCGCTCCGTTTGTCGGTTACGGCGGTGGTTGGGGCACCCAGCTGATGCGGCCGGCGCCGGGCTACTTACCGGACTGGCAGTACGGGTACGGTCCGCCCGCGTACGCCTCATACAACTCACCGTACTACAACGGATATCCGGGACCGCCGCCCGCGCACCACCAACAGACTCACTACCTGTCCCCGCCGCTGTTGGAGCTCCACAAAAGCGGCGAGCACGCGGCCGCCGTGTCTGCGGTGCCCGCGGTCCCCTCCGTGCCCTCCGTCGGCTTCGGGGGCTTCTGTTAG

Protein sequence:

>DPOGS200103-PA
MEPQNKRDIGNRTHLCIELTEEGEEKPKAAESQDDGYETSNSGEAARRKPSSAEGSPAPPPPRSPPPDVEPDQEPEPEPEPEPEPEHEHEHEHEHETAYEPLLQRPFPSPYRHFEPPPAHPQQHQPPPAHDPEPYQHFHFPKYPQDPYAFKRETDAPYDMTQHHQHPHQQLHHNQYAKRDDEIYNGVKRECEDPYSFVEEEAMCAMLGQQHHLQHLQHHDQHHHQHMQIHPQQMMLNQPKKRGRKKKIKDENGLELKVDGVLDSVSGIVRPVKERKKHDRFNGMSEEEVSRRTLPDHLAENLDIIIIAQPDHIAEDLVIMTIGINPGLFAAYKGHHYAGPGNHFWKCLYLSGLTREQMSADEDYKLLNFGIGFTNMVSRPTKGSADLTRREIKEGSAILLEKLQTFRPKVAVFNGKLIYEVFSGKKDFCFGKQPDCIAGTNTYMWVMPSSSARCAQLPRAADKVPFYAALKKFRDYLNGLLPRLDDAELVFPDTTSRRPHEEMEIRRLTMEPEPGDTIILEDGTEVPLKKKRGRPKKVKLENGETVPAVPRAPRQPRPPPSMETGDQPAKKKRGRPKKIRPEEQQFLLQQQQQQQQQQQQQQQQQQQQNNSMLQPQLSSMTQLPHEQFLHNSSGDFQQMSSPLGVGGVGVGVGNVGVGSVGGVSNVSSGVMYGVQHQHQQQMSDSSPYYQPNNSGGMDSPLDVGGGLGMSRGYGSPGGVGVGGVGFAASPRHAHSYASPRSQPYSPGPQRLAATPQPQQQFPSSPAAFSAPSPPRAGYGSPGVGGVGVVGGVGGVGGSRGFAARSPLYASSPAAYRQQPSPAAQPQPRFTHDGMPFARDTQSGSVSTSGGGFSCSPGVAGVVGVGVVGGSTPFPAASPAAHSYTPSPAHTPYSHHSSPAPAPAPHTPYDSHHFANQGSGSSGSGGGSGSGYGAELSSDIGAAISSPAPVSPACATLDFEPPRDDSPMGSTDMHPGSNSNSSLSDYNKQSNPGGGEMSPAGVGAGSFGGALYDDTRLAYSDKPDYHYQEQGNGVGDSPRLDQLHQHPSMYPSNFNRSTPTGDSDSGFGRGSFRAPELHHPPPAGSPSEYSGGGESGNGTPKSKSQDVASKSLSGLESLVDQIPSIADGPAGVGGVGGVGAAVGSVGEQGSAPPVPSLPEYTPALYPPYPAYGAPAYGNNSYGAPFVGYGGGWGTQLMRPAPGYLPDWQYGYGPPAYASYNSPYYNGYPGPPPAHHQQTHYLSPPLLELHKSGEHAAAVSAVPAVPSVPSVGFGGFC-
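
The lengths listed for this record (3828 bp transcript, 1275 aa protein) are consistent with a transcript that is coding sequence only: 3 x (1275 + 1 stop codon) = 3828. The sketch below shows one way to check that relationship and to translate the first and last codons of DPOGS200103-TA. It assumes the standard genetic code and no UTRs in the transcript. The names `translate` and `lengths_agree` are hypothetical helpers introduced for illustration and are not part of the database. The stop codon is written as "-" to match the protein record above.

```python
# Standard genetic code (NCBI translation table 1), with codons enumerated
# in T, C, A, G order at each position. "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 1 (hypothetical helper).

    Assumes the sequence contains only A, C, G, T and that its length
    is a multiple of three.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for pos in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3]]
        residues.append(stop_symbol if amino_acid == "*" else amino_acid)
    return "".join(residues)


def lengths_agree(transcript_bp, protein_aa):
    """True if the transcript is exactly the CDS plus one stop codon.

    Assumption: the transcript has no untranslated regions.
    """
    return transcript_bp == 3 * (protein_aa + 1)


if __name__ == "__main__":
    # First eight codons and last three codons of DPOGS200103-TA.
    print(translate("ATGGAACCGCAAAATAAGCGAGAC"))  # MEPQNKRD
    print(translate("TTCTGTTAG"))                 # FC-
    print(lengths_agree(3828, 1275))              # True
```

Both translated fragments match the start (MEPQNKRD) and end (FC-) of DPOGS200103-PA. The full sequence can be passed to `translate` in the same way after joining its lines.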