Monarch geneset OGS2.0

DPOGS212497
Transcript        DPOGS212497-TA  3225 bp
Protein           DPOGS212497-PA  1074 aa
Genomic position  DPSCF300222 + 157132-160356
RNAseq coverage   157x (Rank: top 52%)
Annotation
Heliconius      HMEL009508        0.0     67.58%
Bombyx          BGIBMGA010155-TA  9e-177  74.87%
Drosophila      CG12877-PC        3e-102  37.63%
EBI UniRef50    UniRef50_E2ABA9   1e-177  37.18%  RNA exonuclease 1-like protein n=1 Tax=Camponotus floridanus RepID=E2ABA9_CAMFO
NCBI RefSeq     XP_392195.3       1e-172  36.46%  PREDICTED: similar to transcription elongation factor B polypeptide 3 binding protein 1 isoform 1 [Apis mellifera]
NCBI nr blastp  gi|307181945      5e-177  37.18%  RNA exonuclease 1-like protein [Camponotus floridanus]
NCBI nr blastx  gi|307181945      0.0     36.95%  RNA exonuclease 1-like protein [Camponotus floridanus]
Group
Gene Ontology  GO:0004527  7.3e-37  exonuclease activity
               GO:0005622  7.3e-37  intracellular
               GO:0003676  8.1e-26  nucleic acid binding
KEGG pathway 
InterPro domain  [913-1072]  IPR006055  7.3e-37  Exonuclease
                 [896-1069]  IPR012337  8.1e-26  Ribonuclease H-like
                 [916-1063]  IPR013520  2e-22    Exonuclease, RNase T/DNA polymerase III
Orthology group  MCL10954  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species
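The lengths and coordinates reported in the record can be cross-checked against each other. The sketch below is illustrative only. It assumes the genomic position reads as "scaffold, strand, start-end" with 1-based inclusive coordinates, and that the InterPro ranges are residue positions on the protein. The name `parse_position` is hypothetical and not part of any MonarchBase tool.

```python
import re

# Values copied from the record; nothing here is fetched from a database.
GENOMIC_POSITION = "DPSCF300222 + 157132-160356"
TRANSCRIPT_BP = 3225
PROTEIN_AA = 1074
DOMAINS = {
    "IPR006055": (913, 1072),
    "IPR012337": (896, 1069),
    "IPR013520": (916, 1063),
}


def parse_position(text):
    """Split 'scaffold strand start-end' into typed fields (assumed layout)."""
    match = re.fullmatch(r"(\S+) ([+-]) (\d+)-(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised position string: {text!r}")
    scaffold, strand, start, end = match.groups()
    return scaffold, strand, int(start), int(end)


scaffold, strand, start, end = parse_position(GENOMIC_POSITION)

# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
genomic_span = end - start + 1

# A coding sequence with a stop codon needs 3 nucleotides per residue plus 3.
expected_cds_bp = 3 * (PROTEIN_AA + 1)

# Domain ranges should fall inside the protein.
domains_in_range = all(1 <= lo <= hi <= PROTEIN_AA for lo, hi in DOMAINS.values())

print(scaffold, strand, genomic_span, expected_cds_bp, domains_in_range)
```

Under these assumptions the genomic span, the transcript length, and the length implied by the protein all agree, which is what one would expect if the gene model has no introns or untranslated regions.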

Nucleotide sequence:

>DPOGS212497-TA
ATGTTACCATCGACGGGATATTTTAAGGGAATAAATTGTCCTTTCTACGATAGTGGTATCTGTGAACGACCCTATTGTCATTTTCGTCATGTTAAAAAAGAAACCAATAATATTTCTGGAGAAGGGATAGAAAGCGGAAATGCACTTTTAAAGCTTGTGTCCGCTGCCGTCCAAAAGGTTCTTCAACAAACGGACTCTGCAGCGGGATCGCCGTCGTCGAATAATACATTCGAAACCGAAAGTGTTCAATGTTTACCAGCTTCGTCGAAAGTTACTTACAATCCCACTCCAATAGCGGAACTCAACAAAATTAATAGTGAACCTGAAATCGGCGAGGAAATCAATGAACAAAAACGAAGACATATTCCAGTGCCATACACGCCGCGAAAACCGGCCAGTCTATGCATTAAGCGGCCTGTTGATACGAATAGTTCAAAACTTACATATGTAGCACCAGTTTTATACACACCAGGTTCTGAAAGCGCTCCAGATCCATACTCACCACAAGGGTCAACGGAATCAAATGAAAAATATTTGCCAGGTGTTGAAATAGCAGTACAAGAATATGCACCAAAAGAAACACAGAACTCAAACTCCAAAGTTAATTATATCCCTTCCGATAAAAATGTAAATAGAAAGAAACTGTTAGAATATAAACCTACCAAAGTTAAAAGCCAACCTTCTTCTGTATCATACCAACCAACACCTAGATCGCTAGTTCCTTGCTTTTCTAGTGATGAAGACGAACCTGATACTAAAAAAAGGAAACTCTCTAGTGATCTTAATGGATTGGATGAATTAGGACCAGAATTTGATATATTAGATCAAATCTTAGATGAAGAAAAATCAGAGAAATGTTCCAATAATCATAAAAATGATTCTTCAATAGAATGTAAGGAGAATAAGAAGAGTCTAGAAGTTAATAGTAGTCATAAGGAAAAAGATGATAAAAGTCATAGTAAAACTGATAAAATTAATAAAGAAAAATCTGATAAAGATAAATCTGAGAAGAAACATGACAAAGTAGATAAGACTAAGAAGAAAGAAAAAAGCGAAAAGTCTGATCAAAATGGGAAAGACTGTAAACGCAGGAGTTCCAGTACTGATAAGAAAAGTTCCCACAAAAGTTCTAATCAAGATAAAAAGAGACATAGTAGCAGTGAAAGTAAGTCAGAAAAAAGATCAAGTTCAAAAAGTTCAAGTAGTAGTTCAAAGCATTCAAGAAATGATTCATTGAGGCATTCTAAACATAGCTCCAGTGACAAAACAGATAGTAAGAAAAGTAAGAGTAATAGTAGCCACCACAAGACACACAAACATTCTAAGGAATCTAAAGACAGGAAAGACAAACATAAAAAGCACAATTCGAAGAGCAATTCGGAAGACGATGACCATAAATTTGAAAACTTCATTGAAGATATTGAGGATGTCTCAGAGCCAGATGAAGAGGATATCGCTTTAGAATGTAAAAGAATATTTGAGGAATATGTTCCATCTGAAAAACCAGAAGCAAAAGATGAACCCAATGAACCTGACATTTCAATGACAGATAATGATGAATATATACCATCAAAGAAAAGGGTATCAAGAACTACTGATAAGAATATCAAAGTCACTCCTAAGGCTCCAATAAGACCAGATTTCAAACTTAATGCAGCTCAAGCCATGGCAGAACGACTTGCTAAAGTGAGAGAATTTCACGCCAAAATTACACCAGACGCAACTCCTCCTGCAAGCAATTCCAAACCAGATTCGAAGCCATTTGTACCACCTGTTACTAATCATTCCAAAATTAGAATAGCTCATGTCCCTTACGCATCAACCATGTTAACTGCAAAAAAGACTATTCCACCTAAACCTAGCTCTGCACCAAACAATAATCCTTCAACAAGCTGCACAGTGACACAAACAGTGAAAAAAGGGACTCAAAGAGTTGCACACTTGCCTAGTGAGAAGTTTATAGATAGACCAGGAGTACTTGAACCTTTGGGATCAAAAAT
ACCAGTTAACATCAGGTCAACCTATTTAAACTTGATGATAGACAATTGCTTGAATATTTATTTACTACCATCTGATGCTTATGCAAGGGCACAAAACGAAGAATTGACAACAAGCAAAAAGTGTTCTTCAGTACCAATTTATAAAAATTCAGCAGTCCTTGCTATTAGTAGATTGAAAAAGGAAGTCATAGAATGCAATGGGGTAAAAAAATCTGGTAATGATAGTTCAGGAGCTAAATTTGTACAGGGTACAGTCACGAATGCAGCTAGTGCTGGCTCCTGGAGTATTGAAAGTAAACATAAAAAGAATTTCGAAGACTCTAAACAATTTGTTGGTGCTAACTTGTATAACAATATTAAGAAATGGATATTAACTGATGAGCAGTTAAAAGAGAATGGTTTTCCTAGACCACATACTAACGGGGAGAAGGGTAGAGCTATTATATATGGTCAAAACAAACAAAAACCTCCTAAAGGTTTTATAAGGACTTGTTGTAGATGTAAAAAAGAATATACGGTAGACAAAAAAGGCTTTCCTGTTATAAAACAAGATTGCATTTATCATCCTAATAACAAGTACAGGTTTCGGGGTGAGGTTAAATATCAGTGTTGCAGCCAAGATGAATCATCTGATGGCTGTTGCATAGCGTCAACTCATGTTTATGAATACGTAGACTTTGAAAATTTAAAAGGTTATGTTAAAACTCTAGCGCCGGACACTTTGATGGACGATTATGGTGTTTATTCCCTAGATTGCGAAATGTGTTACACTACACAAGGCCTGGATTTAACAAGGGTTACAGTTATCAATAGTTCCTGTAAAGTAGTGTATGAGACACTTATTAAACCCCTCCATCCCATCATAGATTATAATACAAGGTATTCTGGTATAACTGAGGAACAAATGGCCGATGTTAAAACTACACTTCTTGATGTGCAAGCAACACTACTTACAATGTTCAATAGTAAAACAATCTTAATAGGACATAGTTTAGAATCTGATTTCAAAGCACTGAAATTGATTCATGATACGGTAATTGATACAAGTGTGCTATTTCCTCATAAAATGGGTCCTCCATATAAAAGAGCATTAAGAAATTTATCATCCGAGCATTTGAAGAAGATTATCCAGAACTCGGTTGATGGTCATGACAGTGCAGAAGACGCTACAGTGTGTATGGAACTCTTGATGTACAAAGTTAAAGAGGATTTAAAAACAAGGTGA

Protein sequence:

>DPOGS212497-PA
MLPSTGYFKGINCPFYDSGICERPYCHFRHVKKETNNISGEGIESGNALLKLVSAAVQKVLQQTDSAAGSPSSNNTFETESVQCLPASSKVTYNPTPIAELNKINSEPEIGEEINEQKRRHIPVPYTPRKPASLCIKRPVDTNSSKLTYVAPVLYTPGSESAPDPYSPQGSTESNEKYLPGVEIAVQEYAPKETQNSNSKVNYIPSDKNVNRKKLLEYKPTKVKSQPSSVSYQPTPRSLVPCFSSDEDEPDTKKRKLSSDLNGLDELGPEFDILDQILDEEKSEKCSNNHKNDSSIECKENKKSLEVNSSHKEKDDKSHSKTDKINKEKSDKDKSEKKHDKVDKTKKKEKSEKSDQNGKDCKRRSSSTDKKSSHKSSNQDKKRHSSSESKSEKRSSSKSSSSSSKHSRNDSLRHSKHSSSDKTDSKKSKSNSSHHKTHKHSKESKDRKDKHKKHNSKSNSEDDDHKFENFIEDIEDVSEPDEEDIALECKRIFEEYVPSEKPEAKDEPNEPDISMTDNDEYIPSKKRVSRTTDKNIKVTPKAPIRPDFKLNAAQAMAERLAKVREFHAKITPDATPPASNSKPDSKPFVPPVTNHSKIRIAHVPYASTMLTAKKTIPPKPSSAPNNNPSTSCTVTQTVKKGTQRVAHLPSEKFIDRPGVLEPLGSKIPVNIRSTYLNLMIDNCLNIYLLPSDAYARAQNEELTTSKKCSSVPIYKNSAVLAISRLKKEVIECNGVKKSGNDSSGAKFVQGTVTNAASAGSWSIESKHKKNFEDSKQFVGANLYNNIKKWILTDEQLKENGFPRPHTNGEKGRAIIYGQNKQKPPKGFIRTCCRCKKEYTVDKKGFPVIKQDCIYHPNNKYRFRGEVKYQCCSQDESSDGCCIASTHVYEYVDFENLKGYVKTLAPDTLMDDYGVYSLDCEMCYTTQGLDLTRVTVINSSCKVVYETLIKPLHPIIDYNTRYSGITEEQMADVKTTLLDVQATLLTMFNSKTILIGHSLESDFKALKLIHDTVIDTSVLFPHKMGPPYKRALRNLSSEHLKKIIQNSVDGHDSAEDATVCMELLMYKVKEDLKTR-
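The protein record can be compared with a translation of the nucleotide record. The following is a minimal sketch, assuming the standard genetic code and assuming that the trailing "-" in the protein sequence marks the stop codon. The function name `translate` is hypothetical. Only the first and last few codons of the transcript are used here, so the full sequences would need to be pasted in for a complete check.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides, stop_symbol="-"):
    """Translate a coding sequence in frame 1, writing stops as stop_symbol."""
    seq = nucleotides.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(seq), 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First six codons and last three codons of DPOGS212497-TA, copied from the record.
print(translate("ATGTTACCATCGACGGGA"))
print(translate("ACAAGGTGA"))
```

The two fragments translate to the start (MLPSTG) and the end (TR-) of DPOGS212497-PA as listed in the record.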