Monarch geneset OGS2.0

DPOGS209164
Transcript: DPOGS209164-TA (1617 bp)
Protein: DPOGS209164-PA (538 aa)
Genomic position: DPSCF300061 - 118848-125269
RNAseq coverage: 1x (Rank: top 94%)
Annotation
  Heliconius       HMEL009747         3e-107   73.93%
  Bombyx           BGIBMGA011480-TA   8e-97    64.75%
  Drosophila       CG13690-PA         2e-64    60.56%
  EBI UniRef50     UniRef50_Q9VPP5    3e-62    60.56%   Ribonuclease H2 subunit A n=41 Tax=Coelomata RepID=RNH2A_DROME
  NCBI RefSeq      XP_001356439.2     2e-64    61.36%   GA12461 [Drosophila pseudoobscura pseudoobscura]
  NCBI nr blastp   gi|198473777       4e-63    61.36%   GA12461 [Drosophila pseudoobscura pseudoobscura]
  NCBI nr blastx   gi|312379086       8e-62    63.24%   hypothetical protein AND_09149 [Anopheles darlingi]
Group
Gene Ontology
  GO:0003723   5.8e-80   RNA binding
  GO:0004523   5.8e-80   ribonuclease H activity
  GO:0003676   3.8e-29   nucleic acid binding
KEGG pathway
  dpo:Dpse_GA12461   6e-64
  K10743 (RNASEH2A) maps-> DNA replication
InterPro domain
  [331-536]   IPR001352   5.8e-80   Ribonuclease HII/HIII
  [332-488]   IPR012337   3.8e-29   Ribonuclease H-like
  [441-484]   IPR023160   2.2e-16   Ribonuclease HII, helix-loop-helix cap domain
Orthology group
  MCL11203   Single-copy universal gene

Nucleotide sequence:

>DPOGS209164-TA
ATGGCGCTTAAAAAACTGTGCTTTGTGTTCGCGATTTTGACGTGTGCTTCAGCTATAGTTGGCCCGGCTGCGAACAGCTCTAAAGTAGTTGTTGTTACTAAAGCCGACACTTACCTCGACGCTTATTCAAAATCACGAGTAGACGCAGAAGAAGCTAAGAAGGGTGTTGTTATAGTAACGGCCAGAGATGGCGAGAAGAAATTGACAAATCGTGACGACAGTTACAGCGGATACGACTACACGCCGTCGTATTCTAGCGGAGATAGTTATTCATCGGGACCGTCAAACAGCTATTTGCCTTCAGGCAATCATGGGCCGGTGAAGTTTGAGCCAGAAATCACGTACAATGCTCCCGCGAATACATACGGGCCACCGTCTCCAAGCTACGGACCTCCATCATCCAACTACGGTCCACCTTCAAGCGGCCCATCGCCGAGCTATGGCCCTCCGCAGTCCTACGGGCCTCCTACTCCCGTATACGGACCCCCAATACATAAACCATCACCCCCGGTATATGGACCACCTCTCAAACCGTTCTATGGCGTCCCATACACGGCGCCCGGTCTCAGTTTCTTCGACAAATTATCTTTAAAGTTAGACATCCTCACTATCGCAAAACTGATGCTGAAGTTCCTTATATTCAAGAAGATTGTGACCATGATCGCTGTGGTGTGCATGCTGCTCGTCATACCTAAACTAATATCTTTTAAGAAAGATAAAACTGGAGACGAGGGCGGCGACGAGGATGAACGTAGATTCGGTGGTAGACATCTTATGGAGTTAACCTCAGCTCAACAATTGTTGGACCGCGCGATGTATGTCTACGGACATCAGCGGCCGGACTGTGGGTTCGCGTGTCGCGTCAGACGCGTGTTAGACGATGTATACGAATTTCAGCCTTATTTCAGGTTTATTCGCCTGGAACGCGGGCTCATTGCGGGATCTACACGTGCCGTCGCCACACGTGTTGATGTGTCGTGTGTGCCNGCTGAAGTTATATCACCAAATTACATTTCAAACTCTATGTATAAAAGAGCCAAACACTCTCTCAACGAGGTATCAATGAATTCCGCGATATCTTTGATAAAAAAATCTATTGAATTAGGTGGGAATATAACAGAGGTGTATGTGGATACTGTCGGCCCTCCCGAGAAATATCAGGCCAGGTTAAAAGAAATCTTCCCTGATATTACGATCACTGTGGCAAAGAAAGCTGATTCCATCTACCCAATAGTGTCGGCGGCCAGTATAGTGGCTAAGGTCACGAGAGACCACGCCCTCAAGGTTTGGGAATTTCCCGAAGGTCTTGAGATCAATCACAAGGACTTTGGGAGTGGTTACCCAGGAGATCCATTGACTAAGAAGTTTATAAGGGAACAGATTGACAGAATATTCGGCTACCCCCTGTTGGTAAGGTTTAGTTGGTCCACGGCCGAGCTGGCTCTCCAGGAGAGAGCAGCGAAGTGCAGCTTCGAGGACATAGACGATGAGAATACGAAGAAACCGAAAGGAACCCAGGCCATCAGCTCGTTCTTTTCACCGAAGAACGAGCGGAAACGGAAGAGGCATAAATTTTTCGAAGAAAGAAATTTGACAATGAGCAACGCTTTCGAATAA

Protein sequence:

>DPOGS209164-PA
MALKKLCFVFAILTCASAIVGPAANSSKVVVVTKADTYLDAYSKSRVDAEEAKKGVVIVTARDGEKKLTNRDDSYSGYDYTPSYSSGDSYSSGPSNSYLPSGNHGPVKFEPEITYNAPANTYGPPSPSYGPPSSNYGPPSSGPSPSYGPPQSYGPPTPVYGPPIHKPSPPVYGPPLKPFYGVPYTAPGLSFFDKLSLKLDILTIAKLMLKFLIFKKIVTMIAVVCMLLVIPKLISFKKDKTGDEGGDEDERRFGGRHLMELTSAQQLLDRAMYVYGHQRPDCGFACRVRRVLDDVYEFQPYFRFIRLERGLIAGSTRAVATRVDVSCVPAEVISPNYISNSMYKRAKHSLNEVSMNSAISLIKKSIELGGNITEVYVDTVGPPEKYQARLKEIFPDITITVAKKADSIYPIVSAASIVAKVTRDHALKVWEFPEGLEINHKDFGSGYPGDPLTKKFIREQIDRIFGYPLLVRFSWSTAELALQERAAKCSFEDIDDENTKKPKGTQAISSFFSPKNERKRKRHKFFEERNLTMSNAFE-