Monarch geneset OGS2.0

DPOGS210404
Transcript: DPOGS210404-TA | 1149 bp
Protein: DPOGS210404-PA | 382 aa
Genomic position: DPSCF300831 + 255-3028
RNAseq coverage: 64x (Rank: top 68%)
Annotation
Heliconius: HMEL015614 | 0.0 | 81.15%
Bombyx: BGIBMGA006931-TA | 2e-46 | 64.29%
Drosophila: DNApol-alpha50-PA | 5e-89 | 44.67%
EBI UniRef50: UniRef50_E2BLF5 | 4e-98 | 50.72% | DNA primase n=15 Tax=Coelomata RepID=E2BLF5_HARSA
NCBI RefSeq: XP_001601538.1 | 2e-96 | 50.43% | PREDICTED: similar to DNA primase [Nasonia vitripennis]
NCBI nr blastp: gi|307204968 | 1e-97 | 50.72% | DNA primase small subunit [Harpegnathos saltator]
NCBI nr blastx: gi|307204968 | 7e-96 | 50.72% | DNA primase small subunit [Harpegnathos saltator]
Group
Gene Ontology:
    GO:0006269 | 7.7e-136 | DNA replication, synthesis of RNA primer
    GO:0003896 | 7.7e-136 | DNA primase activity
KEGG pathway: nvi:100117238 | 6e-96
    K02684 (PRI1) maps to:
        Purine metabolism
        DNA replication
        Pyrimidine metabolism
InterPro domain:
    [1-318] IPR014052 | 7.7e-136 | DNA primase, small subunit, eukaryotic/archaeal
    [92-320] IPR002755 | 8e-50 | DNA primase, small subunit
Orthology group: MCL12380 Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210404-TA
ATGTTACCTGTGTATTATACAAGATTATTTCCTCAGAATATATTTTGTAGATGGCTTGCTTGTGGCAGTAGCCCTCAGCCTCTATCTAACAGAGAGTTATCATTTACACTTGCTGATGATATATACTTACGGTATCTTTCTATTAACAATCAAAAAGAATTCCAAACACTATTACAAAAGAAGTGCCCTCATAAATTAGACATTGGCGCTGTCTACAACACAAAGCCATCCATTGGGCGCCATGATGCAGTTGTGCTGTCCAGGGAACTTGTGTTCGATATTGATCTTACAGATTATGATGAAGTACGAACTTGTTGTCAGGAGGCCAAAGTTTGTAACAAATGCTGGAAATTTATTGTGGTAGCATGTGAAATTATTAACGCTGCTCTGAAAGATGATTTTGGTTTCCAGAATATTCTTTGGGTTTTCTCTGGTAGAAGAGGTTGTCATTGTTGGGTATCAGACTATGAAGCAAGAACACTAGATAGTCCTGGTCGTGCTGCTATTGCTGATTACCTTTGCCTTATTTTTGGAGGGGAAAATAAAAATAAGAAAGTACATCTTGGAAGTGATAACTTGCACTCTAGTATAAAGAGGTCTCTTAATATTATTGATAGATATTTCCTTGAAATACTAGAAGATCAAGATTTTTTGTCAACTTCGGAGGGCACAAAAAAATTTCTAAAAATAATACCAGATGACACCTTACGTAAACAAGTTGAAGATAGCTTCGGAAGAGGATTGTCTACTGTTGACAAATGGGAATGTTTCATACAAACTTACTATCAGTTCTGTAAAGAGAATATTAACGCTATCAGAAAAATGAAATACCTAGTCGAAGAAATTAAAATACAATATTGTTATCCAAGATTAGATGTGAATGTTACAAAGGGCTTTAACCATTTACTTAAATCTCCATTTAGTATACATCCTAAGACTGGTAAAGTATCCATAGTATTCAAACCAGAAAATGCCCGAAACATGAAATTAGAAGATATACCAACCATTTACAGCCTTCTAGATGATAACTCTCCAGATAAAATTCAACATCAAAATAATATGAGAACAGCTGTTAAAAATTTTCAAGAAGTGGTCTTCTCACTGGAGAAAACTGAAGCATTGAGAAGAAGGAATGAAGCTAGTAAGTAG

Protein sequence:

>DPOGS210404-PA
MLPVYYTRLFPQNIFCRWLACGSSPQPLSNRELSFTLADDIYLRYLSINNQKEFQTLLQKKCPHKLDIGAVYNTKPSIGRHDAVVLSRELVFDIDLTDYDEVRTCCQEAKVCNKCWKFIVVACEIINAALKDDFGFQNILWVFSGRRGCHCWVSDYEARTLDSPGRAAIADYLCLIFGGENKNKKVHLGSDNLHSSIKRSLNIIDRYFLEILEDQDFLSTSEGTKKFLKIIPDDTLRKQVEDSFGRGLSTVDKWECFIQTYYQFCKENINAIRKMKYLVEEIKIQYCYPRLDVNVTKGFNHLLKSPFSIHPKTGKVSIVFKPENARNMKLEDIPTIYSLLDDNSPDKIQHQNNMRTAVKNFQEVVFSLEKTEALRRRNEASK-
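
The lengths reported for this record are mutually consistent: 1149 bp is 383 codons, which corresponds to 382 amino acids plus one stop codon, and the trailing "-" in the protein sequence appears to mark that stop. The following is a minimal Python sketch of how such a check could be run on the two FASTA records above. The function names parse_fasta and cds_matches_protein are hypothetical helpers introduced here for illustration, not part of any database tooling, and the toy input uses only the first three codons of the transcript followed by its stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_fasta(text):
    """Return {header: sequence} for a FASTA-formatted string."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def cds_matches_protein(cds, protein):
    """Check that a coding sequence and a protein have compatible lengths.

    Assumes the protein may end in '-' or '*' as a stop marker (an
    assumption based on how the record above is written).
    """
    protein = protein.rstrip("-*")
    if len(cds) % 3 != 0:
        return False
    has_stop = cds[-3:].upper() in STOP_CODONS
    coding_codons = len(cds) // 3 - (1 if has_stop else 0)
    return cds[:3].upper() == "ATG" and coding_codons == len(protein)


# Toy input: first three codons of DPOGS210404-TA plus its stop codon.
toy = parse_fasta(">toy-TA\nATGTTACCT\nTAG\n>toy-PA\nMLP-\n")
consistent = cds_matches_protein(toy["toy-TA"], toy["toy-PA"])
```

Applied to the full record, the same check reduces to the arithmetic 1149 / 3 - 1 = 382. It compares lengths only and does not translate the codons, so it would not detect an internal mismatch between the two sequences.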