Monarch geneset OGS2.0

DPOGS211236
Transcript: DPOGS211236-TA, 1470 bp
Protein: DPOGS211236-PA, 489 aa
Genomic position: DPSCF300385 + 3699-10631
RNAseq coverage: 1x (Rank: top 94%)
Annotation
Heliconius      HMEL016483        6e-93  67.95%
Bombyx          BGIBMGA005008-TA  3e-57  48.03%
Drosophila      ea-PA             1e-39  37.35%
EBI UniRef50    UniRef50_D9HQ95   2e-93  68.07%  Seminal fluid protein HACP049 n=2 Tax=Heliconius RepID=D9HQ95_9NEOP
NCBI RefSeq     NP_001166076.1    3e-45  39.74%  serine protease 19 [Nasonia vitripennis]
NCBI nr blastp  gi|299930735      7e-93  68.07%  seminal fluid protein HACP049 [Heliconius melpomene]
NCBI nr blastx  gi|299930735      8e-93  69.70%  seminal fluid protein HACP049 [Heliconius melpomene]
Group
Gene Ontology
GO:0003824  6e-73    catalytic activity
GO:0004252  2.7e-53  serine-type endopeptidase activity
GO:0006508  2.7e-53  proteolysis
KEGG pathway 
InterPro domain
[265-488]  IPR009003  6e-73    Peptidase cysteine/serine, trypsin-like
[267-484]  IPR001254  2.7e-53  Peptidase S1/S6, chymotrypsin/Hap
[80-95]    IPR001314  8.8e-14  Peptidase S1A, chymotrypsin-type
Orthology group: MCL18297, Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS211236-TA
ATGAACTGTAAATGGATATATGTATTCATTTTATTCTTACGTATCGGTGACACTGCCGACGATGAATGGGTTGGAATTGCTAATCATAAAACTTGGGAATTTTTGGATACCTTCGATTGTGGATTTAATTTCGTCGATCGTATTATTGGGGGATTAAATGCAGCACCAAAACAATTTCCTTGGATCACGAGACTGGGTTATTCCACCCGAGAAGAAAAAGAACTAGATTGGATGTGTGGTGGTGCGCTCCTATCTGACCGTCATGTTATCACAGCAGCGCATTGCGTTGTGAGCTCAATCGAAGCTAAACTGGTAAAAATTCGTATGGGAGAGTACGACATTAGGACAAACCCGGATTGTCAATTTAACAAATGCGCCCCTCCAGTCCAGGATCGCGGTATAAAAACTATTATAAGTCACCCAAATTTTAACAAGCCAGCTTTTCACAATGATATAGCAATCATCGTTCTGGATGAACCCGTAGAAATGAATGACTATGTTATACCAATTTGTTTGCCGCGGGAGGAGCAATTACGTCAGTACTTAGAACTAGGAGAAAAGTTAATAGTAGCTGGCTGGGGTAAAATGAATATGACTACAGACGAAAGAGCTAAAATACTACAATATGTAACTGTACCTGTCCTGAAATTAGAAATGTGCAATACTTTTGGAAAGCGATTCACTTTAGCCGAATCGGAAATATGCGCGGGAGCACAAGAACACAAGGACGCATGTGGGGGCGATTCAGGGGGTCCTCTAATGAAGGCAACCCGAGAAGAAAAAGAACTAGATTGGATGTGTGGTGGTGCGCTCCTATCTGACCGTCATGTTATCACAGCAGCGCATTGCGTTGTGAGCTCAATCGAAGCTAAACTGGTAAAAATTCGTATGGGAGAGTACGACATTAGGACAAACCCGGATTGTCAATTTAACAAATGCGCCCCTCCAGTCCAGGATCGCGGTATAAAAACTATTATAAGTCACCCAAATTTTAACAAGCCAGCTTTTCACAATGATATAGCAATCATCGTTCTGGATGAACCCGTAGAAATGAATGACTATGTTATACCAATTTGTTTGCCGCGGGAGGAGCAATTACGTCAGTACTTAGAACTAGGAGAAAAGTTAATAGTAGCTGGCTGGGGTAAAATGAATATGACTACAGACGAAAGAGCTAAAATACTACAATATGTAACTGTACCTGTCCTGAAATTAGAAATGTGCAATACTTTTGGAAAGCGATTCACTTTAGCCGAATCGGAAATATGCGCGGGAGCACAAGAACACAAGGACGCATGTGGGGGCGATTCAGGGGGTCCTCTAATGAAGGTTTTTGACACACAAGATGGACCTAAAAGCTTCCTAGTTGGAGTCGTATCATTTGGTCCAACAATTTGCGGCATTAGAAAACCAGGCGTGTACACATCGATTAAGTTTTTTCTCGGCTGGATTCTTGATAATTTGATATAA

Protein sequence:

>DPOGS211236-PA
MNCKWIYVFILFLRIGDTADDEWVGIANHKTWEFLDTFDCGFNFVDRIIGGLNAAPKQFPWITRLGYSTREEKELDWMCGGALLSDRHVITAAHCVVSSIEAKLVKIRMGEYDIRTNPDCQFNKCAPPVQDRGIKTIISHPNFNKPAFHNDIAIIVLDEPVEMNDYVIPICLPREEQLRQYLELGEKLIVAGWGKMNMTTDERAKILQYVTVPVLKLEMCNTFGKRFTLAESEICAGAQEHKDACGGDSGGPLMKATREEKELDWMCGGALLSDRHVITAAHCVVSSIEAKLVKIRMGEYDIRTNPDCQFNKCAPPVQDRGIKTIISHPNFNKPAFHNDIAIIVLDEPVEMNDYVIPICLPREEQLRQYLELGEKLIVAGWGKMNMTTDERAKILQYVTVPVLKLEMCNTFGKRFTLAESEICAGAQEHKDACGGDSGGPLMKVFDTQDGPKSFLVGVVSFGPTICGIRKPGVYTSIKFFLGWILDNLI-
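
The transcript and protein entries above can be cross-checked against each other. The sketch below is a minimal illustration, not part of the original record: it assumes the transcript is a complete coding sequence with no UTRs, that the standard genetic code applies, and that the stop codon is written as "-" as in the protein sequence above. The function names `translate` and `lengths_consistent` are hypothetical helpers introduced here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption:
# the standard nuclear code applies to this transcript).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol; codons containing
    non-ACGT characters become 'X'.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


def lengths_consistent(transcript_bp, protein_aa):
    """True if the transcript is exactly protein_aa codons plus one stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)


if __name__ == "__main__":
    # First 18 and last 15 nucleotides of DPOGS211236-TA as listed above.
    print(translate("ATGAACTGTAAATGGATA"))
    print(translate("GATAATTTGATATAA"))
    # Lengths reported in the record: 1470 bp and 489 aa.
    print(lengths_consistent(1470, 489))
```

Applied to the full FASTA entries, the same `translate` call would be expected to reproduce the DPOGS211236-PA sequence if the assumptions above hold; a mismatch would point to a different reading frame, UTR sequence in the transcript, or a non-standard code.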