Monarch geneset OGS2.0

DPOGS215545
Transcript          DPOGS215545-TA    1101 bp
Protein             DPOGS215545-PA    366 aa
Genomic position    DPSCF300129 - 126463-131547
RNAseq coverage     17x (Rank: top 81%)
Annotation
Heliconius          HMEL003916          4e-44    41.09%
Bombyx              BGIBMGA002284-TA    8e-34    30.90%
Drosophila
EBI UniRef50        UniRef50_D9HQ48     3e-32    32.66%    Seminal fluid protein HACP038 n=14 Tax=Heliconiini RepID=D9HQ48_9NEOP
NCBI RefSeq         XP_968633.1         2e-06    25.27%    PREDICTED: similar to putative serine proteinase [Tribolium castaneum]
NCBI nr blastp      gi|299930641        1e-31    32.66%    seminal fluid protein HACP038 [Heliconius erato]
NCBI nr blastx      gi|299930641        1e-33    29.44%    seminal fluid protein HACP038 [Heliconius erato]
Group
Gene Ontology       GO:0003824    7.9e-23    catalytic activity
                    GO:0004252    9.9e-12    serine-type endopeptidase activity
                    GO:0006508    9.9e-12    proteolysis
KEGG pathway
InterPro domain     [48-313]    IPR009003    7.9e-23    Peptidase cysteine/serine, trypsin-like
                    [47-298]    IPR001254    9.9e-12    Peptidase S1/S6, chymotrypsin/Hap
Orthology group     MCL26820    Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS215545-TA
ATGAAAGAAACGATTATTTTTTGTGAATTCGTTACGAAGATGATTGTCAGGTTGAGTTATTTTCTGGTTTTATTTCTTTTGTATTTTAGTGAAGGTCAGCGAAGAATTAGATTCGGACAAAAAGTAACAACTCCGAAAAGGTACATGGTATATCTTACACAAGCGAAGCGTCACTATGATTCTTGGATTTGTGGAGGTGCTATTATCTCCCGATTTCATATACTAACCTCAGCAGCTTGTGTTGAAGACGTCACTGATCTATACGCTATCTCTGGGACGATGTCCTACATTTCACCAAACAAGATCCATCACAATGTTTGCTCGAAAAACACCAAACGAAAAATTGTTTATACTTGCATACCCAAAGAATATAAATTGGATTACGAGAAAGTTGATGTATGGTCTTATAATGATATAGCTGTCGTAAGAGTGGACATTCAATACGATTTTAGCGATCCGTTATATGACAAACATTGCCAATTTAGACCTAACTCGATTCCAATAAACTTTAATATAGCTAACGAAAATAAAGGCAGGGCTGTCGTGACGTTCGGATTTGGACACCAGAGCCTTTGGAGGCTGAAACATACAACAGATGATTACAACTACGCTGATCTTATGTATACGTCGGCACAGATTTCCGATCAAGGTAAATGCAAGGCGGCTTACCAAAAACCTAATTTAACCGAAAACATTGAGAAGTACATGATATGCACGAATGCCTCTGGGAACATCGACAGCAACGGAAAATTGGTTCAGAGACGAACAGATAACGACCACGGAGGTCCTCTTGTGACTTGGATCAATGAAGAGGAGACAGTTCTGGGAGTGGGCTCTGTGTTCAGAGTCAATAAGGAGAGCAAATGCGAAGGACCTTACCTTTACACGAGCACAGCCCGAAACAAAGTGTTTATAGAATGTCTTCTAGAGGAAACGCATGACAGACGGAGCTTTGATATTTGTAATCAAGAAGCTAGCGAACTCGGTTTTAAAATTGTTAGGAGATATGTTGTATGGAACGAAGATGATGCCGCAATTGTGGACGACTCCGAAGTCAAAATAAGGAAACAGATTAATATGAATTTTGTAGGGATGGGATGA

Protein sequence:

>DPOGS215545-PA
MKETIIFCEFVTKMIVRLSYFLVLFLLYFSEGQRRIRFGQKVTTPKRYMVYLTQAKRHYDSWICGGAIISRFHILTSAACVEDVTDLYAISGTMSYISPNKIHHNVCSKNTKRKIVYTCIPKEYKLDYEKVDVWSYNDIAVVRVDIQYDFSDPLYDKHCQFRPNSIPINFNIANENKGRAVVTFGFGHQSLWRLKHTTDDYNYADLMYTSAQISDQGKCKAAYQKPNLTENIEKYMICTNASGNIDSNGKLVQRRTDNDHGGPLVTWINEEETVLGVGSVFRVNKESKCEGPYLYTSTARNKVFIECLLEETHDRRSFDICNQEASELGFKIVRRYVVWNEDDAAIVDDSEVKIRKQINMNFVGMG-