Monarch geneset OGS2.0

DPOGS215549
Transcript | DPOGS215549-TA | 1278 bp
Protein | DPOGS215549-PA | 425 aa
Genomic position | DPSCF300129 + 245884-248531
RNAseq coverage | 647x (Rank: top 20%)
Annotation
Heliconius | HMEL011624 | 1e-179 | 77.89%
Bombyx | BGIBMGA002284-TA | 0.0 | 77.56%
Drosophila | CG31220-PB | 4e-09 | 25.10%
EBI UniRef50 | UniRef50_D9HQ69 | 0.0 | 77.43% | Seminal fluid protein HACP010 (Fragment) n=15 Tax=Heliconiini RepID=D9HQ69_9NEOP
NCBI RefSeq | XP_970766.1 | 3e-09 | 26.39% | PREDICTED: similar to serine protease [Tribolium castaneum]
NCBI nr blastp | gi|299930683 | 0.0 | 77.43% | seminal fluid protein HACP010 [Heliconius melpomene]
NCBI nr blastx | gi|358442580 | 0.0 | 77.78% | seminal fluid protein HACP010 [Heliconius erato]
Group
Gene Ontology | GO:0003824 | 2.9e-34 | catalytic activity
Gene Ontology | GO:0004252 | 4.5e-19 | serine-type endopeptidase activity
Gene Ontology | GO:0006508 | 4.5e-19 | proteolysis
KEGG pathway | mdo:100012394 | 8e-07
KEGG pathway | K01321 (F9) | maps-> Complement and coagulation cascades
InterPro domain | [61-399] | IPR009003 | 2.9e-34 | Peptidase cysteine/serine, trypsin-like
InterPro domain | [70-287] | IPR001254 | 4.5e-19 | Peptidase S1/S6, chymotrypsin/Hap
Orthology group | MCL26821 | Lepidoptera specific

Nucleotide sequence:

>DPOGS215549-TA
ATGACGACCATAAAGATTATGCTCGTCTCAATTTTAATAGTTATAGATATAGCTTTAGCAAATCCAACAACTTTGAAAACAAATGTGTCGGATAAAGTTGAAAGTAGTAACATCACGAACGAACTAAAAGAAGAAATTCCTGATGAAGGCTGGGAATATGAAAACTCAACTTATGTCGATACTAGAAGAATATTGTACAGTAAAAGGGTTGTTGCTGGGAAAAGACCTTATATGGTCTATCTTCAACTAACGAAAGAATCAGCGAAAGCTCATAAATATCGCGGCTGGTTATGTGGTGGGGTCATATTACATCAGTACTATGTGCTAACATCAGCGGCGTGTGTTGAAGATGCTGACCATTTCTATATTGTCTCTGGAACAACTAAATACGTTGACAGCTTCGATTATAAAAATGACGATTGCGTTTGCAAACACAGACGGAAAGTTGTGTGGAAATGTATTCCTAAAAATTATAAATTCGATTTCCAAGATAGCATAAAATGGTCATCTAACGATATTGCGATCGTAAAAGTAGATAGGCCATTTAAGCTCGGAATCACAGAGAAGGATTGCGAGTTTGCCACTGATCTTGTTTGTTATAATAATATAAGTAGGGAACTTGAAAAGGCTGGAACTAAAGGCTATATTGCTGGCTGGGGTAGCGGCAACAACTTCAGGGAGGGTGTGTATCGTCGCCAAAATGGCCACATACCAACAAATTCGAAATACCTTCAAGAGGCTAAAGTTTGCGTGATGGACAATGAGCAGTGCGCTAAAAAGTGGGCCCAGCGTTTTAGGAGCATCATCACACAATACATGATCTGCACCAAGGATGTGATGAAACGTCTCAGCGAGATTTGCGATAAAAAATACGCGAATTGTACCGACGTAGAGTCAAGACGCATAGGCATGGACGACGATTTGAACGATGATCAAACAAACTACCGTCTAGACCTTGGACGCCACCTCCGAGATCCCGATGACTATACAGCAAGGCGTACTACCAAACAGGGAGGGTTTTGTGAGAACGACCATGGAGGTCCTCTAGTAGTGAAATATCAAGGAAAAGAAAGAGTCATCGGTGTGATATCAGCCTGCAAGATAGATCCAAAAACCCACAGCTGCCACGGACCCTTCCTGTACACCAGCGTCTTCATGAACAGACAGTTTATATCTTGTGCTATCAACAAGGATGTGGAAGAAAATTGCCGAAGAGTGTTCCGTACTGGTATAACACACGAAGAATGGTCTGTCAATTGGGATGACAAAGCCGAGTAA

Protein sequence:

>DPOGS215549-PA
MTTIKIMLVSILIVIDIALANPTTLKTNVSDKVESSNITNELKEEIPDEGWEYENSTYVDTRRILYSKRVVAGKRPYMVYLQLTKESAKAHKYRGWLCGGVILHQYYVLTSAACVEDADHFYIVSGTTKYVDSFDYKNDDCVCKHRRKVVWKCIPKNYKFDFQDSIKWSSNDIAIVKVDRPFKLGITEKDCEFATDLVCYNNISRELEKAGTKGYIAGWGSGNNFREGVYRRQNGHIPTNSKYLQEAKVCVMDNEQCAKKWAQRFRSIITQYMICTKDVMKRLSEICDKKYANCTDVESRRIGMDDDLNDDQTNYRLDLGRHLRDPDDYTARRTTKQGGFCENDHGGPLVVKYQGKERVIGVISACKIDPKTHSCHGPFLYTSVFMNRQFISCAINKDVEENCRRVFRTGITHEEWSVNWDDKAE-
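
Consistency check (illustrative sketch, not part of the original record): the 1278 bp transcript corresponds to 426 codons, that is, 425 amino acids plus one stop codon, which agrees with the 425 aa protein listed above. The Python sketch below shows one way to translate the coding sequence and compare it with the protein entry. It assumes the standard genetic code applies and that the transcript is the complete coding sequence in frame 1. The names CODON_TABLE and translate are hypothetical helpers introduced here for illustration, and the stop codon is written as "-" to match the convention used in the protein sequence of this record.

```python
# Standard genetic code, built from the usual TCAG ordering of codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES),
        AMINO_ACIDS,
    )
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 1; stop codons become stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First eight codons and last three codons of DPOGS215549-TA, copied from the record.
print(translate("ATGACGACCATAAAGATTATGCTC"))
print(translate("GCCGAGTAA"))
```

Passing the full DPOGS215549-TA sequence to translate should give a string that can be compared directly with the DPOGS215549-PA entry, including the trailing "-".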