Monarch geneset OGS2.0

DPOGS215548
Transcript: DPOGS215548-TA  3585 bp
Protein: DPOGS215548-PA  1194 aa
Genomic position: DPSCF300129 + 235064-245359
RNAseq coverage: 300x (Rank: top 37%)
Annotation
Heliconius: HMEL011624  8e-134  70.74%
Bombyx: BGIBMGA002282-TA  2e-132  70.28%
Drosophila: grass-PB  2e-11  26.55%
EBI UniRef50: UniRef50_D9HQ48  7e-153  58.48%  Seminal fluid protein HACP038 n=14 Tax=Heliconiini RepID=D9HQ48_9NEOP
NCBI RefSeq: XP_001846625.1  1e-11  22.03%  serine protease1/2 [Culex quinquefasciatus]
NCBI nr blastp: gi|299930641  3e-152  58.48%  seminal fluid protein HACP038 [Heliconius erato]
NCBI nr blastx: gi|299930721  3e-155  58.48%  seminal fluid protein HACP038 [Heliconius melpomene]
Group:
Gene Ontology: GO:0003824  1.8e-30  catalytic activity
               GO:0004252  5.7e-19  serine-type endopeptidase activity
               GO:0006508  5.7e-19  proteolysis
KEGG pathway:
InterPro domain: [393-728]  IPR009003  1.8e-30  Peptidase cysteine/serine, trypsin-like
                 [837-1048]  IPR001254  5.7e-19  Peptidase S1/S6, chymotrypsin/Hap
Orthology group: MCL44357  Lepidoptera specific
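
The lengths and coordinates in this record can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes the genomic coordinates and the InterPro domain ranges are 1-based and inclusive, which the record does not state, and the helper name `parse_position` is hypothetical. It checks that the transcript length equals three bases per residue plus one stop codon, computes the genomic span, and confirms that both domains fall inside the protein.

```python
import re

# Values copied from the record above.
transcript_bp = 3585
protein_aa = 1194
position = "DPSCF300129 + 235064-245359"
domains = {"IPR009003": (393, 728), "IPR001254": (837, 1048)}


def parse_position(text):
    """Split 'scaffold strand start-end' into typed parts.

    Assumes 1-based, inclusive coordinates (not stated in the record).
    """
    match = re.fullmatch(r"(\S+)\s+([+-])\s+(\d+)-(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised position string: {text!r}")
    scaffold, strand, start, end = match.groups()
    return scaffold, strand, int(start), int(end)


scaffold, strand, start, end = parse_position(position)

# Length of the genomic interval covered by the gene model.
genomic_span = end - start + 1

# A complete coding sequence has 3 bases per residue plus one stop codon.
coding_ok = transcript_bp == 3 * (protein_aa + 1)

# Domain lengths in residues, and whether each range lies within the protein.
domain_lengths = {name: e - s + 1 for name, (s, e) in domains.items()}
domains_fit = all(1 <= s <= e <= protein_aa for s, e in domains.values())

print(scaffold, strand, genomic_span, coding_ok, domain_lengths, domains_fit)
```

Under these assumptions the genomic span is longer than the transcript, which would be consistent with the gene model containing introns, although the record itself does not list the exon structure.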

Nucleotide sequence:

>DPOGS215548-TA
ATGATCTCCAATTGGTTATGTGGTGGTGTCATCATCAGCGCTCAGTACGTCCTCACATCAGCAGCTTGTATTGAGGATGTACAGCATTTTTACGTAATATCAGGAACACACAGATGGATACCTTCAGACATGTCAAATGATTGTATTAAATATGGCGCGAAAAAAGCTGTATGGAAATGTGTACCAAAAGATTACGTGTTCGACGGTAAAGAGTTCGAGAACATTCGTTGGATGGTCAATGACGTGGCGGTCGTTAAGGTTGAAGAGGACTTTAATTTCAATAGACGGATACGTGGTTGTGATTTTATACCGAAATTAATAGCGTTCAATAACCAGTCCGAGGATCTAGAGGCGCCGGGGACGGTAGCGTCCATAGCTGGGTGGGGATCCGTCGAGAGATTCGGGGATACTTTTGGAAGATCGACCATGAACTCGCCGGAGCTACTCGAAACTGATGTTGTATTGATATCGAAGCAAAAGTGTAAAGCGGGTTGGCCCGAGAGATATCACTACATCATAGATGGAAACATGGTCTGCGCCAAGGACGGGGCGGAGACAGACGCCATGAATACTAGGTGTAAGGAACACGAAATTAATTGCAAGGAACTAGTTTACTCCAAGGAGAGCGATCCAGACACAAGAAGATACGTCTTGAAGCCCGAGAATTTGAGAGTACATTCAGCCAAACACTTGGACACGAGACGATCTCAGATAATATCTGGTGGATTCTGTGAGAATGACCACGGAGGGCCTCTCGTTGTCGGTCACGGGAAGGGCTCTGTTGTGATTGGTGTCATATCAGCTTGTATGTCCGCTAACATCACACAGAAGTGTTACGGACCATTCCTCTACACCAGCGTGTTTAAAAACAGACACGTCATCAGCTGCGCCATTGATAAGGAGCAAGGGTCGGGCTGCCGGAGATTGCTGAAATCATTGAAGACAACATTGTTAGAAACAAAGCTTAGCTGGAGAAGTCACCCCGATGGACCAGCGAAGGATGAGCTTCCATCTGTTCGTAGCAGTACCCAGAAACAAAAGAACATCAAACCATCAAAATCAAGATCGACAACGATTAAAACACAAGCAAGGACATTTCTAGTTAATTCAATAGATGTAGTAACAACAATTCCTTTGGATGTCCAGAGAGTTTTACACAAAACTAATATTAGTGAAATAAGAGCTCATCTAAACGGATCTAGAAGAATCATATCAGGACTAGAAGCTCATAATAACAGACCATATATGGTTTATTTAAAACTACCGTCAAACAATCCCAAATACAAGAACTACCGTCACTGGTTATGTGGTGGTGTGATTATACATGACCAATACGTCCTAACATCCGCCGCTTGTATAGAAGACGCGAAACATTTCTATGTCGTTTCTGGCACATACAGGCACAACGATGAAGACGATCGTTTTAATAATATCTGTCTCAAAAATGGCGCCAAGAAGGCCATTTGGAAATGTATTCCAAGAAACTACGTGTTCGATGGCCACGAGAACGACAATATACGTTGGATGAACAATGACATAGCGATAGTTAAAGTTGAAGACGAGTTCGACTTTAACCGTCGTGTGAGAGGATGTGACTTCGTCCCTCGACCGATATGTTACAACAACCAAACCAGTCGCTACGAGGATCCAGGGAACGTGGCCTCAATCGCCGGATGGGGCAGCACTGATAAATACAATGATTGGGTAAACAAAGGAGGTTCGTCGACATCACAGGATTTATTAGAGGCAGATGTTGTTATCATCACCAAGAACAACTGCAAGCGTCAATGGGGTCCTCGATACCACAGTATCATAGATAATTATATGATCTGCTCTAGAGACACCATACCAGAGCTGAGCGAAGTCTGTAATGAGAAATATGTTGAGTGTACAGATATAATGTACTCTATGGAAGAATCACGGAGAGTTAACCCGAGTGAGTTGAGACTTCACTCCGCCTTTCATAACGATTCGGGGAGACGACAGGAGGCCGGCAGTGG
AGGATTTTGTGAGAACGACCACGGCGGCCCATTAATTTATGGGCAGGGTTCTAGCGCAATAGTCATAGGCATCATATCGGCTTGCCTCGTTAAAGAACGCACCAACAAATGCTATGGACCTTATCTGTACACGAGCGTCTACAAGAACCGCATGCTCATCAACTGTGCCATCTACAAGGATATCGCAGGCGATTGCACAAAACTATTTAGAGCTAGCGACACCCACATCGAAGAGCACATAAGTTGGGCTGATCATCCTGATGGCCCAGCCAAAAACGAAATATCAAAAATGAGGCGAACGGAAGAAGAGAAAATCAAAAATAATACGTTGAGGTCGAATAGAACAGAACACAAACCGCCGTTAGACAAAGTTATCAAGCACGGCGGTGTCGTGTTAAGGGCGTTTGACGATCGTGAAGACTTCAATGAAAATACAGATGCAGTCGAAATCGCAACTCACATACCACGAGTTGTTGAAAATTTACTCAAAGACAATGAAACTCGGAGAATTATCAACGGAGACGAAGTCACAGACGGTAGACCGTACATGGTATATCTAAAATTACCACGGAATAGCAAGAAAACACAAAATTATAGATCTTGGTTGTGTGGAGGTGTGATTATTCATGAAGAGTACATCTTAACATCAGCCGCGTGCATTGAAGATGCTGAACATTTTTATGTTGTATCTGGAACGTATAAATATTCAGATGAAGATGACAGATATAATAATCCCTGCATCAAAAACGGTGCAAAGAAAGCGATTTGGAAATGCGTCCCGAAAAATTATGTCTTCGACGGTCACGAAAACGACAATATTAGGTGGATGAACAATGACATCGCTGTTGTCAAAATCGAAGATGGATTCGATTTCAGCCGACGGGTCAGGGGATGTGACTTTGTACCCAAACCTATCTGTTATAATAATCAGAGTCAAACGTTAGAAAATCCTGGTACTGTAGTGTCTATCGCCGGATGGGGAACCACATCGAGATATAACGATTGGGTAAATAGAAGGAAGGACAACCAGCAGAACCTTTTAGAAACTCACGTTGAAATAATACCTAAGAACAGATGCAAACGAAGATGGGGAGCCAGATATCATAATATCATTGAGAATTACATGATCTGTACCAAAGACATAGGACAGACGATGTCTGAAATTTGTAATGAAAAGTACGTAGACTGTCAAGACATAAGCTACTCCGACGAAGATGACGCAAGGCGAGACACAAGAATCCAGAAAAAGACAAAGTTCCTACAAATCTTTCGATGCACTCAGCATATCATAATGATAGCAGACGATTTACATCGCAAGCTGATGGTGGATTCTGTGAGACCAAAATGTAGGCGGCAGTTTAGATCTGGTGTGACTCACGTCGAGAAAGTACTAACATGGAAAAATCATCCTGACGGACCAGCCAAGAATGAGCTCGGACCTGGTCCAGTGAAAGCACAAAAAATAGTTCAGAGAGCGAACGAGAATCCTGAGGGGGACAAGGTCTTCGCTGGTAGTGGGTTCATATTACGACCAGAAAATGACGGGAAACCAGCTTCAGTAGTGAACGCTACCCTCACAGCTTGA

Protein sequence:

>DPOGS215548-PA
MISNWLCGGVIISAQYVLTSAACIEDVQHFYVISGTHRWIPSDMSNDCIKYGAKKAVWKCVPKDYVFDGKEFENIRWMVNDVAVVKVEEDFNFNRRIRGCDFIPKLIAFNNQSEDLEAPGTVASIAGWGSVERFGDTFGRSTMNSPELLETDVVLISKQKCKAGWPERYHYIIDGNMVCAKDGAETDAMNTRCKEHEINCKELVYSKESDPDTRRYVLKPENLRVHSAKHLDTRRSQIISGGFCENDHGGPLVVGHGKGSVVIGVISACMSANITQKCYGPFLYTSVFKNRHVISCAIDKEQGSGCRRLLKSLKTTLLETKLSWRSHPDGPAKDELPSVRSSTQKQKNIKPSKSRSTTIKTQARTFLVNSIDVVTTIPLDVQRVLHKTNISEIRAHLNGSRRIISGLEAHNNRPYMVYLKLPSNNPKYKNYRHWLCGGVIIHDQYVLTSAACIEDAKHFYVVSGTYRHNDEDDRFNNICLKNGAKKAIWKCIPRNYVFDGHENDNIRWMNNDIAIVKVEDEFDFNRRVRGCDFVPRPICYNNQTSRYEDPGNVASIAGWGSTDKYNDWVNKGGSSTSQDLLEADVVIITKNNCKRQWGPRYHSIIDNYMICSRDTIPELSEVCNEKYVECTDIMYSMEESRRVNPSELRLHSAFHNDSGRRQEAGSGGFCENDHGGPLIYGQGSSAIVIGIISACLVKERTNKCYGPYLYTSVYKNRMLINCAIYKDIAGDCTKLFRASDTHIEEHISWADHPDGPAKNEISKMRRTEEEKIKNNTLRSNRTEHKPPLDKVIKHGGVVLRAFDDREDFNENTDAVEIATHIPRVVENLLKDNETRRIINGDEVTDGRPYMVYLKLPRNSKKTQNYRSWLCGGVIIHEEYILTSAACIEDAEHFYVVSGTYKYSDEDDRYNNPCIKNGAKKAIWKCVPKNYVFDGHENDNIRWMNNDIAVVKIEDGFDFSRRVRGCDFVPKPICYNNQSQTLENPGTVVSIAGWGTTSRYNDWVNRRKDNQQNLLETHVEIIPKNRCKRRWGARYHNIIENYMICTKDIGQTMSEICNEKYVDCQDISYSDEDDARRDTRIQKKTKFLQIFRCTQHIIMIADDLHRKLMVDSVRPKCRRQFRSGVTHVEKVLTWKNHPDGPAKNELGPGPVKAQKIVQRANENPEGDKVFAGSGFILRPENDGKPASVVNATLTA-
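
The relationship between the two sequences above can be illustrated with a short translation sketch. It assumes the standard genetic code (NCBI translation table 1) and, following the protein record, writes the stop codon as `-`. The function names `read_fasta_sequence` and `translate` are hypothetical helpers, not part of any tool used to build the geneset. Because the nucleotide record is wrapped over more than one line, the reader joins all non-header lines before translating.

```python
# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def read_fasta_sequence(text):
    """Return the sequence of a single-record FASTA string as one uppercase line."""
    lines = [line.strip() for line in text.splitlines()]
    return "".join(l for l in lines if l and not l.startswith(">")).upper()


def translate(dna, stop_symbol="-"):
    """Translate an in-frame DNA string; stop codons become stop_symbol."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(dna), 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 27 and last 15 bases of DPOGS215548-TA, copied from the record.
first = translate("ATGATCTCCAATTGGTTATGTGGTGGT")
last = translate("ACCCTCACAGCTTGA")
print(first, last)
```

Running `translate(read_fasta_sequence(record_text))` on the full nucleotide record, where `record_text` is a placeholder for the FASTA text, should give a string that can be compared directly with the protein record.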