Monarch geneset OGS2.0

DPOGS202640
Transcript  DPOGS202640-TA  1509 bp
Protein  DPOGS202640-PA  502 aa
Genomic position  DPSCF300039 - 748077-755040
RNAseq coverage  42x (Rank: top 72%)
Annotation
Heliconius  HMEL014518  4e-64  58.11%
Bombyx  BGIBMGA000840-TA  4e-86  70.56%
Drosophila  CG31954-PA  4e-33  37.10%
EBI UniRef50  UniRef50_E1ZWT7  4e-48  31.49%  Chymotrypsin-2 n=1 Tax=Camponotus floridanus RepID=E1ZWT7_CAMFO
NCBI RefSeq  XP_001660675.1  3e-57  32.30%  trypsin [Aedes aegypti]
NCBI nr blastp  gi|157125461  5e-56  32.30%  trypsin [Aedes aegypti]
NCBI nr blastx  gi|195582436  3e-62  32.17%  GD25907 [Drosophila simulans]
Group
Gene Ontology  GO:0003824  8.5e-67  catalytic activity
  GO:0004252  1.6e-54  serine-type endopeptidase activity
  GO:0006508  1.6e-54  proteolysis
KEGG pathway  ani:AN2366.2  1e-37
  K01312 (E3.4.21.4, PRSS1, PRSS2, PRSS3)  maps-> Neuroactive ligand-receptor interaction
InterPro domain  [268-501]  IPR009003  8.5e-67  Peptidase cysteine/serine, trypsin-like
  [47-252]  IPR001254  1.6e-54  Peptidase S1/S6, chymotrypsin/Hap
  [74-89]  IPR001314  4.2e-09  Peptidase S1A, chymotrypsin-type
Orthology group  MCL19872  Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202640-TA
ATGAGTTGTCATCACAAGGCAATATTGCGGGCCAAAGCCGACGAACGTGATGTGGCTTGGTTTTTAAGACCCAGTATAATATTGTCCGAAAAAATTCCTCTATGGGAGAGTTGGAAAACGCCACCCAAAGTAGATGAAAGAATTGTGGGAGGAGAAGATGCTAATATAGAAGACTATCCATACCAAGTATCCTACACCTTTAATAATTCATATTTTTGCGGAGGATTTATTGTCAGCGAAAATTATATCCTAACTGCCGGCCACTGTTCTCAAAACGTCGATCCATCAACAGTTGTACTACGTGCTGGAAGTTCCTATCGTCAAAATGGTACAATAATACCAATAGCAGAAGTAATGGCACATCCAGAATATGATGATCCCGCGTTTGATAAGGACGTCGGTTTTATGCGGACAGCTGAACCCATACAGTTTAGTGACACCATACAACCAATACAACTCGCAGAGAGAGACAGACCACTTGTAGGTGATATCCAAGTTACTGGATGGGGAAGACTCAAGCAAGGACAAAATCCAATTCCAAGTAGACTGATGCGTGTTAAAGTACCAGTTGTTAATTATTGGCAGTGTACATTATCGTACTTCAGCGTTTTGACCAGGAATATGTTCTGTGCTGGCAATTTCTTCCTTGGAGGACAAGGGACTTGTCAGGGAGACTCTGGAGGCGCCGCAATACAAGATGGGAGGGCGGTCGGTATCGTTTCATTTGGTAGAGGTGGGTACTCATCCGACATTATACAACCGGGGAATGTTCCGTTATGGAGTGACACAGAGGAGACTGAGTATGATGTGGATGAGAGGATTGTGGGGGGTCAGGAAACAACTATAGAGGAATATCCGCACCAAGTGAACTTTTTATTCCAACGAGATAATGGTACTTACTTTTGCGGGGGATTCATCATTAGTGAATATTACATTCTGACAGCAGCCCACTGTGCACAAGGTATTGATCCTACCACAGTGATTCTGAGAGCTGGTAGCACATATCGGGGCAATGGTACTATTATACCGATAGATGAGATAGTTGCACACCCAGAATATAACGATTCACCCTTTGATAAGGATGTTGGCTATATACGAACTTCTAATCCAATACAGTTTACTGGCGCTATGAAGCCCATTCCCCTCGTAAATGAATCTGAACCGTGCAGTAATAGAGTGAACGTCAGCGGATGGGGTAGACTGATGGAAGGACAAAATCCCTTGCCTCTAAGACTAAGAGCGGTGAATGTGCCTGTTGTTGATTATTTTAGATGTAAGATGGCGTATCCCAGAATATTAACTCGCAACATGGTATGTGTTGGGAATTTCGTCTTAGGAGGTCAGGGTACTTGTCAGGGGGATTCAGGAGACGCTGGGGTTGATAATGGGAGGGCTTGTGGTATTGTGTCATTTGCAAGAGGTTGTGCACGCCCTATGTCTCCGAATGTCTTCACAAATATAGCAGCTGGACCAGTTAGAAGATTTATCACAGATAATACAGGTGTCTAA

Protein sequence:

>DPOGS202640-PA
MSCHHKAILRAKADERDVAWFLRPSIILSEKIPLWESWKTPPKVDERIVGGEDANIEDYPYQVSYTFNNSYFCGGFIVSENYILTAGHCSQNVDPSTVVLRAGSSYRQNGTIIPIAEVMAHPEYDDPAFDKDVGFMRTAEPIQFSDTIQPIQLAERDRPLVGDIQVTGWGRLKQGQNPIPSRLMRVKVPVVNYWQCTLSYFSVLTRNMFCAGNFFLGGQGTCQGDSGGAAIQDGRAVGIVSFGRGGYSSDIIQPGNVPLWSDTEETEYDVDERIVGGQETTIEEYPHQVNFLFQRDNGTYFCGGFIISEYYILTAAHCAQGIDPTTVILRAGSTYRGNGTIIPIDEIVAHPEYNDSPFDKDVGYIRTSNPIQFTGAMKPIPLVNESEPCSNRVNVSGWGRLMEGQNPLPLRLRAVNVPVVDYFRCKMAYPRILTRNMVCVGNFVLGGQGTCQGDSGDAGVDNGRACGIVSFARGCARPMSPNVFTNIAAGPVRRFITDNTGV-
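
The transcript and protein lengths in this record are consistent with each other: 1509 bp is 503 codons, which is 502 amino acids plus one stop codon. The sketch below shows one way to check that relationship and to translate a coding sequence. It assumes the standard genetic code and writes the stop codon as "-", as the protein sequence in this record does. The names `translate_cds` and `coding_length_aa` are hypothetical helpers made up for this illustration and are not part of any MonarchBase tool.

```python
from itertools import product

# Standard genetic code in TCAG order; stop codons are written as "-"
# to match the way this record terminates its protein sequence.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY--CC-WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds: str) -> str:
    """Translate an in-frame coding sequence; unknown codons become 'X'."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3)
    )


def coding_length_aa(cds_length_bp: int) -> int:
    """Number of amino acids encoded by a CDS that ends in one stop codon."""
    return cds_length_bp // 3 - 1


# First seven codons and last three codons of DPOGS202640-TA.
print(translate_cds("ATGAGTTGTCATCACAAGGCA"))
print(translate_cds("GGTGTCTAA"))
print(coding_length_aa(1509))
```

Passing the full nucleotide sequence above to `translate_cds` should, under the same assumptions, reproduce the listed protein sequence including the trailing "-".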