Monarch geneset OGS2.0

DPOGS204737
Transcript: DPOGS204737-TA, 1803 bp
Protein: DPOGS204737-PA, 600 aa
Genomic position: DPSCF300231 - 627135-633507
RNAseq coverage: 83x (Rank: top 64%)
Annotation
Heliconius: HMEL011585  2e-105  50.41%
Bombyx: BGIBMGA012478-TA  9e-38  39.38%
Drosophila: CG10405-PB  7e-41  41.42%
EBI UniRef50: UniRef50_G6DBC6  0.0  96.62%  Putative uncharacterized protein n=3 Tax=Nymphalidae RepID=G6DBC6_DANPL
NCBI RefSeq: XP_317173.2  2e-41  41.92%  AGAP008292-PA [Anopheles gambiae str. PEST]
NCBI nr blastp: gi|299930587  1e-122  87.61%  seminal fluid protein HACP002 [Heliconius erato]
NCBI nr blastx: gi|299930587  3e-119  87.61%  seminal fluid protein HACP002 [Heliconius erato]
Group
Gene Ontology: GO:0003824  4.8e-73  catalytic activity
    GO:0004252  5.2e-71  serine-type endopeptidase activity
    GO:0006508  5.2e-71  proteolysis
    GO:0005515  5.5e-14  protein binding
KEGG pathway: ani:AN2366.2  1e-36
    K01312 (E3.4.21.4, PRSS1, PRSS2, PRSS3) maps-> Neuroactive ligand-receptor interaction
InterPro domain: [348-597] IPR009003  4.8e-73  Peptidase cysteine/serine, trypsin-like
    [367-592] IPR001254  5.2e-71  Peptidase S1/S6, chymotrypsin/Hap
    [239-283] IPR002350  5.5e-14  Proteinase inhibitor I1, Kazal
    [394-409] IPR001314  1.3e-13  Peptidase S1A, chymotrypsin-type
    [193-226] IPR011497  3.9e-08  Protease inhibitor, Kazal-type
Orthology group: MCL26404  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS204737-TA
ATGAAGAACGAAATATGGATTTTTACTATAACATTATTGATCCAAAGTTTTAAATCGAGAGCTGAAAACGGTATCATAACTGTTAAAAAAGTTACTGTAAGCCTTCCACAAGTATGTCCCTGTGAATACAATTATAATCCAGTCTGCGGGACGGATGGAGTTACATTTATAAATAAATGTCTTCTTGATTGTACGAATGACAGAAATGCTCGTCTGACCGATGGATTGCCACATATCTCAATCGCCCCTGACGATCAATGCAGTGGATGTGTATGTCTGAATCTCAACGTTCCCGTCTGCTCATTAAATGGAGTGACTTACGACGATGAATGTGCATTATCCTGTGAAAATAGAAATCGTATTCGTGACAATCAAACAATTGTTTATCTCGCATATAGAGCGGCTTGTAATGGACCGCCATGTCCATGCACCAACGTTGCGGCTCCGGTGTGCGGCACTGACAATATTTTATATAGAAATCAATGTGTTCTTGAATGCGCTAGTAGCAACGCCCAAGCGAAGAATCTTCCTGCAATCGAATTACAAAATAATGGTGCCTGTCTAGACGGTTGCCTCTGCCCAAAAACAGTGGAACCTGTTTGCGGAACAGATGGAAGGATTTACGATAATCTTTGCACTTTCGAATGCCAAAATAAAAAACTATCTAACTCTCAAAACGAACCCCTTAAAATTGCTAACCCATTTACTTGTAGAGAATGCGCTTGTAGAAAAGTATTTGAACCAGTTTGTGGTACCGATGGCAGAACGTATCCCAATAAGTGTGAAATTCAGTGTGCTTCATTTAGAAGAAGGGATCCCTTCCTCCAGATAGTCTCTCAAGGACCATGTCCCGAATGTTTCTGTAAAGACGAGTTTTATCCAGTTTGCGGCACAGATCACAAGACATACAAAAATGATTGCGAACTGCGATGTGCCAACAATAAGCTTTCAGCTGGTGAACAACTGATTAGCATTTTTTATCAAGGGCAATGCATGGAATACAATTGTGACTGCAATTGTGACTCAGAGTACCAGCCCGTTTGTGGGATCGACAACAGGTCCTATTGGAATTTATGCTTCCTGAACTGTCCAACACGTAGAATAGTGGCGGGTGTTAATACATCAATAGCGGCGGTTCCCTGGCAAGTCTCGTTGAGGGAAAAGACGTATCCCATATGTGGAGGGTCCGTAGTTACTACATTGTGGCTACTCACAGCAGCGCACTGCCTCTTACGGCCACGAGCCAGTGAGTTAAGTGTTCGCCTCGGCTCCTCGTGGAAGACTCATGGGGGTGAGATGTATGACGTCAAACAGTCCTATGTCCACCCGCAGTATGTGAGAAACACAAAAGTCAACGACGTCGGTCTCATCAAACTTTACTCCCCACTGAGATTCTCTTCAAGAGTTCTTCCTATTAAGATGGTGGGGAAGGGAACTCGCTTGCCGGCCGACAAAGCAGCTGTGGTCTCTGGATGGGGAAAGTTAAAGGAAGGTGGACCCAGTGCTACATTTCTTCAATCATCCACCATAAATACAATTGCGATGAAACTCTGTCGGAATTCCGGCTTAGACAGAAACCCTATAGATCCAGGGTCCATGTTCTGTGCAGGAGCCTTCAGCCAGCCCTCGCCCGATGCTTGCCAGGGTGACAGTGGTGGTCCCATAGTGAGTGAAGGTGTGTTGATCGGAGTGGTATCCTGGGGACTCGGCTGCGCCCGCGGCAACTTTCCCGGCGTCTACACTCGACTGGCCGCCCCTGTGATATGGGACTGGGTCCATGAACACATTTCACAGGACTCTTAA

Protein sequence:

>DPOGS204737-PA
MKNEIWIFTITLLIQSFKSRAENGIITVKKVTVSLPQVCPCEYNYNPVCGTDGVTFINKCLLDCTNDRNARLTDGLPHISIAPDDQCSGCVCLNLNVPVCSLNGVTYDDECALSCENRNRIRDNQTIVYLAYRAACNGPPCPCTNVAAPVCGTDNILYRNQCVLECASSNAQAKNLPAIELQNNGACLDGCLCPKTVEPVCGTDGRIYDNLCTFECQNKKLSNSQNEPLKIANPFTCRECACRKVFEPVCGTDGRTYPNKCEIQCASFRRRDPFLQIVSQGPCPECFCKDEFYPVCGTDHKTYKNDCELRCANNKLSAGEQLISIFYQGQCMEYNCDCNCDSEYQPVCGIDNRSYWNLCFLNCPTRRIVAGVNTSIAAVPWQVSLREKTYPICGGSVVTTLWLLTAAHCLLRPRASELSVRLGSSWKTHGGEMYDVKQSYVHPQYVRNTKVNDVGLIKLYSPLRFSSRVLPIKMVGKGTRLPADKAAVVSGWGKLKEGGPSATFLQSSTINTIAMKLCRNSGLDRNPIDPGSMFCAGAFSQPSPDACQGDSGGPIVSEGVLIGVVSWGLGCARGNFPGVYTRLAAPVIWDWVHEHISQDS-
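
The transcript and protein records above can be cross-checked by translating the coding sequence. The following is a minimal sketch, assuming the standard genetic code applies and that the trailing "-" in the protein record marks the stop codon. The function name `translate` and the `stop_symbol` parameter are hypothetical and not part of the database.

```python
# Standard genetic code (NCBI translation table 1), built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES),
        AMINO_ACIDS,
    )
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Assumption: stop codons are written as `stop_symbol`, matching the
    trailing "-" in the protein record. Unknown codons become "X".
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First six codons of DPOGS204737-TA followed by its TAA stop codon.
print(translate("ATGAAGAACGAAATATGGTAA"))
```

Applied to the full 1803 bp transcript, a translation of this kind would yield 601 symbols: the 600 aa reported for DPOGS204737-PA plus the stop marker, since 1803 = 3 × 600 + 3.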