Monarch geneset OGS2.0

DPOGS211237
Transcript: DPOGS211237-TA, 1185 bp
Protein: DPOGS211237-PA, 394 aa
Genomic position: DPSCF300385 + 15007-17950
RNAseq coverage: 766x (Rank: top 17%)
Annotation
Source           Hit               E-value  Identity  Description
Heliconius       HMEL016483        1e-66    44.04%
Bombyx           BGIBMGA005173-TA  1e-48    38.87%
Drosophila       SPE-PA            1e-54    33.77%
EBI UniRef50     UniRef50_Q7QKL1   7e-59    35.22%    AGAP003252-PA n=1 Tax=Anopheles gambiae RepID=Q7QKL1_ANOGA
NCBI RefSeq      XP_307757.3       1e-56    36.39%    CLIP-domain serine protease subfamily B (AGAP003252-PA) [Anopheles gambiae str. PEST]
NCBI nr blastp   gi|347969547      2e-58    35.22%    AGAP003252-PA [Anopheles gambiae str. PEST]
NCBI nr blastx   gi|91083507       2e-57    36.00%    PREDICTED: similar to proclotting enzyme [Tribolium castaneum]
Group
Gene Ontology    GO:0003824  4.3e-83  catalytic activity
                 GO:0004252  4.2e-66  serine-type endopeptidase activity
                 GO:0006508  4.2e-66  proteolysis
KEGG pathway     dpo:Dpse_GA15903  9e-46
                 K01312 (E3.4.21.4, PRSS1, PRSS2, PRSS3) maps-> Neuroactive ligand-receptor interaction
InterPro domain  [129-393]  IPR009003  4.3e-83  Peptidase cysteine/serine, trypsin-like
                 [140-388]  IPR001254  4.2e-66  Peptidase S1/S6, chymotrypsin/Hap
                 [171-186]  IPR001314  6e-13    Peptidase S1A, chymotrypsin-type
                 [30-77]    IPR006604  9.2e-11  Disulphide knot CLIP
                 [35-76]    IPR022700  1.2e-06  Proteinase, regulatory CLIP domain
Orthology group 
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS211237-TA
ATGGATAAGATAATTTTAGCCATCCCACTGACTATCGTTATAATTGCCACAGTTTATTCCGATATAATTGAAGCTCCAAACCAATACTGTAAGACAAAATTTACAAATGGTACATGTGTGAAAGTAACCGATTGCCCCTATGCGTTGACATTGATATATAAACATGATTATGATACACTATCAAATCTCACTTGTGGATTCAATAAGCACCAACCTCAGGTATGTTGTCCCCAACAAGACTTTCCTATATTGTACGATACAAAAGAGGAACCGGCTACAAATAGACCGAAACCAATGAATTTAAAACCGGTAGCCACAACAACTTTTAGCCCCCACATAGAAACGAAGAATGATAGCTCTAATGTGTTACCAAACAAAACAATTTGTGGCAAAGTAAAAAATAAGGGCGTCAGTGATAGAATCGTTGGTGGATCTGTTGTTGAAGTTGATGAACATCCTTGGTTAGCTCGTATACAACATAAATTCGATGACAATACTATTTTCGGATGTTCAGCTGCACTTATCACTAATTTATATCTTCTTACGGCAGCACATTGCGTGCAAAATCACAAAATTATTCCGTTCAGTGTTCGTTTGGGAGAGTGGAACACCAAGACAGACATTGACTGTCGCAACAACATTTGTAATAACAGTACAGTTGACATAAACATTAATAAAATAATTGTCCATCCAAAATATGATGGAAAATTAGGTCATAACAGCGACATCGCCTTGATTCGTTTAAGAGATCCCGTGAATTTTACAGATTTCATACAGCCCATATGTTTACCCGCTTCTAAATACATTGCCATGCAAGACTCTGTCATCAATAACGCTTATTGGACAGCTGGCTGGGGAGAAACAGAATATGAAGAGGAATCTGTTATAAAACGCCAAGTACAACTGAATTCTGTACCAATAGAAATTTGTCGAGCTCATTTCAAAGTGGCACCTGAAACTGAGCCAAACATAATTTGCGCTGGAGGTATAAAAGGAAAAGATACATGCAATGGAGATTCAGGAGGACCATTAGTAAAAATAGAATCAGAAAATTATGAAGAAAATTGGTACATGTTTGGAATAACCAGTTCGGGCTCCAAGACATGTGGCCGGGAAGGTGTACCCGGAATCTATACAAGAGTCACCTCTTACATTGATTGGATTCTTGAAAATGTTAAAGAATGA

Protein sequence:

>DPOGS211237-PA
MDKIILAIPLTIVIIATVYSDIIEAPNQYCKTKFTNGTCVKVTDCPYALTLIYKHDYDTLSNLTCGFNKHQPQVCCPQQDFPILYDTKEEPATNRPKPMNLKPVATTTFSPHIETKNDSSNVLPNKTICGKVKNKGVSDRIVGGSVVEVDEHPWLARIQHKFDDNTIFGCSAALITNLYLLTAAHCVQNHKIIPFSVRLGEWNTKTDIDCRNNICNNSTVDININKIIVHPKYDGKLGHNSDIALIRLRDPVNFTDFIQPICLPASKYIAMQDSVINNAYWTAGWGETEYEEESVIKRQVQLNSVPIEICRAHFKVAPETEPNIICAGGIKGKDTCNGDSGGPLVKIESENYEENWYMFGITSSGSKTCGREGVPGIYTRVTSYIDWILENVKE-
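
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, not part of the original record: it assumes the standard genetic code (NCBI translation table 1) applies, and the function name `translate_cds` is hypothetical. It writes the stop codon as `-`, mirroring the trailing `-` in the protein sequence above. The two short fragments in the usage lines are the first 15 and last 9 nucleotides of DPOGS211237-TA.

```python
from itertools import product

# Standard genetic code (assumed: NCBI translation table 1).
# Codons are enumerated with bases in T, C, A, G order; '*' marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS)
)


def translate_cds(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon (hypothetical helper).

    Stop codons are written as `stop_symbol` to match the record's convention.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First five codons and last three codons of DPOGS211237-TA.
print(translate_cds("ATGGATAAGATAATT"))  # MDKII
print(translate_cds("AAAGAATGA"))        # KE-

# Length check: 394 residues plus one stop codon account for 1185 bp.
print(3 * (394 + 1) == 1185)             # True
```

Pasting the full nucleotide sequence into `translate_cds` and comparing the result with the DPOGS211237-PA sequence would extend the same check to the whole record.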