Monarch geneset OGS2.0

DPOGS210980
Transcript: DPOGS210980-TA (927 bp)
Protein: DPOGS210980-PA (308 aa)
Genomic position: DPSCF300004 - 152051-157716
RNAseq coverage: 382x (Rank: top 31%)
Annotation
Heliconius: HMEL025004, 6e-100, 64.94%
Bombyx: BGIBMGA006406-TA, 1e-124, 80.63%
Drosophila: CG4914-PA, 4e-114, 66.90%
EBI UniRef50: UniRef50_Q9VUG2, 5e-112, 66.90%, CG4914 n=25 Tax=Neoptera RepID=Q9VUG2_DROME
NCBI RefSeq: XP_968105.1, 1e-122, 68.09%, PREDICTED: similar to AGAP004571-PA isoform 1 [Tribolium castaneum]
NCBI nr blastp: gi|364023627, 3e-154, 82.14%, seminal fluid protein CSSFP038 [Chilo suppressalis]
NCBI nr blastx: gi|364023627, 4e-157, 82.14%, seminal fluid protein CSSFP038 [Chilo suppressalis]
Group
Gene Ontology:
  GO:0003824, 2.1e-95, catalytic activity
  GO:0004252, 1.2e-91, serine-type endopeptidase activity
  GO:0006508, 1.2e-91, proteolysis
KEGG pathway: bta:533547, 5e-44
  K01324 (KLKB1) maps-> Complement and coagulation cascades
InterPro domain:
  [58-301] IPR009003, 2.1e-95, Peptidase cysteine/serine, trypsin-like
  [66-296] IPR001254, 1.2e-91, Peptidase S1/S6, chymotrypsin/Hap
  [93-108] IPR001314, 2.3e-17, Peptidase S1A, chymotrypsin-type
Orthology group: MCL14852, single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210980-TA
ATGTGGAAGTGCTTTCTTTTATTTGTGTTTTTGTTAACTTTCACTCTATCGGAGGGTGATTTATTGCGTACGAAGCGTGGTGTGTATTCGAAGAATTTCTTCGGTGGTGTTTGGGGCAACCGACCACCACTACTTGAAGCGGGCCAGGCCAAGACTACGTGCACATGTAAATGTGGCGAAAGAAATGAAGTCTCCCGCATCGTAGGGGGTGAGGAGGCTGGTGTCAATGAGTTCCCTTGGGTTGCCAAAATGACATATTTTAAAAAGTTCTACTGCGGCGGTATGCTGATCAACGACAGATATGTTCTTACCGCAGCACATTGTGTGAAAGGATTTATGTGGTTCATGATAAAGGTGACTTTCGGTGAACACAACCGTTGTAACGCGACCACGCGCCCCGAGACTAGATTTGTTATTCGCGTCATTGCCAACAAATTCTCTCTCGCCAACTTTGACAATGATATCGCCTTACTTCGTCTGAATGAGAGGGTTCCCATGACTGCTGCTATTAAGCCTATATGCTTGCCAAGTGACGATAGTGACCTCTATGTGGGTGTTAAAGCAGTGGCTGCAGGATGGGGAACGTTGACGGAGGAGGGAAGAGTATCGTGCACACTGCAGGAAGTTGAGGTGCCAGTATTGAGTAATGAAGAGTGTCGCAATACTAAGTACACTTCCTCAATGATCACTGACAACATGCTGTGCGCGGGATACCCCAAGACGGGACAAAAGGATTCCTGTCAGGGAGACAGTGGTGGTCCGCTCATCACAGAGAGAAAGCACGACAAACGCTATGAGCTAATCGGTGTCGTATCTTGGGGTAACGGATGTGCTCGGGTGGGTTACCCTGGCGTCTACACACGGGTTACCAAATACATAGACTGGATTAAGGAAAATACTAAAGACGGGTGTTTTTGTACAGATTAA

Protein sequence:

>DPOGS210980-PA
MWKCFLLFVFLLTFTLSEGDLLRTKRGVYSKNFFGGVWGNRPPLLEAGQAKTTCTCKCGERNEVSRIVGGEEAGVNEFPWVAKMTYFKKFYCGGMLINDRYVLTAAHCVKGFMWFMIKVTFGEHNRCNATTRPETRFVIRVIANKFSLANFDNDIALLRLNERVPMTAAIKPICLPSDDSDLYVGVKAVAAGWGTLTEEGRVSCTLQEVEVPVLSNEECRNTKYTSSMITDNMLCAGYPKTGQKDSCQGDSGGPLITERKHDKRYELIGVVSWGNGCARVGYPGVYTRVTKYIDWIKENTKDGCFCTD-
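
The transcript and protein lengths reported above agree with each other: 927 bp is 309 codons, which is 308 amino acids plus one stop codon. The sketch below shows one way to check this and to translate a coding sequence. It assumes the standard genetic code (NCBI translation table 1) and that the transcript is a complete coding sequence in frame 1. The names `translate` and `lengths_consistent` are illustrative and not part of any database tool. The function writes the stop codon as `*`, whereas this record marks it with a trailing `-`.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with
# each position cycling through T, C, A, G. "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 1; unknown codons become 'X'."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3)
    )


def lengths_consistent(cds_bp: int, protein_aa: int) -> bool:
    """True if the CDS length equals 3 * (protein length + 1 stop codon)."""
    return cds_bp == 3 * (protein_aa + 1)


if __name__ == "__main__":
    # First five codons of DPOGS210980-TA followed by its final stop codon.
    print(translate("ATGTGGAAGTGCTTTTAA"))
    print(lengths_consistent(927, 308))
```

To check the whole record, pass the full DPOGS210980-TA sequence to `translate` and compare the result with DPOGS210980-PA after replacing the trailing `-` with `*`.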