Monarch geneset OGS2.0

DPOGS213109
Transcript: DPOGS213109-TA, 1014 bp
Protein: DPOGS213109-PA, 337 aa
Genomic position: DPSCF300016 + 141063-142076
RNAseq coverage: 47x (Rank: top 71%)
Annotation
Heliconius:       no hit listed
Bombyx:           BGIBMGA012478-TA   6e-06   24.75%
Drosophila:       yip7-PA            7e-10   26.24%
EBI UniRef50:     UniRef50_A7UVF5    2e-09   29.91%   AGAP011918-PA n=1 Tax=Anopheles gambiae RepID=A7UVF5_ANOGA
NCBI RefSeq:      XP_001605614.1     3e-13   26.41%   PREDICTED: similar to serine protease [Nasonia vitripennis]
NCBI nr blastp:   gi|195588098       5e-10   28.28%   GD13161 [Drosophila simulans]
NCBI nr blastx:   gi|195588098       1e-10   28.64%   GD13161 [Drosophila simulans]
Group
Gene Ontology:    GO:0003824   4.9e-23   catalytic activity
                  GO:0004252   3.8e-20   serine-type endopeptidase activity
                  GO:0006508   3.8e-20   proteolysis
KEGG pathway:     oaa:100087415   4e-07
                  K01334 (CFD) maps-> Complement and coagulation cascades
InterPro domain:  [73-287]   IPR009003   4.9e-23   Peptidase cysteine/serine, trypsin-like
                  [72-256]   IPR001254   3.8e-20   Peptidase S1/S6, chymotrypsin/Hap
Orthology group 

Nucleotide sequence:

>DPOGS213109-TA
ATGAAGGGAAAATTGAGTATTTTGCGATTTTTAGTTCTTATATTGATTTTATTAAAGTTTAATGAAATTTATGCTAAAGGAAAAGAGAGAAGTTTAGATAATCAAGGCGACGATGATGAGAATGATGATTCCATGAATATTGCTGATGTTCCTGAACCGGAAGCAAATGTTACAGGTTGTACCCGAAAAAAGGCAATATTAATGCATGAGATTAGAACCGCCACACAGAATCAATTTCCATTTATGGCTTCGATAATGTCCTCTAAAAATGAATACGTCTGTGCTGGAACTGTAGTTGCCAATGGTCTTATTCTTACCACAGCTCAATGTACTGACTCAATAAGCTACGTGTTAGTAAACGGAACGAGTGATAAAAAAGATGATACTACATTTTCGCTGCATGTAATCAAAAGTGAAAATTTTCCGTCTTATACTGGACCTAACACCGGAAAAGATGTGGCAGTTATTTATACCGAAAAACATAACAATACATTAGCTTCCAGGATTAATCTCAGTAACTATACTTCTGCTCAACTTTTAAATGACGTTGAAGTTTTAGGATTTGGCTTAAACGCCGAAGTAGGACAGCCAAAAAAATTACAGTATGTGGGCTTAGAAGCTAGAGAGAAAAATGAAGACGTAATTAGGGCATACATAGATTGTATTGAAACAAAAGTTTCAACATGCTTTAGAGACAAAGGAGGTCCAGTTTTATTTAATAATGAACTCGTTGGAATAGTAACTAAAGGTCAACCAGAGTGTACTGTAGAAATATTATCAACGTACGCTTTGAATAAACATATGGTAGATGCTCTACCAACGTATTCTTTTAAAGCTTGGATCGATGAGAAAATAGCGAAAGTTGCCGAGCCTGGAGTGGAAGCTTTAGAGGCGTATCCAAAGAAACCTAGCTATAGAGAAGCGGTGCTGCATATAATGACAAGCTCGGGCTATTTAAAAAGTAATAACGTTTTCTTTATTACAATTGCTTACACATCAGTATTATATTTTTAA

Protein sequence:

>DPOGS213109-PA
MKGKLSILRFLVLILILLKFNEIYAKGKERSLDNQGDDDENDDSMNIADVPEPEANVTGCTRKKAILMHEIRTATQNQFPFMASIMSSKNEYVCAGTVVANGLILTTAQCTDSISYVLVNGTSDKKDDTTFSLHVIKSENFPSYTGPNTGKDVAVIYTEKHNNTLASRINLSNYTSAQLLNDVEVLGFGLNAEVGQPKKLQYVGLEAREKNEDVIRAYIDCIETKVSTCFRDKGGPVLFNNELVGIVTKGQPECTVEILSTYALNKHMVDALPTYSFKAWIDEKIAKVAEPGVEALEAYPKKPSYREAVLHIMTSSGYLKSNNVFFITIAYTSVLYF-
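
The lengths reported in this record can be cross-checked with a little arithmetic. The sketch below is only an illustration, not part of the database record. It assumes the genomic coordinates are 1-based and inclusive and that the transcript length counts the terminal stop codon (the nucleotide sequence above ends in TAA and the protein ends in "-"). The function names are hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Length of an interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


def expected_protein_length(cds_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by a coding sequence of cds_bp bases.

    Assumes the sequence is in frame from its first base; the stop codon,
    if counted in cds_bp, does not contribute a residue.
    """
    if cds_bp % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if includes_stop else codons


# Values copied from the record above.
transcript_bp = 1014
genomic_bp = span_length(141063, 142076)
protein_aa = expected_protein_length(transcript_bp)

print(genomic_bp)   # 1014
print(protein_aa)   # 337
```

Under these assumptions the genomic span equals the transcript length, which would be consistent with a gene model without introns, and 1014 bp gives 338 codons, that is 337 residues plus the stop, matching the reported protein length.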