Monarch geneset OGS2.0

DPOGS208871
Transcript: DPOGS208871-TA (1302 bp)
Protein: DPOGS208871-PA (433 aa)
Genomic position: DPSCF300009 - 1556376-1558722
RNAseq coverage: 123x (Rank: top 57%)
Annotation
Heliconius       HMEL017104        2e-126   51.16%
Bombyx           BGIBMGA008101-TA  2e-36    55.28%
Drosophila       gd-PA             6e-25    26.58%
EBI UniRef50     UniRef50_A1IIA6   2e-118   47.59%   Serine proteinase n=1 Tax=Samia cynthia ricini RepID=A1IIA6_SAMCR
NCBI RefSeq      XP_320261.4       2e-43    37.45%   AGAP012277-PA [Anopheles gambiae str. PEST]
NCBI nr blastp   gi|121308868      6e-118   47.59%   serine proteinase [Samia cynthia ricini]
NCBI nr blastx   gi|121308868      5e-117   47.59%   serine proteinase [Samia cynthia ricini]
Group
Gene Ontology:
GO:0003824   1.2e-58   catalytic activity
GO:0004252   8.8e-38   serine-type endopeptidase activity
GO:0006508   8.8e-38   proteolysis
KEGG pathway 
InterPro domain:
[157-424]   IPR009003   1.2e-58   Peptidase cysteine/serine, trypsin-like
[169-419]   IPR001254   8.8e-38   Peptidase S1/S6, chymotrypsin/Hap
Orthology group: MCL25422 (Lepidoptera specific)

Nucleotide sequence:

>DPOGS208871-TA
ATGAAAGCGATATATTTAGTTTTAGCGTTTTTTGTTTTGGAGGTGCATTCAATGTCCCTCGTGGGGCCGGTTGCGACATGGTACCGGCCGTGCGGACTTGGAGCTGTGTTTTTTAAAAACCTTACTTCAAACTTTTGGCTCGCAAAAATAAATGTCAGTTTATATAATTTAGACAAAGCTAATATATCGATTCAATTCGAACAAGAAGTGCAGATTGTGGCGGTCCCAATAAAGTCTTTAATCAAGTTTTATAAAAAGTCGAATACTTACACATTTAATTCGCTGGAAACGATTCCTAATGAATATAGTATTTATATGAAAATTGTCAACGGCTCCGGTACGGACATACCGAAAGTTTCAAGTATTAAATTGAATAATATGGTGCTGTGTAACGAAACTGTAAAGAATACGGTAAATGTTAAGTCCTACAATGTTACTAAAGAAAATGACAACACCTTTAAATATATGTGCGGCCACCGATCGTTAAAAAGCTCAGAAGTTAATCAAGTCATGGGAGATGCCAAGGCTGGTGACTGGCCCTGGCATGTAGCTATACTGATAAGGAGAGGAACAAAAAATCTCGCCAACTACCAATGTGGAGGAACTATTATTTCTAGTACCGCAATTCTCACTGCCGGTCATTGTGTTTTCATAAATGGGACACGTATTGAAAGTGAAAAACTTGTAATCGAAGCTGGTGTGGTCGATCTCAGGGCAAAAGACCAAAAAGGAAAACAAACACTAAACGTTGACAAAGTGATTTTGCACCCCGAGTACAGTATAGAACACGCAAGTTCAGATCTCGCTATTCTTGTGGTCAATAAACTACGGTACACTGAATATGTCCAACCAATTTGTATTTGGGGGCCAGTGTATGACAAAATAACGCTCTTTGGGCGGACAGCTATGATTACTGGGTTTGGAACAACAGAAAACGACGTACTTTCAAACACTCTCAGATCTGCGTACACTACTATACAAAATGACACCACTTGCATTGCGTTCAACCAAAATTTATATTCAAAATTGCTAAATGAATTCACATTTTGTGCTGGTTTAGGACCCGAAGTTGGAGTGAATCCTCGCAACGGGGATAGTGGGGGAGGCTTAACGGTACCAGTGGTGCGAGCTGACAACAAAGTGACCTGGTTTCTACGAGGTGTTCTGTCCAAATGTGGCTTACCTACCGGTCACAAATTATGCTCTCCTAATTTTTACGTAGTTTACACAGATGTTGCTCCTCATTACGGCTGGATATACCATAACGCGGGATTATATTTTTCAAGTAACATCATTTATTGA

Protein sequence:

>DPOGS208871-PA
MKAIYLVLAFFVLEVHSMSLVGPVATWYRPCGLGAVFFKNLTSNFWLAKINVSLYNLDKANISIQFEQEVQIVAVPIKSLIKFYKKSNTYTFNSLETIPNEYSIYMKIVNGSGTDIPKVSSIKLNNMVLCNETVKNTVNVKSYNVTKENDNTFKYMCGHRSLKSSEVNQVMGDAKAGDWPWHVAILIRRGTKNLANYQCGGTIISSTAILTAGHCVFINGTRIESEKLVIEAGVVDLRAKDQKGKQTLNVDKVILHPEYSIEHASSDLAILVVNKLRYTEYVQPICIWGPVYDKITLFGRTAMITGFGTTENDVLSNTLRSAYTTIQNDTTCIAFNQNLYSKLLNEFTFCAGLGPEVGVNPRNGDSGGGLTVPVVRADNKVTWFLRGVLSKCGLPTGHKLCSPNFYVVYTDVAPHYGWIYHNAGLYFSSNIIY-