Monarch geneset OGS2.0

DPOGS204095
Transcript: DPOGS204095-TA, 1161 bp
Protein: DPOGS204095-PA, 386 aa
Genomic position: DPSCF300553 + 4701-8919
RNAseq coverage: 95x (Rank: top 62%)
Annotation
Heliconius      HMEL017377        7e-120  78.63%
Bombyx          BGIBMGA013962-TA  6e-107  75.86%
Drosophila      CG5255-PA         3e-46   39.39%
EBI UniRef50    UniRef50_D6WK62   5e-54   45.49%  Serine protease P145 n=5 Tax=Tenebrionidae RepID=D6WK62_TRICA
NCBI RefSeq     NP_001166054.1    2e-57   46.61%  serine protease 120 [Nasonia vitripennis]
NCBI nr blastp  gi|350400611      2e-56   46.81%  PREDICTED: tripartite motif-containing protein 2-like [Bombus impatiens]
NCBI nr blastx  gi|289191335      2e-56   46.19%  serine protease 120 precursor [Nasonia vitripennis]
Group
Gene Ontology:
GO:0003824  1.6e-81  catalytic activity
GO:0004252  2.3e-72  serine-type endopeptidase activity
GO:0006508  2.3e-72  proteolysis
KEGG pathway:
InterPro domain:
[126-370]  IPR009003  1.6e-81  Peptidase cysteine/serine, trypsin-like
[144-362]  IPR001254  2.3e-72  Peptidase S1/S6, chymotrypsin/Hap
[172-187]  IPR001314  9.5e-10  Peptidase S1A, chymotrypsin-type
Orthology group: MCL24753, Lepidoptera specific

Nucleotide sequence:

>DPOGS204095-TA
ATGGAAAGATTGTCTCATGACTTAAACCTGGCGCTGGAAGAGAGTAATTGTGGCAGACAGGAGAGACGGAAACTGGGACTGCGAAGACGTACGAGGTCCGCTGGGAACTTGCCAGCTGCTGTAACAGTAAGTCTGGTCGAAGATGGCAGTTCAAGCAGTCCCCCACAAGCTCCACTCATCACTCAACCTCTCTCCGACTCTGATGATCCACATCTCAGCATACACAAGTCTACCAACCTGAAGAGTAGACACTACTGTCAGATTGGCAACATTGAATCTGATTCATTCAATGAAAATTTCTCACCAACCCGTCCAGCGAATGCGAGAAGGAAACGGAAGTATAAGAAAATGCCAGTTGAATATACGGATGGAAAATGCTCACCACCGATTGAAATATTAAGCATTGCTACCGAGAAGAACGACTGGGAGAGGATCGTGGGGGGAACCAAGGCCCCTAACGGCTCCGTGCCTTACCAGGTGTCTCTACGAATTTGGGGGGTTCGTCATTTCTGTGGGGCCTCCCTCATAGCCCCGAGGGTAATACTAACAGCTGCGCATTGTGTCGATGGTTTAAAACCCCAGTACTTTCAAGCCATAGTGGGGACTAACCAGCTCCTAGCGGGCGGCACCGCTTACACCATCCGTAAGGTCGTAAGGCATAAAGAGTATGACGAAGAAATAATTGTCAACGATATAGCTATCATTTTTACTGATAAGGAAGTAGAATTCAATGATAAAGTAGATGCTATTGAGTTGAATGATGAGCCAGTAGACGCGGGAGAAGAATTACTCTTGACAGGATGGGGTACTACATCTTATCCCGGGCATCTCCCCAACGATTTAATGCAATTACAACTAAAGGCGGTTTCCTACGAAGACTGTAAAGAGGCACATAACTCAACTAACGCTGTGTTCGAAACAGAAATATGCGCAATGACAAAATCTGGAGAAGGAGCCTGTCATGGTGACTCCGGCGGTCCGTTGGTGCGTGAAGGTCGCCAAGTGGGTATCGTGTCATGGGGCATTCCCTGCGCTAGAGGGAAACCGGATGTTTACACGAAGGTGGAATCCTATATGGCGTGGATAGAACAGGTTCTGAATGAGGATGATGAAGACCACCTCCACTTCATACAAGGACACAGTCGCAAAAACAGACATTAA

Protein sequence:

>DPOGS204095-PA
MERLSHDLNLALEESNCGRQERRKLGLRRRTRSAGNLPAAVTVSLVEDGSSSSPPQAPLITQPLSDSDDPHLSIHKSTNLKSRHYCQIGNIESDSFNENFSPTRPANARRKRKYKKMPVEYTDGKCSPPIEILSIATEKNDWERIVGGTKAPNGSVPYQVSLRIWGVRHFCGASLIAPRVILTAAHCVDGLKPQYFQAIVGTNQLLAGGTAYTIRKVVRHKEYDEEIIVNDIAIIFTDKEVEFNDKVDAIELNDEPVDAGEELLLTGWGTTSYPGHLPNDLMQLQLKAVSYEDCKEAHNSTNAVFETEICAMTKSGEGACHGDSGGPLVREGRQVGIVSWGIPCARGKPDVYTKVESYMAWIEQVLNEDDEDHLHFIQGHSRKNRH-
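The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, assuming the standard genetic code applies and that the trailing "-" in the protein record marks the stop codon; the names `CODON_TABLE` and `translate` are illustrative and are not part of the OGS2.0 release. It translates a coding sequence codon by codon, so the result can be compared with the listed protein.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 1.

    Stop codons are written as `stop_symbol` (assumed here to be "-",
    matching the last character of the protein record).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for pos in range(0, len(cds), 3):
        residue = CODON_TABLE[cds[pos:pos + 3]]
        protein.append(stop_symbol if residue == "*" else residue)
    return "".join(protein)


if __name__ == "__main__":
    # First 21 and last 12 bases of DPOGS204095-TA, copied from the record.
    print(translate("ATGGAAAGATTGTCTCATGAC"))
    print(translate("AACAGACATTAA"))
```

The stated lengths are consistent with this reading: 386 residues plus one stop codon account for 3 x 387 = 1161 bp. Passing the full DPOGS204095-TA sequence to `translate` would allow a residue-by-residue comparison with DPOGS204095-PA.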