Monarch geneset OGS2.0

DPOGS215001
Transcript: DPOGS215001-TA (3105 bp)
Protein: DPOGS215001-PA (1034 aa)
Genomic position: DPSCF300256 - 3550-18492
RNAseq coverage: 91x (Rank: top 63%)
Annotation
Heliconius       HMEL010958          3e-103  47.59%
Bombyx           BGIBMGA012217-TA    7e-104  65.56%
Drosophila       modSP-PA            6e-35   32.20%
EBI UniRef50     UniRef50_Q69BL0     7e-120  51.42%  Pattern recognition serine proteinase n=2 Tax=Obtectomera RepID=Q69BL0_MANSE
NCBI RefSeq      XP_001607879.1      2e-53   29.18%  PREDICTED: similar to ENSANGP00000018359 [Nasonia vitripennis]
NCBI nr blastp   gi|39655053         3e-119  51.42%  pattern recognition serine proteinase precursor [Manduca sexta]
NCBI nr blastx   gi|39655053         1e-124  53.19%  pattern recognition serine proteinase precursor [Manduca sexta]
Group
Gene Ontology    GO:0003824  2.8e-61  catalytic activity
                 GO:0004252  1.2e-36  serine-type endopeptidase activity
                 GO:0006508  1.2e-36  proteolysis
                 GO:0005515  1.8e-10  protein binding
KEGG pathway     aga:AgaP_AGAP012372  1e-25
                 K04550 (LRP1, CD91)  maps to: Malaria; Alzheimer's disease
InterPro domain  [766-1034]  IPR009003  2.8e-61  Peptidase cysteine/serine, trypsin-like
                 [776-1030]  IPR001254  1.2e-36  Peptidase S1/S6, chymotrypsin/Hap
                 [15-61]     IPR002172  1.8e-10  Low-density lipoprotein (LDL) receptor class A repeat
                 [254-325]   IPR016060  4e-10    Complement control module
                 [262-322]   IPR000436  5.4e-06  Sushi/SCR/CCP
Orthology group  MCL17556  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS215001-TA
ATGTTTGTTACATTGTTATTTCTGTCCGTCTTTCCAAAAATCTTCACCGCTGTCGTACCTCAAGCTGTTTGTTCTCCCGACGAGTTCACGTGTTCAGACGATGTCTGTATCAGCCAGGGTCTGGTGTGTGATGGTCACAGTGACTGCTGGAACGCAGCTGATGAAATGGCTTGCAACGGACTATCGGACCCGCTCTCCGATTTGATGATCCGCAGACCTAAACGTCAGACGCAAAACTGTCGCAAGAACCAGTGGCAGTGTCGTGACGGCACCTGCATAGGGTTCGACGGTAAATGTGACGGTGTGGTCGACTGTCCCGACTTCAGCGACGAGACCTTCGCGCTGTGCAGGGACATGCAATGCCAGAGCAATTGGTTCCGCTGTACTTACGGCGCCTGCGTCGACGGCAGCGCCCCTTGTAATGGTGTGCAAGAGTGCGCTGATAACTCCGACGAGTTGCTGCCTAGGTGCCGCAATCAAACAATTGGTTCCAGGGGTAAGCACACGTGCGACAATGGTCAGGTGATATCCTCGGTGGATATATGCGATGGGAAGAAGGACTGCGCTGATGGCTCTGACGAGACCCTCGCCACCTGCGCCGGGAACAGCTGTCCGTCATACGTGTTCCAATGTGCGTATGGAGCCTGTGTGGACCAGAACGCGAAGTGCAACAAGGTGGAAGAGTGTGCTGATGGTTCTGACGAAACAGACGAGCTCTGCAACAGGCTGGCGCCGGGTCAGCCGGTGACTCCAGCCACGAGACCACCACCTCAGGGGGGTAATTGTCTGTTGCCTCCATACCCTCAGTATGGGTCGTACAAGGTCAGACAGTACCCCAACGCGGTCCCCGGCCAGAGGTATCCCAACGTGAGGCTGGACGTCACCTGTAACCCTGGCTTCCAGACTGAAAACAATAACAGCATCTTCTGCGATAACGGAGAGTGGTCAGGACCTATGCCAGCGTGTCTCCGTTTCTGCAGGCTTAACAAACACCCGAGCGTGGAGTACCGCTGTCTGTTGTCTGGCAACTCGGTGACAGGGTCCAGAGAGTGTGGCTCATTGGAGCCGTCTGGGACCGTCGTCACCCCCATCTGCCGCTCCCCCAATTACTACTCCTCGGGGGTAATGTCCAACATGCACTGCGTTGAAGGCAGTTGGGACTATATAGCTGTGTGCAAACCAGGTTTGACCAACGTTACAATAAGTATAGATAGTTTAGAAATTATCATAACATCGGATAACGCCCACGTAATAATTAACAATTACGGGAACAAGGAGGTTAAGGTCGTCAACAATATTAGTAACGCTGATAGGATTGTGTTTGAAGACAGTAGAACGACCACCAGTAGACCAACCGCTAGTAGAACGACTACCAGTGGACCGACTAGCGCTAATTATGATAATGAAATCGATGAGGGTGACTGGAGAATGGCCTCCGTTGACACAATAGGTTTCCAAGCTCAGCCCGTCCGGCCCAAAAAGTGCGGTACAATAACTCCTGAGGGTATCCAGCTGGTGATCGGCGGGCGGTCTGCCAAGCGCGGGGAACTCCCGTGGCATGCGGGGATTTACAGCAAATTATTCACACCTTACATGCAGATATGTGGCGGGTCGCTCATCAGTACAACCACTATTATATCCGCCGCACATTGTTTCTGGAGCGACACCAAGAAGCTGCTGCCCGCGTCCGAATACGCGGTGGCTGTTGGGAAGCTGTACCGACCTTACAACGAAAAACACGACGCTGACGCGGAGAAATCTGATGTACGACGAAAATATATCACAAGCAATACGCTTAACAAACACCCGAGCGTGGAGTACCGCTGTCTGTTGTCTGGCAACTCGGTGACAGGGTCCAGAGAGTGTGGCTCATTGGAGCCGTCTGGGACCGTCGTCACCCCCATCTGCCGCTCCCCCAATTACTACTCCTCGGGGGTAATGTCCAACATGCACTGCGTTGAAGGCAGTTGGGACTATATAGCTGTGTGCAAACCAGGTTT
GACCAACGTTACAATAAGTATAGATAGTTTAGAAATTATCATAACATCGGATAACGCCCACGTAATAATTAACAATTACGGGAACAAGGAGGTTAAGGTCGTCAACAATATTAGTAACGCTGATAGGATTGTGTTTGAAGACAGTAGAACGACCACCAGTAGACCAACCGCTAGTAGAACGACTACCAGTGGACCGACTAGCGCTAATTATGATAACGAAATCGATGAGGGTGACTGGAGAATGGCCTCCGTTGACACAATAGGTTTCCAAGCTCAGCCCGTCCGGCCCAAAAAGTGCGGTACAATAACTCCTGAGGGTATCCAGCTGGTGATCGGCGGGCGGTCTGCCAAGCGCGGGGAACTCCCGTGGCACGCGGGGATTTACAGCAAATTATTCACACCTTACATGCAGATATGTGGCGGGTCGCTCATCAGTACAACCACTATTATATCCGCCGCACATTGTTTCTGGAGCGACACCAAGAAGCTGCTGCCCGCGTCCGAATACGCGGTGGCTGTTGGGAAGCTGTACCGACCTTACAACGAAAAACACGACGCTGACGCGGAGAAATCTGATGTGGCAGATATTATAATTCCGTCCCGCTTTCGAGGGTCTGGTGCCAACTTCCAGGATGACATCGCGCTGGTTTTGGTCGTGACGCCCTTCATATACCAGGTCTTCATTAGACCTGTCTGTCTGGACTTCGACGTCAACTTCGACAGAACCCAGCTCTCGGAAGGGAATATGGGCAAGGTAGCCGGCTGGGGTCTGACTGACAAAAACGGTAAAGCGTCCCAAGTGCTGAAGGTGGTAGATCTTCCTTACGTCAAAATTGAGGACTGCTACGCCATGTCCCCGCCGACGTTCCGCGCTTACATCACAAGTGACAAGATCTGCGCCGGTTACACTAACGGCACGACGCTCTGCCAGGGCGACAGCGGCGGCGGCCTGGCGTTCCCCGCCTACGAACTCAACACCCAGAGGTACTACCTGCGAGGCATCGTGTCCACAGCTCCCAGGAACGACGATCTTTGCAACGCCCACACCCTCACCACGTTTACGGCTGTATCGAAACACGAGCATTTCATCAAACAGTACCTCTAG

Protein sequence:

>DPOGS215001-PA
MFVTLLFLSVFPKIFTAVVPQAVCSPDEFTCSDDVCISQGLVCDGHSDCWNAADEMACNGLSDPLSDLMIRRPKRQTQNCRKNQWQCRDGTCIGFDGKCDGVVDCPDFSDETFALCRDMQCQSNWFRCTYGACVDGSAPCNGVQECADNSDELLPRCRNQTIGSRGKHTCDNGQVISSVDICDGKKDCADGSDETLATCAGNSCPSYVFQCAYGACVDQNAKCNKVEECADGSDETDELCNRLAPGQPVTPATRPPPQGGNCLLPPYPQYGSYKVRQYPNAVPGQRYPNVRLDVTCNPGFQTENNNSIFCDNGEWSGPMPACLRFCRLNKHPSVEYRCLLSGNSVTGSRECGSLEPSGTVVTPICRSPNYYSSGVMSNMHCVEGSWDYIAVCKPGLTNVTISIDSLEIIITSDNAHVIINNYGNKEVKVVNNISNADRIVFEDSRTTTSRPTASRTTTSGPTSANYDNEIDEGDWRMASVDTIGFQAQPVRPKKCGTITPEGIQLVIGGRSAKRGELPWHAGIYSKLFTPYMQICGGSLISTTTIISAAHCFWSDTKKLLPASEYAVAVGKLYRPYNEKHDADAEKSDVRRKYITSNTLNKHPSVEYRCLLSGNSVTGSRECGSLEPSGTVVTPICRSPNYYSSGVMSNMHCVEGSWDYIAVCKPGLTNVTISIDSLEIIITSDNAHVIINNYGNKEVKVVNNISNADRIVFEDSRTTTSRPTASRTTTSGPTSANYDNEIDEGDWRMASVDTIGFQAQPVRPKKCGTITPEGIQLVIGGRSAKRGELPWHAGIYSKLFTPYMQICGGSLISTTTIISAAHCFWSDTKKLLPASEYAVAVGKLYRPYNEKHDADAEKSDVADIIIPSRFRGSGANFQDDIALVLVVTPFIYQVFIRPVCLDFDVNFDRTQLSEGNMGKVAGWGLTDKNGKASQVLKVVDLPYVKIEDCYAMSPPTFRAYITSDKICAGYTNGTTLCQGDSGGGLAFPAYELNTQRYYLRGIVSTAPRNDDLCNAHTLTTFTAVSKHEHFIKQYL-
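
The transcript and protein lengths listed in this record (3105 bp and 1034 aa) are consistent with each other if the transcript is assumed to be coding sequence only, from the start codon through the stop codon: 3 x (1034 + 1) = 3105. The following is a minimal sketch of that arithmetic check. The function names are hypothetical and are not part of any tool associated with this geneset.

```python
def expected_transcript_length(protein_aa: int, includes_stop: bool = True) -> int:
    """Return the coding-sequence length in bp for a protein of `protein_aa` residues.

    Assumption: the transcript holds only coding sequence (no UTRs), so every
    residue corresponds to one codon, plus one codon for the stop if present.
    """
    codons = protein_aa + (1 if includes_stop else 0)
    return 3 * codons


def lengths_consistent(transcript_bp: int, protein_aa: int) -> bool:
    """Check whether a listed transcript length matches a listed protein length."""
    return transcript_bp == expected_transcript_length(protein_aa)


if __name__ == "__main__":
    # Values taken from the record header: 3105 bp transcript, 1034 aa protein.
    print(lengths_consistent(3105, 1034))
```

The same check can be run against the lengths of the FASTA sequences themselves, which is a quick way to spot a sequence that was truncated or duplicated during extraction.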