Monarch geneset OGS2.0

DPOGS214422
Transcript        DPOGS214422-TA   5055 bp
Protein           DPOGS214422-PA   1684 aa
Genomic position  DPSCF300069 + 430410-450604
RNAseq coverage   8087x (Rank: top 2%)
Annotation
Heliconius        HMEL010874        0.0    61.01%
Bombyx            BGIBMGA011362-TA  0.0    52.36%
Drosophila        scaf-PA           3e-42  36.74%
EBI UniRef50      UniRef50_D2A6A6   3e-61  48.13%  Serine protease H164 n=2 Tax=Tribolium castaneum RepID=D2A6A6_TRICA
NCBI RefSeq       XP_966561.1       2e-62  48.13%  PREDICTED: similar to AGAP008091-PA [Tribolium castaneum]
NCBI nr blastp    gi|91085063       4e-61  48.13%  PREDICTED: similar to AGAP008091-PA [Tribolium castaneum]
NCBI nr blastx    gi|270009295      4e-59  48.12%  serine protease H164 [Tribolium castaneum]
Group
Gene Ontology     GO:0003824  2.7e-46  catalytic activity
                  GO:0004252  5.6e-30  serine-type endopeptidase activity
                  GO:0006508  5.6e-30  proteolysis
KEGG pathway      ecb:100056650  3e-17
                  K01324 (KLKB1) maps-> Complement and coagulation cascades
InterPro domain   [1182-1442]  IPR009003  2.7e-46  Peptidase cysteine/serine, trypsin-like
                  [1200-1429]  IPR001254  5.6e-30  Peptidase S1/S6, chymotrypsin/Hap
Orthology group   MCL25840  Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214422-TA
ATGAAGCCACTGCTATCAGTGGTGTCGCTGTGCCTTCTGCTGTGCGTACACGCGCTGCCGGAAAACATCGAAGACGTCAAAGAATTACCTCAATCCAAAGAACCGATAGTGGAAGCAGAAGAATCGAAGCCTCAAGCGAGAGCCGAAAGATGCACGACCTGTAGCACACTTAAACTAGGCCTTAAGTCCCCAAAAGAAGTATTAGCAGCCTTGCATTCCTTGCCGGGAGCGGAAGTACACACTCAGCAGTCCTTCGAAGGCTGCTCCAGCGATAAAGGATGTGCAGGACTGAAGCTCAAGGACGGAAAAGTCATAGAACGTTTCGGGAACGTAGAGGCCTTCAAAGCCGCAGCTGCCTCCGATGTGAACAACGAGTTCAACTTCCACGCCGGTTTCGGAAGCGATTTGTTCAAAAGCGCCAACTCACCATTCTGGTGGATGAACCAAGACAGTCCTTTCAATGGCGGAGCTAACGGTGCTAGCTTTGAGAAATTCAGCAAATCCTCAAGTTTCTCCTCCGGTAGTGGAGGTAACGCTGCATTCACTGGAATGGACCTATCAGCTAACCCGTTCTTAAACGGACAGTTTTCCAACCTGGGTCTTGCTGGTGATGCTAGCTCTGTAAATCCCTTCCAATCATCAACCTTCGAGTCATCGGCATTCAGTGCCTCTAGCAAAACCGGTCAGGCTGGACTCCAAGGTTTCGGAGCTCAAAACGCTGCTTCGAACTTCGGAGCTAGTGCCTTCAACTCTGGTTTCGCTGGCAATAAATTATCTGGCTTCAGTGGTTCAAGCCCAGCACCTTTCGGATCTGGAGCTAACGTCAACCTCATTCAGAACGCCCAAAAGAACGACTTCGATTTCGAGCAACAGCAAACTCAACAGAACATCGACGAAATCTTCCAAAACGCTGGAAACCTCGGCGTGGACGCGGGGGTTACAGCTGGGGAGTTGCAGCAGACCTGCTCCGGCTTGGGATACGCCTGCGTGCTGAAAACACAATGCAACAACGGCGTCGTAAACATCAACGGAGCGGCTGCGCTACAAGCTAAAACTAAGAAGCAATACTGCAACCTTGCGACGGAAATATGCTGCAGAATCGAGACCGCTCAAGGAGCCGTTGGATCTACCGCCGGCCAGGGTTCTGGTCTCTTCGCTGGACAAATTGGTTCCGTATCCGGAGGTTACGGCAGTCAGACAACCCAAAGTGGATTTTCAAACGGATTTGGTGCTAAAGGCACCTTTGCAACCGGAGCTTCTTTCGGATCTGGCATCGGATCTGCCAACAGAGGCATCACGGTCGAATCCACTAAATTCGGATCTGGATATGGCTCGACTCTCGCACCCACCACATCCAGATTTGGATCCAACGGCTTCAAATCCACGAGCCAAACGAACTTCATCGACGCTGATTCACTAACCGCTGGCAGTGAAGCTGCTGGTGTTTACCGACCTGGTGCTGTCGGATCTGGTTTGAAACCTGGTATCCCCTACCTCCCACCCATCGACGTCACAGGCAGTGGCAGTAATGTCGTCTCCACAACCGTCTTCCCTACCCCTACCATAATCACGACTCCTAGACCATTCACAACCCCAAAACCGACCTATCTGCCCCCTATTTCATCAACCTCAGCCCCAGGTTACTTACCACCTATCGGGGAACCAACCAACAACAGAGAGACCATCGTCCCTAAACCTGATTATCAAGATGGTTCTATAATCCTGGACGAAAACAGATTCCCCACAGCTAGACCTACCCCCGTGCCTGCACCGAGTGAAATCCCCGCTGGATGTGCCGCCGCCCTAAAGTGTACTGCCGTCGAGTTCTGCACAGCTGAAGGTGTGATCTCAAACGTTACTGTCTTTTTGACCAGAGATCAAGAGGCTTACAGAGTACCTCTCACGGATTGCCGTGACTTGGAGACTGGACGCATTGGTAAATGCTGCCGGGATCCTTACTACACCGACCCCTGGCCTGTGAACCAGCTGGGTAAGTGGGT
GCCCGGGGTATTCGGGGGTAACGACGGTAAATACGTTCCGGATAGCAGAGTTAGTCCAAACAATATCAGACCCAGTGTCACGGTCCGCCCTCCTGTCACCGGTTCCGTCATATCACCAGCCTTCCTGACTAAACCCACGCCTACACCATTTGGGCCCAACCAAGTTTCTCCTGGTTTTGGCTCCACTGTAACTCCATTGAATCAGAGAGGTCAGGGTCAGTTCCCTATCGGAGGTCAAGGACAATACAATAAAGGTGGTCTGGGACAATTCTCTCAAGGGGGACAGGGGCAATTCACATCAGCTGGACAAGGACAACTTGGCATCATAGGACAAGGTCAAATTGGATCTGGATCTGCAATCAACACAGCGTTCGCCCAAGGACAGGTTGCACAAAAGGGACAAGGGTCGTTTGTGTCCCAAGGACAGGGAGTGGTTGCATCCAGGGGCCAAGGTCAAGTGGTAAACAGAGGTCAAGTTAGCAAGGGACAAGGTTTCTTGGTGAATCAGGGTGCGGGGGTTGGAATCAATAAACAGCAGGGACAGTTCGTCAGTCAGGGTCAAGGACAAATAGTGTCCCAAGGTCAAGGACAAATTGTTTCCCAAGGAGTGGGACAGGGAGTCAGACAAGGAGTCGGGCAATACGGCCAAGGACAGCTTGGTATCCAAGGACAAGGTGTCCAGTCGCAATTTGGTGCTGGGCAAAATGGCTTAGGCGTAGCAGCAATTGGAGCACAAGGAGTGAACGGTCAGGGACAGCTCGTAAATCAAGGACAGGGCCAATTCGTATCAAAAGGTCAAGGCAGCGCTATCAATCAAGGATTTGGTACTGGCATCCGTCAGGGAAGCGGCGTGGTTGCGTCTCAAGGATTCGGGCAGGGAGTGCGACAAGGACAAGGCACGGTTGTGTCGCAAGGATTCGGTCAGGGAGTCCGTCAAGGACAAGGACTCCTGGTCAATCAGGGAGAGGGACAAGTATCTTCGCAAGGACAGGGACAATTTGTAAGCCAGGGTCAAGGACAACTCCTAAATCAGGGACAGGGACAATATGTATCGCAAGGTGAAGGACAGCTAGTCTCTCAAGGACAGGGTGCTCTTGTGTCCCAGGGTCAAGGACAGCTGGTCTCTCAAGGACAGGGTGCTTTTGTATCCCAGGGTCAAGGATCTCTTGTTTCCCAAGGATTCGGACAAGCCATCCGTCCAGGACAAGGCGCTTTCCTGACTAATGGCCAGGGACAAATAGTCTCTCAAGGAGGAGGAGCTCTGATCAATCAGGGTGAGGGAGCATACGTCACAAATGGCTTCGATCAAATCCGCCGAGCTCAAGCCCAACTCGTATCTACAAAGGAAGGGCAGTTGGTTACGCAAGGGGAAGGAGAGCTTGTTTCACAAGGCCAGGGGCAGAGAGTGTCGCAAGGATTCGGTCAGGGTGTCCGCCAGGGGCAAGGATTCTCTGTGACGCAGGGCGGAGGGTATGGTGTTGAAAACGAGTACGGTGAATCAGTGCAGAGGGTTTTCCTTCAACAGTACAACGCTGGAGGACAATGTGGTGTTCTGAATGGCCAACGTCCTTTTGGCAACCGCAATGAATTGGAAGCCGATTTCGCTGAGATACCCTGGCAGGCGATGGTGCTGTTGCAAACTAACAGAAGCCTGCTGTGCGGCGGAGTCATCACCAGACCTGATGTGGTCGTAACCTCAGCCGCCTGTGTTGAAGGCCTGGATGCCAAGAACGTGCTGATTAAAGGAGGTGAATGGAAGCTCGGGATAGACGACGAGCCTCTGCCGTTCCAGATCGTCCAGGTCAAGACGATTCTCCGCCATCCGCTGTACAAACACAGCAACCTCCACTACGACGCTGCTATCCTGGTACTCGCTGAGAACTTGAGATTCGCTAAAAACATCTATCCCATCTGTCTCCCTGACAAGGATGACAGTTTGGACAAATACTACAACGGCGTCGGAGAGTGTATCGTAACGGGATGGGGCAAGCAAGTCCTCC
AAGCTCACCTTCAAGGCAGTATAATGCACAGCATCAACGTCTCGCTCATCAGCCCAGGTGAATGCCAGTCCAAATTATCATCAGAATACCCTCACCTCCTGGACCTGTACGATGAAGACAGCTGCGTCTGTGGCCAACCTTCGAACCCTCTAAATAATATTTGCAGGGTTGACATTGGCAGTGCTCTTGCCTGCACGACTGGCGACGGTCATTACACCTTCCGAGGAGTGTACTCCTGGGATTCCGGATGTCAAGTCGGAAACCAAGTGGCTGGTTTCTATAGATTCGACCTGGAATGGTACCAGTGGGCCATCGGTCTCATCGAAAGCGTCAGATTCGCTCAATACAGTACAGTTACCAAGGTCACCACGGGGATATACACTGGTCAAATAAAGGGTGGAGTGAAGGGCTTCTCTGGAGTCAAAGGAGTCAAGGGTTCGTCAAACTCTGGCTCATCCATCAGAGCTGGTGCTGTAGCTTCAGTTTCATCTGGAGCAGTCTCTTCGGGATCATCAGGAGTCATAAGTGGCCTTAATAGCTTCAACTTTGGAAAAGGTCAATTCGGATTTGGACAAAGTCAAGGCCAGCTATCTGGTAACCAAGGGCTGGTCAGTCAGGGACAGTTCGCTGGTAAAGTGAACCAGTTCCAGGAAAAAATAAACAGTGGTAGCTCAAGCCAAGCCGGTTTTGGAGATGGATTCAACTTCAGCGAAATCAAACCGATCACTAACGGCTTCAGCGCCACCTTCTCCGAGAAGAAGGTCTTCAAGACCGAACCGAAATTCGTGACATTCACAACGAAACCAGAGATCGTGACGTATACAACTAAACCAGAAATCTTCACATTTACAACCAAACCCAAAATTATTACTTACACAACCAAACCCAAAATCATAACCTACACAACCAAACCCCAGATCATCAGATACGAGACATCCGGCAGTGGGACCAACCCCCAATACGTAGCCCCAGGGGTGACCTTCAACCCCTCCTTTTCAGAATTAGTGGGTAAGCACGAACACACAGCCAAATGCAAATGTTTAGAAGGTAAATGA

Protein sequence:

>DPOGS214422-PA
MKPLLSVVSLCLLLCVHALPENIEDVKELPQSKEPIVEAEESKPQARAERCTTCSTLKLGLKSPKEVLAALHSLPGAEVHTQQSFEGCSSDKGCAGLKLKDGKVIERFGNVEAFKAAAASDVNNEFNFHAGFGSDLFKSANSPFWWMNQDSPFNGGANGASFEKFSKSSSFSSGSGGNAAFTGMDLSANPFLNGQFSNLGLAGDASSVNPFQSSTFESSAFSASSKTGQAGLQGFGAQNAASNFGASAFNSGFAGNKLSGFSGSSPAPFGSGANVNLIQNAQKNDFDFEQQQTQQNIDEIFQNAGNLGVDAGVTAGELQQTCSGLGYACVLKTQCNNGVVNINGAAALQAKTKKQYCNLATEICCRIETAQGAVGSTAGQGSGLFAGQIGSVSGGYGSQTTQSGFSNGFGAKGTFATGASFGSGIGSANRGITVESTKFGSGYGSTLAPTTSRFGSNGFKSTSQTNFIDADSLTAGSEAAGVYRPGAVGSGLKPGIPYLPPIDVTGSGSNVVSTTVFPTPTIITTPRPFTTPKPTYLPPISSTSAPGYLPPIGEPTNNRETIVPKPDYQDGSIILDENRFPTARPTPVPAPSEIPAGCAAALKCTAVEFCTAEGVISNVTVFLTRDQEAYRVPLTDCRDLETGRIGKCCRDPYYTDPWPVNQLGKWVPGVFGGNDGKYVPDSRVSPNNIRPSVTVRPPVTGSVISPAFLTKPTPTPFGPNQVSPGFGSTVTPLNQRGQGQFPIGGQGQYNKGGLGQFSQGGQGQFTSAGQGQLGIIGQGQIGSGSAINTAFAQGQVAQKGQGSFVSQGQGVVASRGQGQVVNRGQVSKGQGFLVNQGAGVGINKQQGQFVSQGQGQIVSQGQGQIVSQGVGQGVRQGVGQYGQGQLGIQGQGVQSQFGAGQNGLGVAAIGAQGVNGQGQLVNQGQGQFVSKGQGSAINQGFGTGIRQGSGVVASQGFGQGVRQGQGTVVSQGFGQGVRQGQGLLVNQGEGQVSSQGQGQFVSQGQGQLLNQGQGQYVSQGEGQLVSQGQGALVSQGQGQLVSQGQGAFVSQGQGSLVSQGFGQAIRPGQGAFLTNGQGQIVSQGGGALINQGEGAYVTNGFDQIRRAQAQLVSTKEGQLVTQGEGELVSQGQGQRVSQGFGQGVRQGQGFSVTQGGGYGVENEYGESVQRVFLQQYNAGGQCGVLNGQRPFGNRNELEADFAEIPWQAMVLLQTNRSLLCGGVITRPDVVVTSAACVEGLDAKNVLIKGGEWKLGIDDEPLPFQIVQVKTILRHPLYKHSNLHYDAAILVLAENLRFAKNIYPICLPDKDDSLDKYYNGVGECIVTGWGKQVLQAHLQGSIMHSINVSLISPGECQSKLSSEYPHLLDLYDEDSCVCGQPSNPLNNICRVDIGSALACTTGDGHYTFRGVYSWDSGCQVGNQVAGFYRFDLEWYQWAIGLIESVRFAQYSTVTKVTTGIYTGQIKGGVKGFSGVKGVKGSSNSGSSIRAGAVASVSSGAVSSGSSGVISGLNSFNFGKGQFGFGQSQGQLSGNQGLVSQGQFAGKVNQFQEKINSGSSSQAGFGDGFNFSEIKPITNGFSATFSEKKVFKTEPKFVTFTTKPEIVTYTTKPEIFTFTTKPKIITYTTKPKIITYTTKPQIIRYETSGSGTNPQYVAPGVTFNPSFSELVGKHEHTAKCKCLEGK-