Monarch geneset OGS2.0

DPOGS206224

Transcript: DPOGS206224-TA, 3180 bp
Protein: DPOGS206224-PA, 1059 aa
Genomic position: DPSCF300334 - 134702-140422
RNAseq coverage: 32x (Rank: top 75%)

Annotation
  Heliconius: HMEL011231 | 0.0 | 69.60%
  Bombyx: BGIBMGA009743-TA | 3e-89 | 70.23%
  Drosophila: CG8213-PC | 3e-158 | 56.02%
  EBI UniRef50: UniRef50_F4WIT9 | 3e-174 | 38.10% | Serine proteinase stubble n=1 Tax=Acromyrmex echinatior RepID=F4WIT9_ACREC
  NCBI RefSeq: XP_002048702.1 | 3e-163 | 59.80% | GJ21187 [Drosophila virilis]
  NCBI nr blastp: gi|332025727 | 1e-173 | 38.10% | Serine proteinase stubble [Acromyrmex echinatior]
  NCBI nr blastx: gi|91087681 | 0.0 | 41.05% | PREDICTED: similar to CG8213 CG8213-PA [Tribolium castaneum]

Group:

Gene Ontology
  GO:0003824 | 2.3e-87 | catalytic activity
  GO:0004252 | 4.1e-79 | serine-type endopeptidase activity
  GO:0006508 | 4.1e-79 | proteolysis

KEGG pathway:

InterPro domain
  [802-1057] IPR009003 | 2.3e-87 | Peptidase cysteine/serine, trypsin-like
  [812-1052] IPR001254 | 4.1e-79 | Peptidase S1/S6, chymotrypsin/Hap
  [845-860] IPR001314 | 2e-11 | Peptidase S1A, chymotrypsin-type

Orthology group: MCL25643, Lepidoptera specific
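
The transcript and protein lengths listed in this record (3180 bp and 1059 aa) can be checked against each other. The following is a minimal sketch, not part of the original record. It assumes the transcript is coding sequence only, with no UTRs, and that its last codon is a stop codon. The function name `expected_protein_length` is hypothetical and used only for illustration.

```python
def expected_protein_length(cds_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by a coding sequence.

    Assumes cds_bp counts only coding bases (no UTRs). If includes_stop
    is True, the final codon is treated as a stop codon and not counted.
    """
    if cds_bp <= 0 or cds_bp % 3 != 0:
        raise ValueError("coding sequence length must be a positive multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if includes_stop else codons


# Lengths taken from the record above: 3180 bp transcript, 1059 aa protein.
transcript_bp = 3180
protein_aa = 1059
print(expected_protein_length(transcript_bp) == protein_aa)
```

Under those assumptions, 3180 bp is 1060 codons, one of which is the stop codon, which agrees with the 1059 aa reported for DPOGS206224-PA.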

Nucleotide sequence:

>DPOGS206224-TA
ATGACATCACTCTTAGTCATTTCACTGGAATTCCAAGAAGACAGAAAACTGTTTGGCGGTTACAAAATAACACCGTCGTATTGTAAGGCAAGCAGAGCGGCGAGGTACAACCGGGGTAACACTATATGCATGTTCAACCACGAGTGTGTTCGAAGAGGTGGTGAAGTGGTCGGTTCCTGTATGGACGGCTTCCTGTTTGGGGCGTGCTGCCAGTTACCTTCCAGCAGTCAGTCACATATACCAAAAGGACCCGGTGTTGTTATGACGAGCTACATAGACTATCCGGATGCAGAAGCAGAAACTGACGATTACGACGCTGAACATCTAAGCGCTTACCATAACAGCTTCAAACCTGTAGTGACGCCGGGATATAAGCCTAGTAGTAGCGTTTCGACGTCCACTGTGAGGTCTGAGTCAGACGCGAGCACGGAGATTCACCAAGCTGAAATAATATCCGAAGGTCTCACCCAAATCACAAACACCTTACTAAGTACGCCGCCCAAGGAGGACAGTTTCATATACATAAAGCCTCAAGGAGTCTACACACACAGCACGATCAGTCACCCGGTCGCTGATACTATTCTGTTCCACAAGAACGGTTCGATGGTTGACGACATAGCGAGACCATCAGACTTCAATGTACAAATATCTTCAATGCAAACCAAACCCACCGTGTCGCCGAGCACGAGTTCGGTCTCGTCTCCCGGTATAATAGTTTCGTCGACGCACAGACCTATATTCAAACCGAAGCCGAATAACAAAGTATCGACAAAACGACCGACCACCGACAACTATGTCATGGTTCAGACGGTCACCAAAGACGCTCAGAAGGTGCCGGAGCTGTCTTCGATTAACAGCATCATACAGATGCTCAATGACAGTACTCCGAGTCTTAGTGATGATGTAAGTTCACCCTCTTCGATCGATGTCATGGAAACTAAATCCTCGCCGAGCCCTTCCACAGTCACTCCAGTGTTGTACAGCAGCAGTTACCCTATTTTCACAACCGGACACTACGTCACTTTAAAACCATCTTCATTTATAAGTAGCGTCTCACCGATAGCTGGTACAAAGAAACCGTTAACAACTAAAAAACCGTACATTACAATAAACACCACACCGAACAGTGCGGGAGGAAAGCCGTCGAAGCCTTATAATTCTTCTCCCAGACCAAACCAATCAACCAGTCAGGCTATCGAAGCTTTCAATAACTATCCAACCGATCCGCAAGACTTTGGACAATCAATCACCACATTCAGCTATGTGAGTTCAACGACAACTTTGAAACCAACGTCCACGACCAGAAAACCGCCTTCGACGAGTTACGTAACCGGATCGAAACCCTTAAGAAGACCAGCTACTCCGCCGACGAGTTTCGTATCTTCCTATGAAGCTGCATCAGACACTTTCTCGAGTGTGACCCCAACCGTTATAGTGCTAAATGGACTCAGCACAAAACCAGAATCCTCATCAGAGGATACGGAATTTGTTGAAATATCACAGGAGCCCTTCAAGAAACCAGTCAGCCAAATTACTGTAAACAACCATATAGAATCTACAAACAATATCTACATGGGTAAACCGCCGCAGACGTACGATCAACCGAAGCCTTCGAGACCATCTTCTCCTACCGTTGTCATAACCCCTAAACCCTCACCAACCACGCCCTATCCCATCAAAGGATCAACTCGTCCCGTTCCAATCACACCGAACGTGCCTCTCTACGATTCCTACCCAGACTTCTCACCGACAACGACCTCTAAAACAGAAATGCAGACCTCTCCCGATGACCTCATAAACTTCCCTCCCGTCAGGAATCCTCTCCTCAACGCGACGGGATCCAACCCTGCTCTGTATAACACGAGCGTAGCCATTGACAACGACTTAGATATTCTACACGACGTAGACTTCTCGACGCCGACCTGGCAGGACGACGAGAAGCTGGGCGAGAAAATGAACTTGTTCGTTAACAAGATCGTCGGCAGCCTTCAGGGCTCGTT
CCAGGATCTTCACGACATAGTTGTGTTGGATAAGAAACCCAGTTCCACACTGAACCGTGACAAAACGACAACCGCCAAGCCGCCGAAGAAAACCGTGCCAACAAGAAAACCTGTTACCACCAAGAAACCTTTGAGATTGTCCACAACGTCCAAGAAGCCTCCGGTGAAGACGACGAAGAAGCCGCTCAAGACCACCACCGTCCCCAAGAAACCCACCACGATCACCACTCAGACGCCCACCACCACCGTTATAACCACGACCACAACCAAAAAGCCGGTGACCACCACCAAGAAACCCATCAAGAGAGTGACCACCAGCCTCGTCACCACCGTCACAGAACAGTACGATGACGTCACCACCGAGGGATACTCAGAGCCTATCGATTACAACGACAAGAATTTGTGCGGCGTGCGGCCGCTGATGAAGTCCGGTCGCATCGTGGGCGGCAAGAACGCCAGGTTCGGGGAGTGGCCCTGGCAGGTGCTGGTGCGCGAGTCCACGTGGCTGGGCCTGTTCACCAAGAACAAGTGTGGCGGAGTGCTCATCACCAACAGATTTGTGACCACGGCGGCGCATTGTCAACCCGGGTTCCTGGCGTCGCTGGTGGCGGTGTTCGGCGAGAACGACATCTCCAGCGACTACGAGCCCAAGAGACCCGTCACCAAGAACGTGAGGAGAGTCATCGTCCACCGCCAGTACGACGCCGCCACCTTCGAGAACGACCTGGCGCTGCTGGAGCTCGACTCGCCCGTACAGTTCGCCGCGCATATAGTTCCTATCTGCATGCCGCCTGATGACGCGGACTACACGGGCCGCGTGGCGACCGTCACCGGCTGGGGCAGGCTCCGGTACGGAGGCGGAGTCCCCGCGGTGCTGCAGGAGGTTCAGGTGCCGGTCATAGAGAACAGCGCGTGTCAGGAGATGTTCCACACGGCCGGTCACGCCAAGAAGATATTGAACTCGTTCATATGCGCTGGATACGCCAACGGGCAGAAGGACTCCTGTGAGGCGAGAGGTGACAGCGGCGGGCCGCTGGTGCTGCAGCGCGACGACGGCAGGTGGCAGCTGGTGGGGACCGTGTCCCACGGGATAAAGTGCGCCGCGCCCTACCTGCCCGGCGTCTACATGAGGACGACGTACTACAAACCCTGGCTGAGATCGATCACCGGAGTTCGTTGA

Protein sequence:

>DPOGS206224-PA
MTSLLVISLEFQEDRKLFGGYKITPSYCKASRAARYNRGNTICMFNHECVRRGGEVVGSCMDGFLFGACCQLPSSSQSHIPKGPGVVMTSYIDYPDAEAETDDYDAEHLSAYHNSFKPVVTPGYKPSSSVSTSTVRSESDASTEIHQAEIISEGLTQITNTLLSTPPKEDSFIYIKPQGVYTHSTISHPVADTILFHKNGSMVDDIARPSDFNVQISSMQTKPTVSPSTSSVSSPGIIVSSTHRPIFKPKPNNKVSTKRPTTDNYVMVQTVTKDAQKVPELSSINSIIQMLNDSTPSLSDDVSSPSSIDVMETKSSPSPSTVTPVLYSSSYPIFTTGHYVTLKPSSFISSVSPIAGTKKPLTTKKPYITINTTPNSAGGKPSKPYNSSPRPNQSTSQAIEAFNNYPTDPQDFGQSITTFSYVSSTTTLKPTSTTRKPPSTSYVTGSKPLRRPATPPTSFVSSYEAASDTFSSVTPTVIVLNGLSTKPESSSEDTEFVEISQEPFKKPVSQITVNNHIESTNNIYMGKPPQTYDQPKPSRPSSPTVVITPKPSPTTPYPIKGSTRPVPITPNVPLYDSYPDFSPTTTSKTEMQTSPDDLINFPPVRNPLLNATGSNPALYNTSVAIDNDLDILHDVDFSTPTWQDDEKLGEKMNLFVNKIVGSLQGSFQDLHDIVVLDKKPSSTLNRDKTTTAKPPKKTVPTRKPVTTKKPLRLSTTSKKPPVKTTKKPLKTTTVPKKPTTITTQTPTTTVITTTTTKKPVTTTKKPIKRVTTSLVTTVTEQYDDVTTEGYSEPIDYNDKNLCGVRPLMKSGRIVGGKNARFGEWPWQVLVRESTWLGLFTKNKCGGVLITNRFVTTAAHCQPGFLASLVAVFGENDISSDYEPKRPVTKNVRRVIVHRQYDAATFENDLALLELDSPVQFAAHIVPICMPPDDADYTGRVATVTGWGRLRYGGGVPAVLQEVQVPVIENSACQEMFHTAGHAKKILNSFICAGYANGQKDSCEARGDSGGPLVLQRDDGRWQLVGTVSHGIKCAAPYLPGVYMRTTYYKPWLRSITGVR-