Monarch geneset OGS2.0

DPOGS206129
Transcript: DPOGS206129-TA | 3102 bp
Protein: DPOGS206129-PA | 1033 aa
Genomic position: DPSCF300028 + 1041061-1051589
RNAseq coverage: 303x (Rank: top 37%)
Annotation
Heliconius: HMEL002824 | 0.0 | 69.87%
Bombyx: BGIBMGA000719-TA | 0.0 | 82.88%
Drosophila: S1P-PA | 0.0 | 47.85%
EBI UniRef50: UniRef50_Q14703 | 0.0 | 55.48% | Membrane-bound transcription factor site-1 protease n=81 Tax=Coelomata RepID=MBTP1_HUMAN
NCBI RefSeq: XP_001812491.1 | 0.0 | 57.99% | PREDICTED: similar to membrane-bound transcription factor protease, site 1 [Tribolium castaneum]
NCBI nr blastp: gi|270003563 | 0.0 | 58.20% | hypothetical protein TcasGA2_TC002816 [Tribolium castaneum]
NCBI nr blastx: gi|270003563 | 0.0 | 58.31% | hypothetical protein TcasGA2_TC002816 [Tribolium castaneum]
Group:
Gene Ontology: GO:0004252 | 2.8e-69 | serine-type endopeptidase activity
Gene Ontology: GO:0006508 | 2.8e-69 | proteolysis
KEGG pathway: tca:100142346 | 0.0
KEGG pathway: K08653 (MBTPS1) maps-> Protein processing in endoplasmic reticulum
InterPro domain: [10-742] IPR015500 | 0 | Peptidase S8, subtilisin-related
InterPro domain: [162-466] IPR000209 | 2.8e-69 | Peptidase S8/S53, subtilisin/kexin/sedolisin
Orthology group: MCL14222 | Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS206129-TA
ATGGGGCTCGTTCAACTTGTTTATTTGTTTTGGTTAAGTTATTATAATTTTGTGGTTTTTGCTGAGGATACCAATATCCTTTGTAATGTGACGGTTAACGAGCGTTTGGAATATAAATTTGATTCAGATATTGTCAACACTGAACATATAATTACATTCAAAGGATATTATTCCAAAACTACCAGAGAAAACTATGTGAATGCTGCACTGAAAAATGCCCAGGTATCAAATTGGACCATACTCCAGCGTAATAATCCCGCTATGGAATATCCTAGTGACTTCGACGTCATAGTGTTCGGGGAGAAGATAAGGGAGGGGATCGATGCTTTACGTGACCACCCAGCTGTACGCCGGGTAACTGCGCAGCGGCAGGTGCAACGGACCATAAAATACGTGCGCGAGGATGACTGTGGGCCGTCTGGTTGCATGTACTCCGGATGGAGGAACCACCGCCGTTCGAGGGTGCTTCATTCATTACGTAAAACTAGAGAAAATGGAGGCTACACCTCTAGAAAACTTCTCCGTACTGTACCTCGTCAAATAACATCTGTTCTGAAAGCTGATCTGCTGTGGTCTTTGGGAGTAACCGGGGAGGGCATCAAAGTGGCGGTGTTCGATACGGGACTAGCGCGACACCATCCCCACTTCGGGCGGGTTAGGGAGCGTACAGACTGGACCGGCGAGAATACATTGGACGATGCCTTAGGTCACGGCACCTTCGTAGCTGGTGTGATAGCGTCTCGTTCGGACTGCCTCGGCTTCGCTCCGGACGCGGACCTACACATCTTCAGAGTTTTCACAGATAATCAGGTGTCATACACTTCGTGGTTCCTGGACGCATTTAACTACGCCATAATGCGTAAGATAGATGTCCTGAACCTCAGTATTGGTGGTCCAGATTTTATGGACCATCCGTTTGTGGATAAAGTATGGGAACTTAGCGCTAACAAGGTTATAATGGTCTCTGCTATCGGCAATGACGGCCCATTATACGGGACCCTGAACAATCCAGCTGATCAGATGGATGTCATCGGAGTGGGAGGCATCGGGTTTGATGATCGCATCGCCAAGTTCTCGTCGAGAGGCATGACGACCTGGGAATTACCTTATGGCTACGGTAGAATGAAACCAGACATCGTGACCTATGGCAGCGGCGTCCGTGGTTCAAGCGTTAATGGCGGCTGCAGATCACTCAGTGGTACGTCTGTAGCTTCCCCAGTGGTCGCTGGTGCTATAGCACTCCTCGCTAGTGGTGTTCCCCGTCAGAATTTAACACCAGCTGCTGTCAAGCAAGCTTTGTGCATAACAGCACGCCGTTTGCCCGGTTATAATATGTTTGAACAGGGACACGGGAAACTAGACCTTATTAGCGCGTACCAGTTTCTTCGCGAGTACGAGCCGCAAGCGACTTTGAGCCCATCATACATTGACCTCACCGAGTGTCAGTACATGTGGCCGTATTGCACTCAGCCGCTCTACTATAGCGCTCAACCCACCATCGCCAACGTCACCGTTATCAATGGGCTCGGCGTGGTGGGTGAAGTGAAAAAGGTCAGCTGGCATCCTCATTTGCCTCACGGTACAATACTGGCTGTTGGGGCGGACTACAACGAAGTGCTTTGGCCTTGGTCCGGATGGTTGGCACTCAGCTTCACAGTTTTGGAAGCGGGCGCTAACTTCGACGGCGTCGTTGAAGGTCACATGAACATTACGATTGAGAGTTACGACGAGGTCAATGACCGTGTCATGAAAAATACGACTCTCATGCTTCCAATACGTGCTCGCGTTATCCCGGTGCCAGTACGCGGTCGTCGTCTGTTGTGGGACCAGTTCCATAGTCTCCGGTACCCTGGCGGTTACTTCCCGAGGGATGATCTTCGTGCCAAACACGATCCACTCGATTGGCACGCCGACCACGTGCACACCAATTTTAGAGACATGTATAGAAGATTAAGGGAGCATGGATTTTATGTCGAGGTTATGGGTAATCCCCTAACTTG
TATCGACACTTCGTTGTATGGAGCGTTGCTGCTCGTTGATCCCGAGGACGAATACTTCCCCGAAGAAATGGCGACTTTGAAGAGGGCTGTAGACTCCGGTCTTTCACTGATTGTTTTTGCGGACTGGTACAATGCTTCCCTGTTGAGACACGTCAAATTCTATGATGAAAATACACGACAATGGTGGATTCCTGAAACTGGTGGTACAAACGTTCCGGCGCTGAACGACCTACTAAGCATGTTTCAAGTAGCGTTTGGTGATCGCGTGTTTGAGGGGTCGTTCAAGTTGGCTGGCCATCCAATGTACTACGCTAGCGGCACACACATACATAGCTTTCCAGAACATGGTGTCTTGGTGTCAGCGAAGCTATCGGATCAGGGGCAGCAGATAATGTCAGGCGAAAAGTCTGGAGGGGGTCAGACTCGTAAGACGGTGGAAGTGCCGATATTGGGATTGCTGCAGACTGACCCTGAAACGCGTGACTACACCAATGACACTAATGATAAACTACCCAAGGCTGGGCGATTGGTTGTTTACGGCGACTCCTCCTGTCTGGAAGGAGGAGCGGCCAGACCTTGTCACTGGTTACTTCTGGCAGCTCTGCAATACGCATTGGTCGGACATATGCCGTCATCGCTCTTGGACGCAACGACATCTACACAACACAGAGACGTTAACATAATACCATCAGATCTCCCGAAGCGTGCTGAAGGTGGTCGTCTCCACGCGTACTCTCGGGTTCTGTCACCAGATGGCAGCGGTCCGAGACCATTGCCCGATTGCGTGGTGACAAACCCCATGGACCCTGAACCCGTACATGCACCACCATCCGCTAGGACCCTTGCACCAAGACACAAACCCACCGACCCCAAGAGCATTGGCGCACCGGAAATCGAAGGCACGGAAGCAGCACCCCGAGCGTGGCGTGGAGCTGGAGTCGCAGCAGCTCGCAGCGTCGAGGCCGATCCCATCCAGACATCATTCATCAGTCGACTCATATCAATATGCTCCGTGTTCGTGATAATATATTGCATTGCTGTATTCTGGAAACGATGTGCCCGTATTATCAAGAGACGCAGACTTGTCTCACTGGCCACCTAG

Protein sequence:

>DPOGS206129-PA
MGLVQLVYLFWLSYYNFVVFAEDTNILCNVTVNERLEYKFDSDIVNTEHIITFKGYYSKTTRENYVNAALKNAQVSNWTILQRNNPAMEYPSDFDVIVFGEKIREGIDALRDHPAVRRVTAQRQVQRTIKYVREDDCGPSGCMYSGWRNHRRSRVLHSLRKTRENGGYTSRKLLRTVPRQITSVLKADLLWSLGVTGEGIKVAVFDTGLARHHPHFGRVRERTDWTGENTLDDALGHGTFVAGVIASRSDCLGFAPDADLHIFRVFTDNQVSYTSWFLDAFNYAIMRKIDVLNLSIGGPDFMDHPFVDKVWELSANKVIMVSAIGNDGPLYGTLNNPADQMDVIGVGGIGFDDRIAKFSSRGMTTWELPYGYGRMKPDIVTYGSGVRGSSVNGGCRSLSGTSVASPVVAGAIALLASGVPRQNLTPAAVKQALCITARRLPGYNMFEQGHGKLDLISAYQFLREYEPQATLSPSYIDLTECQYMWPYCTQPLYYSAQPTIANVTVINGLGVVGEVKKVSWHPHLPHGTILAVGADYNEVLWPWSGWLALSFTVLEAGANFDGVVEGHMNITIESYDEVNDRVMKNTTLMLPIRARVIPVPVRGRRLLWDQFHSLRYPGGYFPRDDLRAKHDPLDWHADHVHTNFRDMYRRLREHGFYVEVMGNPLTCIDTSLYGALLLVDPEDEYFPEEMATLKRAVDSGLSLIVFADWYNASLLRHVKFYDENTRQWWIPETGGTNVPALNDLLSMFQVAFGDRVFEGSFKLAGHPMYYASGTHIHSFPEHGVLVSAKLSDQGQQIMSGEKSGGGQTRKTVEVPILGLLQTDPETRDYTNDTNDKLPKAGRLVVYGDSSCLEGGAARPCHWLLLAALQYALVGHMPSSLLDATTSTQHRDVNIIPSDLPKRAEGGRLHAYSRVLSPDGSGPRPLPDCVVTNPMDPEPVHAPPSARTLAPRHKPTDPKSIGAPEIEGTEAAPRAWRGAGVAAARSVEADPIQTSFISRLISICSVFVIIYCIAVFWKRCARIIKRRRLVSLAT-