Monarch geneset OGS2.0

DPOGS214246
Transcript: DPOGS214246-TA, 2982 bp
Protein: DPOGS214246-PA, 993 aa
Genomic position: DPSCF300014 + 1248224-1255101
RNAseq coverage: 727x (Rank: top 18%)
Annotation
Source | Hit | E-value | Identity | Description
Heliconius | HMEL005003 | 0.0 | 70.17% |
Bombyx | BGIBMGA005965-TA | 0.0 | 64.11% |
Drosophila | CG3107-PA | 4e-17 | 31.19% |
EBI UniRef50 | UniRef50_E2BTY2 | 0.0 | 40.68% | Uncharacterized protein C05D11.1 n=2 Tax=Neoptera RepID=E2BTY2_HARSA
NCBI RefSeq | XP_001653480.1 | 0.0 | 44.01% | hypothetical protein AaeL_AAEL008862 [Aedes aegypti]
NCBI nr blastp | gi|157119998 | 0.0 | 44.01% | hypothetical protein AaeL_AAEL008862 [Aedes aegypti]
NCBI nr blastx | gi|157119998 | 0.0 | 44.14% | hypothetical protein AaeL_AAEL008862 [Aedes aegypti]
Group
Gene Ontology:
GO:0046872 | 5.6e-31 | metal ion binding
GO:0003824 | 5.6e-31 | catalytic activity
GO:0006508 | 4.3e-08 | proteolysis
GO:0004222 | 4.3e-08 | metalloendopeptidase activity
GO:0008270 | 4.3e-08 | zinc ion binding
KEGG pathway 
InterPro domain:
[20-246] | IPR011249 | 5.6e-31 | Metalloenzyme, LuxS/M16 peptidase-like, metal-binding
[22-234] | IPR011237 | 1.2e-26 | Peptidase M16, core
[195-368] | IPR007863 | 4.3e-08 | Peptidase M16, C-terminal
[49-133] | IPR011765 | 5.3e-07 | Peptidase M16, N-terminal
Orthology group: MCL17359, Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214246-TA
ATGTCTCATTTTAAACTAATATCGTCAACAAAGGCTTCCGATGTGATACCTGTAAACAAATATTTGTCCGAAAAGACTGGCTTAACCGTAATTATAGCAAACGTTGAAGGACCTGTTGTAAAAGGATTTTTTTGCTTAGCAACGGAAGCTCACGATGATGACGGTTTGCCTCATACATTGGAACACTTGATCTTTTTGGGATCAGAGCGTTACCCTTACAAGGGTATTCTCGATCTTTTGGCGAACCGATGTATGGCTCACGGAACGAACGCGTGGACGGATGTAGACCACACTTGTTATACTATATACACTGCGGGAGATGCGGGTATGTTGACTCTGTTACCCATCTACCTGGACCATATACTGAGACCAACTCTTACGGATCAAGGATTTCTGACGGAGGTTCATCATGTTGATGGTGACGGAGATGACGCTGGTGTGGTGTACTGCGAGATGCAGGGTAGGGAGAATACAGCGGATAGTAAATGTGAGTTAAGAATGCTCCGTGCTATGTATCCCAATAATGGCTATTCTTCTGAAACTGGGGGTATCATGAAAAACCTGAGGGAGTCCACTGATAATACTAAAGTGCGAGATTTCCACAAGAAATTCTATAGAGCTGAAAACCTAACAATAATTCTAACAGGACAAATTGACGCCCAAGATGTTTTCAATGTTCTCACCACAGTTGAGGATGACATCATTGCTAAGCGGGAGAAGGAATCTCAGGAAGAGTGGGTGAAACCCTGGCAGACTATACCCCCACCACCAGCTTATGGAGAACTTATAGAGAAGTGGCCAGCGGATACCGAAGACTGTGGACAGGTATTGTTCGGTTGGCGTGGACCTCTGTTGATTCAGGTCGGTGCGTTGCACGAGTTGACTGCTTGTGCGGTGCTGCTGCGGTATCTATGCGACACGGCTGCGGCGCCGCTACAGCGTGCACTTGTCGAGAGAGAGGACGCGTTGGCTGGAGATGTATCATACAATCTCACAGAGAACATGGCCTCATTGATTAAGATAGAGCTGGATAATGTACCAGTTGATAAACTGACTCAAGCTAAAGAGGAGGCGCTGAAGAGTTTGAGAAGCGTCAGGTCCGGGGAGGAGGCTATCAATATGGACCGCATGAAGAGATTACTCAGGAAACAGTTGAGGGAATGTATGGCCAGCCTTGAATCTGAACCACATCATGCTGTGGCTTTTAGATGTATCGGAGATGCACTTTATTCCCAAAATGAAGACGATTTTATAAAACGGATGAATCCACAACAAACGATGCATGATCTACTAAAAGAGAGCAGTGAATTCTGGGTTGATTTGTTGAACAAGTACTTCAATGATGATCTGGTGGTCATAGTTGGATCACCTAGCATTGAGTTGCAAGCAAAAAGACTCCTTGAAAAAATCTTCTTGGTTAAGGAATTTATAGAACCTGATGTATATTATTTCCAGCGCCCTCCTCCCCCCGGGACCCTGGCGTCTGTTCCTGTACCGTCCTGTGACTTCAAGTGTCATTCCATCCGGTCATGGAGTTCCGGAGAAGACTGTCCATATCTCGACCTAAAACAAATGCCGCTGTTCACCAGACTGCACAGCCTAACAACTAATTTTGTATATGTTAGTGCTATGTGTGAAATATTATTATATACAAAATTACCTATCGGTTCTCATGCACTTGACAGTTATTGGCTCCCTCTACATATGAACGCCTTGGGCGAGTGCGGCGTCTGGCGCGGGGACACCTTGATACCCCATCAAGACGTTATATCGACAACGGAACAGCTCACTGTGTCTTTCCAGAAAGATATTGGTTTCGGCAGAAGCGGGAACTTCTCTGTGGGACAGTTCGGAAACTTTATCAATATTGACGTTAGGTGTGAACCGGCAGATTACGAGGAAGTTGTAAACCATCTCTATGAAGTTTTGTACTGTGCTGAAATTACTAAGGAGAGATTATTGGTGTTCGCCCAGAGACTGATTAATGAGGTTTCACAGAC
GAGAAGAAACGGGCACAAGATGGTTCACGATTTACTGAGAGATTCTCTATACAGTAAAGATAGTAATGTCCACTGGTGCACGGTGCTAAGACAACAGAAGTTCCTCAAGGAGCTCATGGAGCAGCTTAACGCTGGCGGGGACTCGGCCGACGGCGCCATCTCCGACGCCAAGAGGACGTTCAAAGACATCACAGAGAACGCCTGGCTCCACCTCGCCAGTGACTTCGACAGATACAAGCTGAGTGCTGCGCCTTGGAAGAGATTCGCCAGAGAAAATGAAATTGTACCGGCGGAACCTCGTCGCTACTTGGACAGTGAGCTGTTGAGCGAGTGTCGTGTTAAGGCGGTGGTTGGTGTTGGAGGTCTGGAGTCGTCGTTCGCGGCCCAGGCCAGCCCGGGGCCCGTCGGCTTCGATATCAAAAATAACGCCCCGCTCGCTGTAGCACTCAACTACTTCACACAACTTGAGGGTCCAATGTGGCGTCTGATCCGTGGCGGAGGTCTCTCGTACGGGTACAGCATGTGCGAGGCGTCCGCTGAAGGAAGAGTATTCTTCTCATTATATCGCGCCACCAACGCGGTCGCTGCTTACACTAAGGCCAAATCTATCGTTGAGGAGTATTTGTCTGATGGTAAATTCGATGAGGACTTGTTCGCATCAGCCAAGAGTGCGATGGTGTTCGAGACGGTGGAGGCGGAGAAGTGTCCCGCCGACGTCGTGAAGCAGTCGCTCCTGAACTACATGAGACAAGTCGGTGACGATTACAACAGGAAGCTGGTGTGTTCGCTGTCGTCGGTGTCCCCGGAACAGGCGGCGGCGGCGGCGGCTCGCTGGCTCCCGGGACTCTTCTGTCCCGAGAACGTCGCCCTGGCCCTCGTCTGCCACCCCGCCAAGGTAGCCGACATGCAGGCCGCCTTCCAAAAGATAAATATTCCGCTAGACGCATACGAGTCTATGGAGGCGTCCTACATCAACAACTAG

Protein sequence:

>DPOGS214246-PA
MSHFKLISSTKASDVIPVNKYLSEKTGLTVIIANVEGPVVKGFFCLATEAHDDDGLPHTLEHLIFLGSERYPYKGILDLLANRCMAHGTNAWTDVDHTCYTIYTAGDAGMLTLLPIYLDHILRPTLTDQGFLTEVHHVDGDGDDAGVVYCEMQGRENTADSKCELRMLRAMYPNNGYSSETGGIMKNLRESTDNTKVRDFHKKFYRAENLTIILTGQIDAQDVFNVLTTVEDDIIAKREKESQEEWVKPWQTIPPPPAYGELIEKWPADTEDCGQVLFGWRGPLLIQVGALHELTACAVLLRYLCDTAAAPLQRALVEREDALAGDVSYNLTENMASLIKIELDNVPVDKLTQAKEEALKSLRSVRSGEEAINMDRMKRLLRKQLRECMASLESEPHHAVAFRCIGDALYSQNEDDFIKRMNPQQTMHDLLKESSEFWVDLLNKYFNDDLVVIVGSPSIELQAKRLLEKIFLVKEFIEPDVYYFQRPPPPGTLASVPVPSCDFKCHSIRSWSSGEDCPYLDLKQMPLFTRLHSLTTNFVYVSAMCEILLYTKLPIGSHALDSYWLPLHMNALGECGVWRGDTLIPHQDVISTTEQLTVSFQKDIGFGRSGNFSVGQFGNFINIDVRCEPADYEEVVNHLYEVLYCAEITKERLLVFAQRLINEVSQTRRNGHKMVHDLLRDSLYSKDSNVHWCTVLRQQKFLKELMEQLNAGGDSADGAISDAKRTFKDITENAWLHLASDFDRYKLSAAPWKRFARENEIVPAEPRRYLDSELLSECRVKAVVGVGGLESSFAAQASPGPVGFDIKNNAPLAVALNYFTQLEGPMWRLIRGGGLSYGYSMCEASAEGRVFFSLYRATNAVAAYTKAKSIVEEYLSDGKFDEDLFASAKSAMVFETVEAEKCPADVVKQSLLNYMRQVGDDYNRKLVCSLSSVSPEQAAAAAARWLPGLFCPENVALALVCHPAKVADMQAAFQKINIPLDAYESMEASYINN-
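
The transcript and protein records above can be checked against each other. The sketch below is a minimal illustration, assuming the standard nuclear genetic code and assuming that the trailing "-" in the protein record marks the stop codon. The names `translate` and `expected_cds_length` are hypothetical helpers, not part of any database tooling. It translates the first and last few codons of DPOGS214246-TA and confirms that the stated lengths agree (993 aa plus one stop codon gives 2982 bp).

```python
# Standard genetic code, built in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES), AMINO
    )
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop_symbol` ("-" here, which is assumed
    to be the convention used in the protein record above).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


def expected_cds_length(protein_length: int) -> int:
    """Length in bp of a CDS encoding `protein_length` residues plus one stop codon."""
    return (protein_length + 1) * 3


# First eight codons of DPOGS214246-TA and the first eight residues of DPOGS214246-PA.
print(translate("ATGTCTCATTTTAAACTAATATCG"))  # MSHFKLIS
# Last five codons of the transcript, including the terminal TAG stop.
print(translate("TACATCAACAACTAG"))           # YINN-
# 993 aa plus a stop codon should account for the full 2982 bp transcript.
print(expected_cds_length(993))               # 2982
```

To check the whole record, the full nucleotide sequence can be joined into a single string (removing the line break) and passed to `translate`; the result should then match the protein sequence, including the trailing "-".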