Monarch geneset OGS2.0

DPOGS211291
Transcript: DPOGS211291-TA (3657 bp)
Protein: DPOGS211291-PA (1218 aa)
Genomic position: DPSCF300161 + 356606-370364
RNAseq coverage: 40x (Rank: top 73%)

Annotation
Heliconius: HMEL011873, E-value 0.0, 56.55%
Bombyx: BGIBMGA012511-TA, E-value 0.0, 59.58%
Drosophila: CG7402-PA, E-value 1e-103, 38.09%
EBI UniRef50: UniRef50_Q8MPH9, E-value 3e-154, 49.90%, Glucosinolate sulphatase n=3 Tax=Plutella xylostella RepID=Q8MPH9_PLUXY
NCBI RefSeq: XP_975218.2, E-value 4e-111, 42.11%, PREDICTED: similar to arylsulfatase b [Tribolium castaneum]
NCBI nr blastp: gi|22450123, E-value 1e-153, 49.90%, glucosinolate sulphatase [Plutella xylostella]
NCBI nr blastx: gi|22450123, E-value 7e-151, 50.00%, glucosinolate sulphatase [Plutella xylostella]
Group:
Gene Ontology:
  GO:0008152, E-value 8.8e-107, metabolic process
  GO:0003824, E-value 8.8e-107, catalytic activity
  GO:0008484, E-value 4.7e-65, sulfuric ester hydrolase activity
KEGG pathway: dme:Dmel_CG7402, E-value 1e-101
  K01135 (ARSB) maps to: Lysosome; Glycosaminoglycan degradation
InterPro domain:
  [699-1213] IPR017850, E-value 8.8e-107, Alkaline-phosphatase-like, core domain
  [574-589] IPR017849, E-value 6.8e-96, Alkaline phosphatase-like, alpha/beta/alpha
  [134-590] IPR000917, E-value 4.7e-65, Sulfatase
Orthology group: MCL22297, Lepidoptera specific
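
The genomic position above can be compared with the transcript length. The sketch below is illustrative only: it assumes the position string has the form "scaffold strand start-end" and that the coordinates are 1-based and inclusive, neither of which the record states. The names `Locus` and `parse_locus` are hypothetical helpers, not part of any database tooling.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    """A genomic interval on one scaffold (hypothetical helper type)."""
    scaffold: str
    strand: str
    start: int
    end: int

    @property
    def span(self) -> int:
        # Assumes 1-based, inclusive coordinates (not stated in the record).
        return self.end - self.start + 1


def parse_locus(text: str) -> Locus:
    """Parse a string such as 'DPSCF300161 + 356606-370364'."""
    match = re.fullmatch(r"(\S+)\s+([+-])\s+(\d+)-(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised position string: {text!r}")
    scaffold, strand, start, end = match.groups()
    return Locus(scaffold, strand, int(start), int(end))


locus = parse_locus("DPSCF300161 + 356606-370364")
transcript_length = 3657  # bp, from the Transcript field of this record

print(locus.scaffold, locus.strand, locus.span)
print("genomic span minus transcript length:", locus.span - transcript_length)
```

Under these assumptions the locus spans 13759 bp on the scaffold, about 10.1 kb more than the 3657 bp transcript, which would be consistent with a multi-exon gene model.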
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS211291-TA
ATGGTGGATGATATGGGATGGAATGATGTTTCGTATCATGGATCTAATCAAATATTGACGCCTAACTTGGATGTATTGGCCAGTAGGGGTGTGATACTTCAACAGTACTACAGCGAGGCGATATGCACTCCGGCACGTACAGCGCTACTCACCGGCAAATACCCCATGCGACTAGGAATGCACGGAATGCCGCTATATAATTCAGAAGATCGAGGCATACCGGTGTCAGAGCGACTGCTGCCTTCATACTTAAAGGAAAGAGGTTACAAAACTCATTTGGTCGGTAAGTGGCATGTGGGTATGTCACGAAATCAGTTCCTACCAACCAGAAGGGGATATGACAGCCATTACGGAATGCTCGGCGGATTTGTAGATTATTACACGTACAATAAAGTTGAACTGGGATGGAATGATGTTTCGTATCATGGATCTAATCAAATATTGACGCCTAACTTGGATGTATTGGCCAGTAGGGGTGTGATACTTCAACAGTACTACAGCGAGGCGATATGCACTCCGGCACGTACAGCGCTACTCACCGGCAAATACCCCATGCGACTAGGAATGCACGGAATGCCGCTATATAATTCAGAAGATCGAGGCATACCGGTGTCAGAGCGACTGCTGCCTTCATACTTAAAGGAAAGAGGTTACAAAACTCATTTGGTCGGTAAGTGGCATGTGGGTATGTCACGAAATCAGTTCCTACCAACCAGAAGGGGATATGACAGCCATTACGGAATGCTCGGCGGATTTGTAGATTATTACACGTACAATAAAGTTGAACTGTTGCCTAATGGGAAAGAGTTCTATGGAGCTGACCTTACGGATAACGATATCCCACAAGACGACGAGGACCGATATATTGTAGACGCACTAACTGAAAGGGCTATAGACATCATACAAAATCACAATGACTCAAGTCCAATGTTCCTTCACTTGGCGCATAACGTACCACATGCGGGCAACGATGGAGGGCTCCTCCAGCCTCCAAATGTACCACTGTCCAAGAGAAATCAACACATTGCTCATTCTAATAGAAGACTCTATGCAGAAATGGTTACTCATTTGGACCTCAGTGTTGGAAAAGTTGTAAAGGCTTTAGCAGATAACGGAATGTTGCAAAACACTATCATCATATTCGCGTCTGATAACGGAGCGCCGACTGTGGGTATGTTTAATAACTGGGGAGTGAATTTACCTTTTCGAGGGAAGAAGCAAACTCCTTGGGAAGGGGGCGTTAGAGTCCCGGCCTTTATATGGCATCCTTCATTAAGACCGAAAGTTTGGGATGGTCTGATGCACGTTACCGATTGGCTCCCCACTCTCGTGGGGGCTGTTGGGGGTGAAGTGAATGTCCAGATTGACGGTGTCAACCAGTGGGATTCTATATCAAAAGATGCAAAACCTAAAAGAAAAGAAGTATTGATTGCTATTGAAGACAGTGATACCAATATTTACGCCGCTTTTAGAGCTGGTGATTATAAGATCGTTGTTGGAAATGTGACCGGCTTAAGCAACGGTTACTATGGGGCTGACTTCATGACCTATAGAGCGTGCCCACCTGATTATTTCACTACTCTCAAGTCTTCAGAGGTAGCTAAGGTTTTCGAATCATTTAATATGAAATTGGACTACGACGAAGTGTTGGCTATGCGAGAAGCAAGTATTATCAAACAAACAGACCCAGTACGAGACCTCATTCCGTGTGAGCCTAGTCCTGAACGTGGTTGTCTATACAATGTCAAACGGGATCCGTCGGAGAGCCACGACTTATGGAGCAGAGGAACTAAGATAACAGATTTACTGTGGAGTAGATTGAAGACCTTATGGTCAATGCAATTAAGAAGAGGTCCAGTAACGATAGACCCTCGGGCCGATCCAGCAAATTTCGGTTACAGATGGATGCCGTGGCTTAATGACAGTTTGCCAGCCAATACCTTGAATAACACTAATTCATCCAAAAATGAAATAGCTTCAAATTTTAGTGAAAAATATTA
TATAGTGCCCTATAGTGACGGTTCTTCAGATGGAAAGACAGTAACAACAACAGTTAACTGTCAAGATGTCAAAGTGAATATTATTAAAGGAGGTGCAACTCGTCCAAACATCGTCTTTTTTATTGCTGATGACATGGGGTGGGACGACGTGAGCTTTCACGGGTCGGATCAAATTTTGACACCGAATATAGACCTGCTCGCCTATACCGGCGTCGCTTTAGAGAGATATTACAGTCATTGCATATGCACGCCATCGCGCGCCTCTCTCCTCACTGGGAAATTCGCACATGTCATAGGTATGCAGGGCTACCCATTGACAAATGCGGAAGATAGAGCACTACCTCTTGGAGAGAAAATTCTACCCCAATATTTAAAGGATCTCGGTTATGCCACACATTTGGTTGGAAAATGGCACGTTGGACAAGCAAGAGCCGAACATTTGCCCACATTCCGAGGTTTTGACACGCATTTCGGTCACAGGGGTGGCTATATAGATTACTACGAATACACGTTATTGGAAAACTGGGATGAAGGGGACGTTTCTGGATTTGATCTTTTCCGAAATATGACGGCTGCTTGGGAAGTTGAAGGATATATAACAGATGTTTATAATGAAGAAGCTAAATCAATTATAAAGGCACATGACGTCTCAAGGCCATTATTCCTTATGGTTGCACATAACGCTCCTCACTCCGCAAACGAAGGTGCTTTCTTGCAGGCACCGTCGGACGAGGTTCGAGCGATGCGGCATATTGAATTGCCACAAAGAAGATATTATGCTGCTATGGTAAAAAAACTTGATGACAGCATTGGAGACATCGTTAAAACCCTTTCCGAGAAGGGCATATTAGATAACACTATAATAGTATTCGTATCTGATAATGGTGGTATAACGTCACAGATGTCCGCTAATTATGCCTCCAATTATCCCCTGAGGGGACTTAAAATGAGTCCATTTGAAGGGGGTATCAGGGTAAACGGGCTGATATGGAGTAAAAATTTAACACAAAGTAACCATTTGTGGAAAGGCTACATGCATGTTTCTGATTGGCTGCCGACACTTTTGAAGGCTGTGGGAGCAGAATCGGCTAAGGAAATTGATGGTTTTGATTTATGGGATAATATAGTAACCAATACCATATCGAAAAGAGAGATGATTGTGGAAATTAATGATTATACTGGTTTTTACTCCATAACTCATAATGATTTTAAACTAGTAGTTGGTTCAGTATTAACTAGTTATAGTGATCATCAAGGGAAACAATTTAGGGGCATTATTGGTAAACCACCCTCATATGAAGATGCTATCAAGAAAAGCAAAATTTATTCCGTACTTTCGGATAATGGGATAAATTTTGGATTTAACGAGACAGCACTTAGAAATAAAATTAAAATTAAATGTAATGATTTGAAACCCAATCAAGAAATATGTTTTCCTTCAAAAGAGAAATGGTGTTTATTTAATATCAAAGAAGATCCTTGCGAAATAGTGGATTTAATGGACACTCACAGTGATGTTGCCAAAGAACTGCATACGAAATTGGAAAGAGAGATAGCCAGAACAATACCACGTACGATCCCTCATGAAACAAATCTAAAAGCTATGCCCAAATTCCACAATTATACTTGGGATATTTGGAAAACTTCGGATGAATAA

Protein sequence:

>DPOGS211291-PA
MVDDMGWNDVSYHGSNQILTPNLDVLASRGVILQQYYSEAICTPARTALLTGKYPMRLGMHGMPLYNSEDRGIPVSERLLPSYLKERGYKTHLVGKWHVGMSRNQFLPTRRGYDSHYGMLGGFVDYYTYNKVELGWNDVSYHGSNQILTPNLDVLASRGVILQQYYSEAICTPARTALLTGKYPMRLGMHGMPLYNSEDRGIPVSERLLPSYLKERGYKTHLVGKWHVGMSRNQFLPTRRGYDSHYGMLGGFVDYYTYNKVELLPNGKEFYGADLTDNDIPQDDEDRYIVDALTERAIDIIQNHNDSSPMFLHLAHNVPHAGNDGGLLQPPNVPLSKRNQHIAHSNRRLYAEMVTHLDLSVGKVVKALADNGMLQNTIIIFASDNGAPTVGMFNNWGVNLPFRGKKQTPWEGGVRVPAFIWHPSLRPKVWDGLMHVTDWLPTLVGAVGGEVNVQIDGVNQWDSISKDAKPKRKEVLIAIEDSDTNIYAAFRAGDYKIVVGNVTGLSNGYYGADFMTYRACPPDYFTTLKSSEVAKVFESFNMKLDYDEVLAMREASIIKQTDPVRDLIPCEPSPERGCLYNVKRDPSESHDLWSRGTKITDLLWSRLKTLWSMQLRRGPVTIDPRADPANFGYRWMPWLNDSLPANTLNNTNSSKNEIASNFSEKYYIVPYSDGSSDGKTVTTTVNCQDVKVNIIKGGATRPNIVFFIADDMGWDDVSFHGSDQILTPNIDLLAYTGVALERYYSHCICTPSRASLLTGKFAHVIGMQGYPLTNAEDRALPLGEKILPQYLKDLGYATHLVGKWHVGQARAEHLPTFRGFDTHFGHRGGYIDYYEYTLLENWDEGDVSGFDLFRNMTAAWEVEGYITDVYNEEAKSIIKAHDVSRPLFLMVAHNAPHSANEGAFLQAPSDEVRAMRHIELPQRRYYAAMVKKLDDSIGDIVKTLSEKGILDNTIIVFVSDNGGITSQMSANYASNYPLRGLKMSPFEGGIRVNGLIWSKNLTQSNHLWKGYMHVSDWLPTLLKAVGAESAKEIDGFDLWDNIVTNTISKREMIVEINDYTGFYSITHNDFKLVVGSVLTSYSDHQGKQFRGIIGKPPSYEDAIKKSKIYSVLSDNGINFGFNETALRNKIKIKCNDLKPNQEICFPSKEKWCLFNIKEDPCEIVDLMDTHSDVAKELHTKLEREIARTIPRTIPHETNLKAMPKFHNYTWDIWKTSDE-
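
The two records can be cross-checked by translating the transcript and comparing the result with the protein. The sketch below assumes that DPOGS211291-TA is a pure coding sequence in frame 1, that the standard nuclear genetic code applies, and that the trailing "-" in the protein record marks the stop codon. The function `translate` is a hypothetical helper written for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence in frame 1; unknown codons become 'X'."""
    cds = "".join(cds.split()).upper()  # join wrapped FASTA lines
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First and last codons of the transcript shown above.
print(translate("ATGGTGGATGATATGGGATGG"))  # start of DPOGS211291-TA
print(translate("TCGGATGAATAA"))           # end of DPOGS211291-TA

# Length check: 3657 bp is 1219 codons, i.e. 1218 residues plus one stop.
print(3657 // 3 - 1)
```

The first fragment translates to MVDDMGW and the last to SDE-, matching the start and end of DPOGS211291-PA, and the codon count agrees with the stated 1218 aa. Passing the full nucleotide record to `translate` extends the same comparison to the whole sequence.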