Monarch geneset OGS2.0

DPOGS211903
Transcript: DPOGS211903-TA, 1722 bp
Protein: DPOGS211903-PA, 573 aa
Genomic position: DPSCF300011 - 145781-151864
RNAseq coverage: 595x (Rank: top 21%)
Annotation
  Heliconius        HMEL017720         0.0      71.22%
  Bombyx            BGIBMGA001098-TA   7e-180   65.62%
  Drosophila        CG8646-PB          9e-175   54.40%
  EBI UniRef50      UniRef50_Q8SZ72    1e-172   54.40%   CG8646 n=26 Tax=Pancrustacea RepID=Q8SZ72_DROME
  NCBI RefSeq       XP_624454.1        5e-174   55.14%   PREDICTED: similar to CG8646-PA [Apis mellifera]
  NCBI nr blastp    gi|270005303       0.0      60.19%   hypothetical protein TcasGA2_TC007349 [Tribolium castaneum]
  NCBI nr blastx    gi|270005303       0.0      59.33%   hypothetical protein TcasGA2_TC007349 [Tribolium castaneum]
Group
Gene Ontology
  GO:0008152   2.1e-118   metabolic process
  GO:0003824   2.1e-118   catalytic activity
  GO:0008484   2.2e-69    sulfuric ester hydrolase activity
KEGG pathway
  dme:Dmel_CG8646   7e-173
  K01135 (ARSB) maps to: Lysosome; Glycosaminoglycan degradation
InterPro domain
  [57-564]   IPR017850   2.1e-118   Alkaline-phosphatase-like, core domain
  [57-418]   IPR017849   7.8e-101   Alkaline phosphatase-like, alpha/beta/alpha
  [59-411]   IPR000917   2.2e-69    Sulfatase
Orthology group
  MCL12070   Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS211903-TA
ATGCGGCTTGAGACTCGATGTACCAATCACATGAATAATTCATGTGACAGCTCCGTTGAGTGTCCGCGGCTGACAGCTGAGGAACTTTTTGAATTTGAATCTCCCAGCAGTATGTTGTTGGTCTTATTATTGTTTATTGTGACCAGTCTGTCTGATTGTGAGTGTCACGAAAGGCCTAATATTGTGTTAATAATAGCCGACGATTTAGGCTGGAACGATGTTGGATTCCACGGATCGAACCAAATACCGACCCCCAATATCGATATTATGGCCTGGTCTGGTGTATCGTTGCACAATTATTACGTGACGCCCATATGCACGCCGTCTAGAGCTGCGCTCATGACGGGGAAGTATCCGATACATACTGGTATGCAACACACTGTAATTTTCGCGGCTGAACCTCGAGGGTTGCCGCTCACTGAGAAAATTTTACCCCAATATTTAAAGGAGCTAGGTTATAAGACACATCTAGTGGGCAAGTGGCATCTCGGATCATACAAAAAGGAATACTTGCCGTTAAATAGGGGATTCGACAGCCATCTTGGATTTTGGAACGGAAAAATAGACATGTACGATCACACGAACCAGGAGAAAGGATATTGGGGATTTGATTTCAGGCGAGACTTCTCCACGGCCCACGACCTGTTCGGGCAGTACGCCACAGATGTCTACACTAACGAAGCTGTCAAGATAATAAAGTCCCACAACACGAGCTCCCCGCTGTTCCTGATGCTGTCTCACTCCGCGGTCCACACCGGCAACCCCTCCGAGCCGATCCGGGCTCCAGAAAAGCTATTCGTCAACTTCACACATATTCAGGATTTCCAACGGAGAAAATTTGCCGCCGTGCTCACGAAACTGGACGAGTCGGTCGGGGAAGTGGTCGCCGCGTTGAAGGCGAAGGGTGTGTTGAACGACAGTATCGTGGTGTTCACGACGGACAACGGCGGGGCCGCGGCCGGGTTCAACGACAACGCCGCCTCCAACTACCCTCTTAGAGGGGTAAAGAATACTCTGTGGGAAGGAGGCGTGCGCGGGGCGGGCTGGCTGTGGAGTCCCTTCATAGACAAGAGATCCCGAGTCGCCACACAGAGGATGCATCTAGTGGACTGGCTGCCGACCTTGCTCAGCGCGGCCGGCATGAACGTTAGTTCGATTAAACATATAGATGGCGTCGATCAGTGGTGCGCGCTGTCCCAGGACCTCCCGTCCGCCAGAGAGTCCTTAGTCCACAACATAGACGATGAGTCCGGCAGCGCTTCCATCACGTACAAACAGTGGAAGGTACATAAAGGCACCAACTACGGCGGGTCCTGGGACGGGTGGTACGGTCCGGCGGGGCGCGAGGGAGCGTACGACACCACACGATTACTAGCATCTAAGGCGGCCGGCGCCCTACTGGATATAGGGATGTTGCCGGATACGGAGCATATACTGAGACTGAGATCTGAAGCGACCGTGGAGTGTGGAGACCGCGAGGCGCTCCCGTGTCGACCGCTGGAGGCGCCGTGCCTCTTTAACATAGACGAAGACCCGTGCGAAACCAGGAACCTCGCCGACATACATCCAGATGTCTTACAAGTGATGTTGAAGGAGCTCGACAGGGTGAACCGCACCGCGGTCCCCCCGAACAACCAGCCGCTGACCCCCGGAGGTGACCCCAAGTATTGGGGCTACGTGATAACGAACTTCGGTGATTATATTAATAATGAAATAAAATAG

Protein sequence:

>DPOGS211903-PA
MRLETRCTNHMNNSCDSSVECPRLTAEELFEFESPSSMLLVLLLFIVTSLSDCECHERPNIVLIIADDLGWNDVGFHGSNQIPTPNIDIMAWSGVSLHNYYVTPICTPSRAALMTGKYPIHTGMQHTVIFAAEPRGLPLTEKILPQYLKELGYKTHLVGKWHLGSYKKEYLPLNRGFDSHLGFWNGKIDMYDHTNQEKGYWGFDFRRDFSTAHDLFGQYATDVYTNEAVKIIKSHNTSSPLFLMLSHSAVHTGNPSEPIRAPEKLFVNFTHIQDFQRRKFAAVLTKLDESVGEVVAALKAKGVLNDSIVVFTTDNGGAAAGFNDNAASNYPLRGVKNTLWEGGVRGAGWLWSPFIDKRSRVATQRMHLVDWLPTLLSAAGMNVSSIKHIDGVDQWCALSQDLPSARESLVHNIDDESGSASITYKQWKVHKGTNYGGSWDGWYGPAGREGAYDTTRLLASKAAGALLDIGMLPDTEHILRLRSEATVECGDREALPCRPLEAPCLFNIDEDPCETRNLADIHPDVLQVMLKELDRVNRTAVPPNNQPLTPGGDPKYWGYVITNFGDYINNEIK-
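
Consistency check:

The lengths in this record agree with each other: 1722 bp is 574 codons, which is 573 amino acids plus one stop codon. The Python sketch below shows one way to confirm that a transcript translates to its listed protein. It assumes the standard genetic code (NCBI translation table 1) and assumes the trailing "-" in the protein sequence marks the stop codon. The names translate and matches_record are hypothetical helpers written for this illustration and are not part of the geneset or any database tool.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons
# enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon (hypothetical helper)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def matches_record(cds: str, protein: str) -> bool:
    """Compare a translation with a protein that writes the stop as '-'.

    Assumption: the trailing '-' in the record's protein sequence stands
    for the stop codon.
    """
    return translate(cds).replace("*", "-") == protein
```

To apply it to this record, pass the full DPOGS211903-TA sequence as cds and the full DPOGS211903-PA sequence as protein to matches_record.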