Monarch geneset OGS2.0

DPOGS209387
Transcript: DPOGS209387-TA (1554 bp)
Protein: DPOGS209387-PA (517 aa)
Genomic position: DPSCF300118 + 280114-283614
RNAseq coverage: 312x (Rank: top 36%)
Annotation
Heliconius: HMEL013108  4e-145  74.01%
Bombyx: BGIBMGA005529-TA  0.0  73.35%
Drosophila: CG14291-PA  4e-163  60.55%
EBI UniRef50: UniRef50_P51688  2e-164  58.27%  N-sulphoglucosamine sulphohydrolase n=43 Tax=Eumetazoa RepID=SPHM_HUMAN
NCBI RefSeq: XP_001605082.1  0.0  63.37%  PREDICTED: similar to ENSANGP00000024797 [Nasonia vitripennis]
NCBI nr blastp: gi|156554407  0.0  63.37%  PREDICTED: N-sulphoglucosamine sulphohydrolase-like [Nasonia vitripennis]
NCBI nr blastx: gi|156554407  0.0  63.37%  PREDICTED: N-sulphoglucosamine sulphohydrolase-like [Nasonia vitripennis]
Group
Gene Ontology: GO:0008152  1.2e-93  metabolic process
               GO:0003824  1.2e-93  catalytic activity
               GO:0008484  5.9e-61  sulfuric ester hydrolase activity
KEGG pathway: tca:659261  0.0
              K01565 (SGSH) maps-> Lysosome
                                   Glycosaminoglycan degradation
InterPro domain: [25-496]  IPR017850  1.2e-93  Alkaline-phosphatase-like, core domain
                 [25-358]  IPR017849  1e-82  Alkaline phosphatase-like, alpha/beta/alpha
                 [27-456]  IPR000917  5.9e-61  Sulfatase
Orthology group: MCL13553  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209387-TA
ATGGCTCGACACCCGGTCGCTGTTATCACACTGTTATTGTGTCTCATTATCACGAACACTGTGTTATCAAACAAGAATCGCAACGTTCTCATACTCTTAGCTGACGATGGAGGTTTTGAAATCGGAGCGTATAGAAACAAAATTTGCCAAACTCCGAACATCGACGAGTTTGCGAAGCGCAGCGTCATCTTCAACAATGCCTTCAGTTCCGTCAGTAGCTGCTCTCCGAGTCGCGCGGCTCTGCTGACCGGCACCCCGAGTCATCAGAACGGCATGTACGGGCTCCATCACGGCGTTCATCACTTCAACTCCTTCGATAACGTCACCAGCTTACCGAACATACTGCGCGAGCACGGAGTCTACACTGGTATAATCGGTAAGAAGCACGTGGGTCCGAGCAGCGTGTACAAGTTCGACATGGAGTGGACGGAGGAGGGACACAGCATCAACCAGGTCGGAAGGAACATCACGCACATGAAGCTGCTGGCCAGGAAGTTCCTGCGGGAGGCGAACAGACTCGACAAACCGTTCCTGCTGTACGTCGGGTTCCACGACCCTCACCGCTGCGGCCACGAGGCTCCTCAGTACGGCCCCTTCTGTGAGAGGTTCGGTTCCTCGGAGGAGGGGATGGGGGTCATCCCCGACTGGAAGCCCTGGTACTATCAGTGGGACGAGGTGCAGCTGCCTTATCATGTCCAGGACACTGAGGCGGCCCGGAGAGACATCGCGGCGCAGTACACCACTATGTCCAGATTAGATCAAGGTGTGGGGCTCATGCTGAAGGAGCTGGAGGCGGCGGGCCACGGACATGACACGCTGGTCATATACACCTCCGACAACGGGATTCCCTTCCCTTCGGGGAGGACCAACTTCTACGACCCCGGGCTGAGGGAACCTCTGATCATGCACTCGCCAGACCCTGGAGCTCGCAGAAACGAAGCCTCCGGCGCACTGGTGAGTCTGCTGGACCTCACGCCCACTGTCCTCGACTGGTTCGGCGTCCGCACACCGCGACACATCGACCACGAGTGGCGCGACCGGCCCAGGAGTCTGCTGCCCATATTGAACAAAGAGCCGCCGCCGAGTGAGCAAGACGCCGTGTTCTCGTCCCAGACGCACCACGAGATCACCATGTACTACCCCATGAGGTCGGTCCGCACCCGTCGGTACAAACTCATCCACAACCTCAACTTCGGGATGCCCTTCCCCATCGACCAGGACCTGTACGTGTCGCCGACCTTCCAGGATATATTGAACCGCACTCGAAGCAAGCAGCCTCTGCCGTGGTACAAGACTCTGAAGCAGTACTACTACCGACCGCAGTGGGAGATGTACGACCTCAAGAACGACCCCTTGGAGACCCACAACCTGCACGGTAAGCCGTCGCTGTCCGAGGTGGAGGCTTCTCTCAGGGAGCGGCTTCACTCGTGGCAGCTCGCTACCGGCGACCCCTGGCTCTGCTCGCCGGCCGCGGTCCGGGACCCGCGACCCGGGGCAAACTCCGCCGTCTGCGACGCGCTCGACAACGGCCTCACACACTACATGCACACCTAG

Protein sequence:

>DPOGS209387-PA
MARHPVAVITLLLCLIITNTVLSNKNRNVLILLADDGGFEIGAYRNKICQTPNIDEFAKRSVIFNNAFSSVSSCSPSRAALLTGTPSHQNGMYGLHHGVHHFNSFDNVTSLPNILREHGVYTGIIGKKHVGPSSVYKFDMEWTEEGHSINQVGRNITHMKLLARKFLREANRLDKPFLLYVGFHDPHRCGHEAPQYGPFCERFGSSEEGMGVIPDWKPWYYQWDEVQLPYHVQDTEAARRDIAAQYTTMSRLDQGVGLMLKELEAAGHGHDTLVIYTSDNGIPFPSGRTNFYDPGLREPLIMHSPDPGARRNEASGALVSLLDLTPTVLDWFGVRTPRHIDHEWRDRPRSLLPILNKEPPPSEQDAVFSSQTHHEITMYYPMRSVRTRRYKLIHNLNFGMPFPIDQDLYVSPTFQDILNRTRSKQPLPWYKTLKQYYYRPQWEMYDLKNDPLETHNLHGKPSLSEVEASLRERLHSWQLATGDPWLCSPAAVRDPRPGANSAVCDALDNGLTHYMHT-
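
Consistency check:

The record lists a 1554 bp transcript and a 517 aa protein, and the protein sequence ends in "-", which here appears to mark the stop codon (TAG in the nucleotide sequence). A minimal Python sketch of how those figures relate is given below. The function names and the partial codon table are illustrative assumptions, not part of the database; the table covers only the codons needed for the first six residues and the stop.

```python
# Hypothetical helpers for sanity-checking a CDS record; not a database API.

def expected_protein_length(transcript_bp: int, has_stop: bool = True) -> int:
    """Return the residue count implied by a coding sequence length."""
    if transcript_bp % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = transcript_bp // 3
    # The terminal stop codon does not encode a residue.
    return codons - 1 if has_stop else codons


# Partial standard-code table: only the codons used in this example.
CODONS = {
    "ATG": "M", "GCT": "A", "CGA": "R",
    "CAC": "H", "CCG": "P", "GTC": "V",
    "TAG": "-",  # stop, written as "-" as in the record's protein sequence
}


def translate(cds: str) -> str:
    """Translate complete codons of `cds` using the partial table above."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODONS[cds[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    # 1554 bp = 518 codons = 517 residues + 1 stop codon
    print(expected_protein_length(1554))
    # First 18 nt of DPOGS209387-TA
    print(translate("ATGGCTCGACACCCGGTC"))
```

The first call reproduces the listed protein length from the listed transcript length, and the second reproduces the first six residues of DPOGS209387-PA (MARHPV) from the first 18 nucleotides of DPOGS209387-TA. Translating the full sequence would require a complete codon table, for example from a library such as Biopython.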