Monarch geneset OGS2.0

DPOGS202663
Transcript        DPOGS202663-TA  1404 bp
Protein           DPOGS202663-PA  467 aa
Genomic position  DPSCF300039 + 135214-136851
RNAseq coverage   32x (Rank: top 75%)
Annotation
Heliconius      HMEL002228        7e-101  65.41%
Bombyx          BGIBMGA001291-TA  0.0     65.82%
Drosophila      CG12014-PA        9e-126  46.82%
EBI UniRef50    UniRef50_Q1WJM2   7e-129  48.60%  Iduronate 2-sulfatase n=11 Tax=Endopterygota RepID=Q1WJM2_ANOGA
NCBI RefSeq     XP_967324.1       5e-134  53.46%  PREDICTED: similar to iduronate 2-sulfatase [Tribolium castaneum]
NCBI nr blastp  gi|91079414       1e-132  53.46%  PREDICTED: similar to iduronate 2-sulfatase [Tribolium castaneum]
NCBI nr blastx  gi|91079414       9e-134  53.04%  PREDICTED: similar to iduronate 2-sulfatase [Tribolium castaneum]
Group
Gene Ontology  GO:0008152  2.4e-76  metabolic process
               GO:0003824  2.4e-76  catalytic activity
               GO:0008484  1.5e-42  sulfuric ester hydrolase activity
KEGG pathway  tca:655669  1e-133
              K01136 (IDS)  maps-> Lysosome
                                   Glycosaminoglycan degradation
InterPro domain  [3-467]   IPR017850  2.4e-76  Alkaline-phosphatase-like, core domain
                 [4-365]   IPR017849  2.3e-72  Alkaline phosphatase-like, alpha/beta/alpha
                 [10-376]  IPR000917  1.5e-42  Sulfatase
Orthology group  MCL10621  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202663-TA
ATGCGATACCTTTCAGAGGAAGTTTACTTGCCAAATTTTCAAAAATTGGCGGCAAAAGGAATCACATTTCAAAAGGCTTTTGCACAACAAGCATTGTGTGCACCAAGTAGAAATTCTATTTTAACCGGTCGTCGACCAGATGAGTTACGCTTGTATGACTTTTATAATTATTGGCGCGACACTGTTGGAAATTTTTCTACATTTCCTCAAATATTCAAGGAACACGGATACGATACGTACTCAGCTGGAAAAATATTCCACCCAGGAAAGAGTTCCAATTTTACGGACGACTATCCTTATAGCTGGACACTAAAACCTTATCATCCTCCAACCGAAAAATATAAAGACGATGCATTGTGTAAAGATAGACATAGTATAACTTTACACAAAAATCTGATTTGTCCAATCAACGTTAAGGAACAGCCCGATAATACATTACCTGACCTCGAAACCCTCAAATACTCAATTGATATTATTAAAAATAGAAACCAAACTAAACCCTTCCTGCTAGCTGTCGGATTTCACAAGCCTCATATTCCTTTAAAATATCCTCATAAATACTTGAAAAATGTTCCAATTAGTTCAGTGAATCCGCCACGTGTGTCGTCTATCCCTAAGGGTCTACCGCTGGTATCTTGGCATCCTTGGACGGATGTCCGGCGAAGAGATGACATTAAGAAACTAAACCTTACTTTCCCATTTGGTATAATGCCTCCGAAATGGACGTTAAAGATAAGGCAAAGTTATTATGCTGCGTCACTATACATAGATGATCTTTTGGGAAAACTTATGAGCCATGTAAATCAAACCAACACCATAATTGTTGTTACTAGTGATCATGGTTGGTCTTTGGGTGAAAATGGACTTTGGGCAAAGTATAGCAACTTTGATGTCGCCCTGAGGGTGCCCTTGCTTTTTAAAATACCCGGATTTCAGCCCAAGGTCATAACTAATCCTGTTGAATTGGTCGACATATACCCAACTTTACTTGAAGTGGGTTTAAATATATTTGTACCAAAATGTAAGAATAATGATGATAAATCCACTTTATGTTCGAGTGGAAAAAGTTTAGTACAATTAATGTCAAACAAACATAATACTGGTAGATCATTTGCCATATCCCAGTATCCACGGCCACAGGTACAACCTACAAAAAGTTCTGATAAACCAAAACTGAAAGATATAAAAATAATGGGTTATAGCATCCGAACGGAAAAATATAGATACACTGAATGGATATCATTTAATAATACACATTTCACTAGGAACTGGAATAAAATACACGGGATCGAACTATACAACCATGTTTATGATGACGAAGAATCAAATAATCTGTACCTAGTACCATATTATCAGGATATAAAAAAACAATTATCAGCATTACTGAGGTCAACAATAAATTAG

Protein sequence:

>DPOGS202663-PA
MRYLSEEVYLPNFQKLAAKGITFQKAFAQQALCAPSRNSILTGRRPDELRLYDFYNYWRDTVGNFSTFPQIFKEHGYDTYSAGKIFHPGKSSNFTDDYPYSWTLKPYHPPTEKYKDDALCKDRHSITLHKNLICPINVKEQPDNTLPDLETLKYSIDIIKNRNQTKPFLLAVGFHKPHIPLKYPHKYLKNVPISSVNPPRVSSIPKGLPLVSWHPWTDVRRRDDIKKLNLTFPFGIMPPKWTLKIRQSYYAASLYIDDLLGKLMSHVNQTNTIIVVTSDHGWSLGENGLWAKYSNFDVALRVPLLFKIPGFQPKVITNPVELVDIYPTLLEVGLNIFVPKCKNNDDKSTLCSSGKSLVQLMSNKHNTGRSFAISQYPRPQVQPTKSSDKPKLKDIKIMGYSIRTEKYRYTEWISFNNTHFTRNWNKIHGIELYNHVYDDEESNNLYLVPYYQDIKKQLSALLRSTIN-
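
Consistency check (illustrative):

The lengths in this record fit together: a 1404 bp transcript holds 468 codons, which is 467 amino acids plus one stop codon, shown as "-" at the end of the protein sequence. The Python sketch below is one way to check this and to translate short stretches of the transcript. It assumes the standard genetic code and that the transcript is coding sequence only; the function names are hypothetical and not part of the database.

```python
from itertools import product

# Standard genetic code (assumed here), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol, matching the "-" that ends
    the protein sequence in this record. Trailing bases that do not
    fill a whole codon are ignored.
    """
    cds = cds.upper()
    residues = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def expected_protein_length(cds_length):
    """Amino acid count for a CDS that ends in a single stop codon."""
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length // 3 - 1


# First 30 bases and last 12 bases of DPOGS202663-TA, copied from the record.
first_30 = "ATGCGATACCTTTCAGAGGAAGTTTACTTG"
last_12 = "ACAATAAATTAG"

print(translate(first_30))            # MRYLSEEVYL
print(translate(last_12))             # TIN-
print(expected_protein_length(1404))  # 467
```

Running translate() on the full nucleotide sequence and comparing the result with the DPOGS202663-PA sequence would extend the same check to the whole record.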