Monarch geneset OGS2.0

DPOGS213821
Transcript          DPOGS213821-TA    1212 bp
Protein             DPOGS213821-PA    403 aa
Genomic position    DPSCF300106 + 451500-457590
RNAseq coverage     41x (Rank: top 72%)
Annotation
Heliconius        HMEL012277          0.0       91.45%
Bombyx            BGIBMGA006776-TA    5e-128    90.21%
Drosophila        Hs3st-A-PA          4e-80     62.02%
EBI UniRef50      UniRef50_E1ZZA7     3e-148    68.90%    Heparan sulfate glucosamine 3-O-sulfotransferase 5 n=8 Tax=Endopterygota RepID=E1ZZA7_CAMFO
NCBI RefSeq       XP_396407.2         8e-149    66.49%    PREDICTED: similar to Heparan sulfate glucosamine 3-O-sulfotransferase 5 (Heparan sulfate D-glucosaminyl 3-O-sulfotransferase 5) (Heparan sulfate 3-O-sulfotransferase 5) [Apis mellifera]
NCBI nr blastp    gi|383852238        3e-152    66.08%    PREDICTED: heparan sulfate glucosamine 3-O-sulfotransferase 5-like [Megachile rotundata]
NCBI nr blastx    gi|383852238        7e-148    66.08%    PREDICTED: heparan sulfate glucosamine 3-O-sulfotransferase 5-like [Megachile rotundata]
Group
Gene Ontology      GO:0008146    1.9e-32    sulfotransferase activity
KEGG pathway       ame:412956    2e-148
                   K08104 (HS3ST5) maps-> Glycosaminoglycan biosynthesis - heparan sulfate
InterPro domain    [137-387] IPR000863    1.9e-32    Sulfotransferase domain
Orthology group    MCL13668    Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS213821-TA
ATGGAGTCGCGGGCGCAGATGAATCAGTCGCTGCTGAACAAAAATCGTCTTGAAGCCAACTACAGGTCAAGCTGTTCGCAATCATTGCCGCTTGATATTCTATGGTTACCGCGGTCATCGTTCCGGCCGGCGGAGGGCGAGCTGGCAGACTGCGTGCTGGTGGTGGGAGTGTCGCGGCCAAAGTTAGCCGCGGCCCTCCTGTCCGTAACTCTTTTATCTCTGTTCCTCACCTTCCACGTGCTCTATGACAGTGCCCTTTATAGCATACAAGCGGCTAGTATGGCGTCAGCTGCCAGCGAGGGCCGCAAGATATTACAAAATCAAGAGAGCAGTCGCACAATAAGCCACCCTATGGCTCTAAGAAAAAGGTTAAAAACACCTCGTGCCCCACGGCCAGCCCGTCGTCTCCCCCAGGCTCTTATTATAGGAGTTAGGAAATGCGGAACGCGAGCTCTTCTTGAGATGCTCTATTTACATCCTATGGTACAAAAAGCCTCCGGCGAAGTTCATTTTTTTGATCGCGATGAAAATTATGCTCTCGGTCTAGAATGGTACAAGAGTAAAATGCCTCTTTCATTTAAGGGACAAATAACTATAGAAAAAAGTCCCAGCTACTTCGTTACTCCCGAGGTACCAGAGCGAGTTCGTGCTATGAATTCGTCAGTGAGGCTACTTCTTATTGTGCGAGAACCAGTAACTCGTGCAATATCGGACTATACCCAGCTTCGTAGTCGAGCCACCCCTTCAGCTCCTACTGTTTCGTTGGTTGGACACCCCTTGCCTGATACTGTTAAGCCTTTTGAACACCTTGCTTTGGCACCAGATGGTTCAATCAATGTTGCGTATAGGCCAATAGCTATATCACTGTATCATGCATACTTTCATCGCTGGTTGGAAGTGTTCCCCAGAGAACAGATTCTTGTTGTAAACGGAGATCAGCTGATTGAAGATCCAGTACCACAATTACGACGCATTGAGAAATTTCTTGGCCTTGAACATAAAATAGGAAGAAGAAACTTCTACTTCAACGAAACTAAAGGATTCTACTGTTTGCGTAACGATACCACGGATAAGTGTTTGCGAGAGACAAAAGGTCGCAAGCATCCCCGCGTAGACCCAGCGGTTGTCACAAAGTTACGTAAGTTTTTTGTCCAACATAATCAACGTTTCTACGACCTGATCGGCGAAGATCTCGGCTGGCCCGAGGATTGA

Protein sequence:

>DPOGS213821-PA
MESRAQMNQSLLNKNRLEANYRSSCSQSLPLDILWLPRSSFRPAEGELADCVLVVGVSRPKLAAALLSVTLLSLFLTFHVLYDSALYSIQAASMASAASEGRKILQNQESSRTISHPMALRKRLKTPRAPRPARRLPQALIIGVRKCGTRALLEMLYLHPMVQKASGEVHFFDRDENYALGLEWYKSKMPLSFKGQITIEKSPSYFVTPEVPERVRAMNSSVRLLLIVREPVTRAISDYTQLRSRATPSAPTVSLVGHPLPDTVKPFEHLALAPDGSINVAYRPIAISLYHAYFHRWLEVFPREQILVVNGDQLIEDPVPQLRRIEKFLGLEHKIGRRNFYFNETKGFYCLRNDTTDKCLRETKGRKHPRVDPAVVTKLRKFFVQHNQRFYDLIGEDLGWPED-