Monarch geneset OGS2.0

DPOGS213537
Transcript: DPOGS213537-TA (1143 bp)
Protein: DPOGS213537-PA (380 aa)
Genomic position: DPSCF300033 - 540323-559852
RNAseq coverage: 68x (Rank: top 67%)
Annotation
Heliconius       HMEL007904         4e-116   86.04%
Bombyx           BGIBMGA011817-TA   6e-111   88.63%
Drosophila       pip-PL             9e-76    47.18%
EBI UniRef50     UniRef50_E9IF14    1e-119   63.72%   Putative uncharacterized protein (Fragment) n=1 Tax=Solenopsis invicta RepID=E9IF14_SOLIN
NCBI RefSeq      XP_969659.1        1e-128   72.73%   PREDICTED: similar to heparan sulfate 2-o-sulfotransferase, partial [Tribolium castaneum]
NCBI nr blastp   gi|270015724       5e-128   72.73%   pipe [Tribolium castaneum]
NCBI nr blastx   gi|270015724       7e-126   72.73%   pipe [Tribolium castaneum]
Group
Gene Ontology     GO:0008146           1.9e-136   sulfotransferase activity
                  GO:0016021           1.9e-136   integral to membrane
KEGG pathway      spu:593546           2e-27
                  K03193 (UST) maps-> Glycosaminoglycan biosynthesis - chondroitin sulfate
InterPro domain   [2-377] IPR007734    1.9e-136   Heparan sulphate 2-O-sulfotransferase
                  [87-345] IPR005331   1.3e-20    Sulfotransferase
Orthology group   MCL15596             Insect specific

Nucleotide sequence:

>DPOGS213537-TA
ATGGAGCGCCTCAACGACGTCACCGCCTTCAACCACCTAACTGACAAGATTCAGACTGATCGTTCCGAACACGCTTATGCATCAGCTCAAGAATACAAAGAAGCCTTGGAGGCTATACGCAGAAACACAGGAGCCGAACGCAGTAGACCGCAGCGTACTTTAGAAGAGACGAATAGATCGGAGGAAGATCTGGATGCACCAGATGAAGTGATACCAGAGCCCTGGGAACTGAATAATACTGCGAGAGCTGATATAGAGCTGTTGTTCTTTAATAGAGTGCCGAAGGTCGGCAGTCAGACCTTCATGGAATTACTAAGACGACTCGCTATAAAGAATCAGTTTGGTTTCCACCGAGATTCTGTGCAGCGTGTCGAGACGATCCGCCTGGCTCCTGCTAACCAGCAAGTTCTAGCCAGTGTGGTGACGTCACACGCGCCGCCGGCCTCGTACATTAAACATGTCTGCTACACTAACTTTACCAGATTCGGTTATCCTTCTCCGATATACGTGAACGTAGTTCGCGATCCCGTAGAACGCGTCATCTCGTGGTACTACTACGTGCGCGCCCCCTGGTACTACGTGGAAAGGAAACAAGCCTTCCCTGACCTTCCACTACCGGATCCAGCGTGGTTAAAGAAGGACTTCGAGACGTGCGTGTTAAGCGGCGATCGCGAGTGTCGTTATTTGGAGGGCGAGACTCACGAGGGTATTGGAGACCACAGGAGACAGACGCTGTTCTTCTGCGGACACGAACCACAGTGCACGCCATTCAACAGTGTGGAGGCGCTACAACGAGCTAAGCGAGTTGTCGAACAGCAGTACGCTGTGGTTGGAGTGCTGGAAGACCTGAATTCAACGCTGCTGGCCTTCGAGAGATACATACCCAGGTTCTTTACGGGCGCACTCAAAATGTACTGGGAGGAGCTGAACACATTCAACAGAATAAATAGGAACCACTTCAAACTACCCGTCTCCGAGGCTGTTAAGCAAATCGTCCGAGCAAACTTCACCAGAGAAATTGAATTCTACGAGTTCTGCAAACAGAGACTTCACTTACAACTGAAGGCGCTCAGAGATCCATCGATCATACTTCCAACACACAAACAAACTCAATCAACACACTTATATAACAATATGGTATAG

Protein sequence:

>DPOGS213537-PA
MERLNDVTAFNHLTDKIQTDRSEHAYASAQEYKEALEAIRRNTGAERSRPQRTLEETNRSEEDLDAPDEVIPEPWELNNTARADIELLFFNRVPKVGSQTFMELLRRLAIKNQFGFHRDSVQRVETIRLAPANQQVLASVVTSHAPPASYIKHVCYTNFTRFGYPSPIYVNVVRDPVERVISWYYYVRAPWYYVERKQAFPDLPLPDPAWLKKDFETCVLSGDRECRYLEGETHEGIGDHRRQTLFFCGHEPQCTPFNSVEALQRAKRVVEQQYAVVGVLEDLNSTLLAFERYIPRFFTGALKMYWEELNTFNRINRNHFKLPVSEAVKQIVRANFTREIEFYEFCKQRLHLQLKALRDPSIILPTHKQTQSTHLYNNMV-
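
The transcript and protein lengths in this record are related by simple codon arithmetic: 1143 bp is 381 codons, and dropping the terminal stop codon leaves 380 residues. The sketch below illustrates that check and a plain translation of a coding sequence. It assumes the standard genetic code (NCBI translation table 1) and writes the stop codon as "-" to match the protein sequence above; the function names translate and expected_protein_length are hypothetical helpers introduced here for illustration, not part of any MonarchBase tooling.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
# Assumption: the record's coding sequence uses this standard table.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build the 64-entry codon lookup from the compact table string.
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence codon by codon (hypothetical helper).

    Stop codons are written as stop_symbol; codons containing
    characters other than A, C, G, T are written as "X".
    """
    seq = "".join(cds.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for pos in range(0, len(seq), 3):
        aa = CODON_TABLE.get(seq[pos:pos + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def expected_protein_length(transcript_bp: int) -> int:
    """Residue count for a CDS of transcript_bp bases that ends in a stop codon."""
    return transcript_bp // 3 - 1


if __name__ == "__main__":
    # First five codons of DPOGS213537-TA followed by its terminal TAG.
    print(translate("ATGGAGCGCCTCAAC" + "TAG"))
    print(expected_protein_length(1143))
```

Passing the full DPOGS213537-TA sequence to translate should give a string that can be compared directly against DPOGS213537-PA, including the trailing "-".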