Monarch geneset OGS2.0

DPOGS213961
Transcript        DPOGS213961-TA  1620 bp
Protein           DPOGS213961-PA  539 aa
Genomic position  DPSCF300226 + 198517-253525
RNAseq coverage   218x (Rank: top 45%)
Annotation
Heliconius      HMEL002846        6e-97   91.43%
Bombyx          BGIBMGA007552-TA  7e-114  80.72%
Drosophila      Hs6st-PA          3e-129  63.01%
EBI UniRef50    UniRef50_Q7PXK2   1e-158  53.94%  AGAP001444-PA n=6 Tax=Endopterygota RepID=Q7PXK2_ANOGA
NCBI RefSeq     XP_002000598.1    2e-129  64.64%  GI10319 [Drosophila mojavensis]
NCBI nr blastp  gi|347965915      4e-158  53.94%  AGAP001444-PA [Anopheles gambiae str. PEST]
NCBI nr blastx  gi|347965915      6e-148  53.28%  AGAP001444-PA [Anopheles gambiae str. PEST]
Group
Gene Ontology    GO:0008146  2.9e-158  sulfotransferase activity
                 GO:0016021  2.9e-158  integral to membrane
KEGG pathway     dme:Dmel_CG4451  2e-127
                 K02514 (HS6ST1) maps-> Glycosaminoglycan biosynthesis - heparan sulfate
InterPro domain  [13-503]   IPR010635  2.9e-158  Heparan sulphate 6-sulfotransferase
                 [109-316]  IPR005331  2.2e-48   Sulfotransferase
Orthology group  MCL12494  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS213961-TA
ATGCATCCAGCCGAAATGCAAAAAAGTGTTTCAGTGCAGGTAGATAGTGATTTTGCCCTTTTACTGCCCAGTGGTGCTTATGAGATAAAAAAGATGGAGCCAAGATCCAGGTTACGAAAAGTGATACTGTTCTGTTTACTGCTCTCACTGTTTGGGATGGTGGCATTTGGATACTTTTGCCCCGATCAGGTGTGTGCATTATCACCGCGAGAGAGAGTAGCAGAATCCTTGGCGCTGACATACGCGGAGGCTCCGGGGCGGCCGCTGCCGGATGTGATATCACAGCCTGGTCTATCATACGACGAGGTCCTTCACGATGACTTCCGGTTTGATCTGAACGCTCATGACGTCATGGTGTTTTTGCACATACAGAAGACGGGGGGAACGTCTTTTGGAAGGCACTTGGTTATGGATCTAGATTTGAAGAGGCCGTGCAACTGTCAGCGGGCTCGAAAACGTTGCCATTGTTTTCGGCCTCACAGCAACGAAATATGGCTTTTTTCGAGATACTCCACTGGCTGGAAATGCGGCCTGCACGCTGATTTTACGGAGTTGACAGCGTGTGTGGGCGGGGAGTTAGATAGACACGAAGGTTCCGCAGTACATCGCAGATACTTCTACATAACTCTCCTAAGAGAACCTGTTTCAAGGTATTTGTCCGAGTTCCGTCACGTGAAGCGAGGGGCGACGTGGAAGGGATCCAGACACTGGTGTCAGGGGCGAACCGCTACTGCTGCGGAAGTTCCGCCGTGCTATACTGGAGAGTCCTGGCGCGGGGTTACCCTGGAAGAGTTCGCGTCTTGTTCCTGGAACCTGGCTAATAATAGACAGACGCGGATGTTGGCTGATCTTGCTCTGGTCGGATGTTACAACGGCACCCTCAGACATCGAACACCTGACACGGACAGAGTACTACTGGCATCCGCGAAGAGAAACCTTGCAGCTATGTATTTGTCCGAGTTCCGTCACGTGAAGCGAGGGGCGACGTGGAAGGGATCCAGACACTGGTGTCAGGGGCGAACCGCTACTGCTGCGGAAGTTCCGCCGTGCTATACTGGAGAGTCCTGGCGCGGGGTTACTCTAGAAGAGTTCGCGTCTTGTTCCTGGAACCTGGCTAATAATAGACAGACGCGGATGTTGGCTGATCTTGCTCTGGTCGGATGTTACAACGGCACCCTCAGACATCGAACACCTGACACGGACAGAGTACTACTGGCATCCGCGAAGAGAAACCTTGCAGCTATGTCATATTTTGGGCTTACTGAGTTTCAGAAGATATCCCAATACGTCTTCGAGGAGACCTTCAACCTTCGTTTCGCTGTGCCCTTCACTCAGCACAACGCGACCGTCTCCGGAGCAACCCTGGCCGCACTCACGCCAGCCCAAGTGGATCACATCAGACGACTCAACTCTCTCGACCTCGAACTGTACGAGTTCGCGAAGAATCTCATGTTCAAACGTTTCGAAGCACTAAAGCACCGTGACAGTGATTTCGAGTACCGTTGGCGTCACCTGGGCGAGGTCGCCCGCAGCGGAGTCACTGAGTTCGATTGGGACAGCAACTTGGAAGACGCCACCACGGAGAAATCTAGAGGTTTGGACAACACGCGAACAATTTAA

Protein sequence:

>DPOGS213961-PA
MHPAEMQKSVSVQVDSDFALLLPSGAYEIKKMEPRSRLRKVILFCLLLSLFGMVAFGYFCPDQVCALSPRERVAESLALTYAEAPGRPLPDVISQPGLSYDEVLHDDFRFDLNAHDVMVFLHIQKTGGTSFGRHLVMDLDLKRPCNCQRARKRCHCFRPHSNEIWLFSRYSTGWKCGLHADFTELTACVGGELDRHEGSAVHRRYFYITLLREPVSRYLSEFRHVKRGATWKGSRHWCQGRTATAAEVPPCYTGESWRGVTLEEFASCSWNLANNRQTRMLADLALVGCYNGTLRHRTPDTDRVLLASAKRNLAAMYLSEFRHVKRGATWKGSRHWCQGRTATAAEVPPCYTGESWRGVTLEEFASCSWNLANNRQTRMLADLALVGCYNGTLRHRTPDTDRVLLASAKRNLAAMSYFGLTEFQKISQYVFEETFNLRFAVPFTQHNATVSGATLAALTPAQVDHIRRLNSLDLELYEFAKNLMFKRFEALKHRDSDFEYRWRHLGEVARSGVTEFDWDSNLEDATTEKSRGLDNTRTI-
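
Consistency check (illustrative sketch):

The transcript and protein lengths listed for this entry are consistent with each other: 539 codons plus one stop codon gives (539 + 1) x 3 = 1620 bp. The following minimal Python sketch shows one way such a check could be run on the two FASTA records. It assumes the standard nuclear genetic code and writes the stop codon as "-", as in the protein record above. The function names parse_fasta, translate and expected_cds_length are hypothetical helpers introduced here for illustration; they are not part of any geneset tooling.

```python
from itertools import product

# Standard genetic code (assumed), with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def parse_fasta(text):
    """Return {record name: sequence} from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            name = (line[1:].split() or [""])[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 1; unknown codons become 'X'."""
    cds = cds.upper()
    protein = []
    # Ignore any trailing partial codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


def expected_cds_length(protein_length):
    """CDS length in bp for a protein of the given length plus one stop codon."""
    return (protein_length + 1) * 3


# Toy record: the first six codons of DPOGS213961-TA followed by a stop codon.
records = parse_fasta(">demo-TA\nATGCATCCAGCC\nGAAATGTAA\n")
print(translate(records["demo-TA"]))   # MHPAEM-
print(expected_cds_length(539))        # 1620
```

To check the full entry, the complete nucleotide and protein records would be passed through parse_fasta and the output of translate compared with the DPOGS213961-PA sequence.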