Monarch geneset OGS2.0

DPOGS202450
Transcript: DPOGS202450-TA, 1053 bp
Protein: DPOGS202450-PA, 350 aa
Genomic position: DPSCF300174 - 198444-199730
RNAseq coverage: 56x (Rank: top 69%)
Annotation
Heliconius: HMEL015653  3e-157  72.78%
Bombyx: BGIBMGA009973-TA  1e-107  57.57%
Drosophila: CG14024-PA  4e-59  43.40%
EBI UniRef50: UniRef50_B0W1J7  2e-60  41.98%  Chondroitin 4-sulfotransferase n=2 Tax=Culicinae RepID=B0W1J7_CULQU
NCBI RefSeq: XP_001660640.1  6e-65  48.45%  chondroitin 4-sulfotransferase [Aedes aegypti]
NCBI nr blastp: gi|157125192  1e-63  48.45%  chondroitin 4-sulfotransferase [Aedes aegypti]
NCBI nr blastx: gi|157125192  2e-61  48.45%  chondroitin 4-sulfotransferase [Aedes aegypti]
Group
Gene Ontology:
    GO:0016051  2e-79  carbohydrate biosynthetic process
    GO:0008146  2e-79  sulfotransferase activity
    GO:0016021  2e-79  integral to membrane
KEGG pathway: mdo:100015110  1e-25
    K01017 (CHST11) maps to:
        Glycosaminoglycan biosynthesis - chondroitin sulfate
        Sulfur metabolism
InterPro domain:
    [61-348]  IPR018011  2e-79  Carbohydrate sulfotransferase-related
    [106-340]  IPR005331  2.6e-36  Sulfotransferase
Orthology group: MCL18889  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202450-TA
ATGCCATGTTTGAGATGCGGGGCGAAGACCGCTACTTTCATTTTTTGGGCCGCTCTCTATGTGGCGGTCATGAAAGTGACGTTTTTAAGGAACAACGACAACGAAGATGCGAAGGAATTATCGATGGAACCTGACAATTATACTAAGTGGCTGATGCAAGGTCCTGTATTGGGTGACGATGAGGAACAGGTCCAAAGCAACAGCGATTGGCTCGAACCGGATAACACTACTATAAACGAGCTCGAACAACGTGTCAACAAGGTTAAAGAGACTTGCCATTTAAGATCTCTTGACGGTCAATCTATTAACAGTAAAGAATTCTTCGTGGATCACGCTCACAATCTTGTTTGGTGCAACATATTTAAGGCGGCCAGCTCTTCGTGGTTATATAATTTCAATATATTAGGTGGATATGACAAAGCTTTCCTCGCTCGGACTAGACACACGCCATTGACGTTGGCTAGAGACGCTATTGATACCCCAGGAGTGTTGTCGCTATTGATTGTAAGGGAGCCTTTTGTACGATTGTTATCAGCCTACAGGGATAAACTGGAGAATATAACGCCTCCGTATTACAGAAAACTAGCCAGAGCTATTGTGGCTGAACATAGAGAAGCTGCGACGAAAGTTTTAGGACCGATAAAGTCTTTTGGTCCAACGTTTTACGAATTCGTCGCCTATCTCATTTCGAAATATGAATCTGGAACGTTGACCTTCGATGAGCATTGGGCGCCATTTTACCAATTCTGTTCTCCGTGCGCCCTTAATTACACGGTAGTGGCTAAAGTTGAAACGCTATCGAGAGATTCGTCGTATGTAGTACAACAACTAGGACTGGGCGATATTTTAGGACGCAAAGTTGTTAGTCGTAGAACGCGTCTCAGAACTGTCATGAACAAATCGAGGGACGGAAAAAACACATCAGCGCTGATAAAACACTATTTCCGACAGCTGGACATGGATATGCTAGAAAAATTATTACTTATTTACGGCATAGATTTCGAAATGTTTGGATATAATTCAGATATATATCGAAGTTATGTGAGAAATTAA

Protein sequence:

>DPOGS202450-PA
MPCLRCGAKTATFIFWAALYVAVMKVTFLRNNDNEDAKELSMEPDNYTKWLMQGPVLGDDEEQVQSNSDWLEPDNTTINELEQRVNKVKETCHLRSLDGQSINSKEFFVDHAHNLVWCNIFKAASSSWLYNFNILGGYDKAFLARTRHTPLTLARDAIDTPGVLSLLIVREPFVRLLSAYRDKLENITPPYYRKLARAIVAEHREAATKVLGPIKSFGPTFYEFVAYLISKYESGTLTFDEHWAPFYQFCSPCALNYTVVAKVETLSRDSSYVVQQLGLGDILGRKVVSRRTRLRTVMNKSRDGKNTSALIKHYFRQLDMDMLEKLLLIYGIDFEMFGYNSDIYRSYVRN-
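
The two records above are consistent in length: 1053 bp is 351 codons, and the last codon (TAA) is a stop, shown as "-" at the end of the protein, which leaves 350 aa. The following is a minimal sketch of how that relationship could be checked, assuming the standard genetic code (NCBI translation table 1) applies to this transcript. The function name `translate` and the `stop_symbol` parameter are illustrative and are not part of the database.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with bases ordered T, C, A, G.
# Assumption: the transcript is translated with this table; "*" marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon (hypothetical helper).

    Stop codons are written as `stop_symbol`, matching the trailing "-"
    in the protein record above.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


# First six codons and last four codons of DPOGS202450-TA, copied from the record.
print(translate("ATGCCATGTTTGAGATGC"))  # MPCLRC
print(translate("GTGAGAAATTAA"))        # VRN-
```

Passing the full DPOGS202450-TA sequence to `translate` should give a string of 351 characters (350 residues plus the stop symbol) that can be compared against DPOGS202450-PA.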