Monarch geneset OGS2.0

DPOGS202474
Transcript: DPOGS202474-TA; 2274 bp
Protein: DPOGS202474-PA; 757 aa
Genomic position: DPSCF300326 - 21125-24175
RNAseq coverage: 572x (Rank: top 22%)
Annotation
Heliconius: HMEL004156; 4e-91; 78.87%
Bombyx: BGIBMGA012390-TA; 2e-11; 25.76%
Drosophila: CG43313-PA; 5e-79; 32.02%
EBI UniRef50: UniRef50_Q16SL4; 2e-91; 29.71%; Chondroitin synthase n=4 Tax=Diptera RepID=Q16SL4_AEDAE
NCBI RefSeq: XP_001601123.1; 2e-95; 29.96%; PREDICTED: similar to chondroitin synthase [Nasonia vitripennis]
NCBI nr blastp: gi|156547033; 3e-94; 29.96%; PREDICTED: chondroitin sulfate synthase 2-like [Nasonia vitripennis]
NCBI nr blastx: gi|170067761; 4e-85; 32.47%; chondroitin synthase [Culex quinquefasciatus]
Group
Gene Ontology: GO:0032580; 3.7e-85; Golgi cisterna membrane
    GO:0016758; 3.7e-85; transferase activity, transferring hexosyl groups
KEGG pathway: nvi:100116694; 5e-95
    K03419 (CHPF2) maps-> Glycosaminoglycan biosynthesis - chondroitin sulfate
InterPro domain: [46-668] IPR008428; 3.7e-85; Chondroitin N-acetylgalactosaminyltransferase
Orthology group: MCL12076; Multiple-copy universal gene

Nucleotide sequence:

>DPOGS202474-TA
ATGTTATCACGCTACGTGGTATCGCAAGTGAAACATAACTCCTACTTCCTGGTGGGGTTGGGTATCGGTCTATGGCTCGCACTGGCGACGGTGCCACTTGAAGAGGATGTGGTGTCTTGCGAGGACACCACTGCCGCGGCCCTGGACCCAGGCCTGGACGAATTCCAGCCGCAGCGTGAGGAGCGACCTCCGGGCGCGGTGGGACCCGCGGGTCGCACGGTCACTAGACCACGATACTACAGTACAGAACTGGGCATGCGAGCCGCCCTACTGGCAGGTGTCCTGAGTTCCGAGGCAGCCCTAGAGTCTCGCGCCGCCGCCTTCAACCAGACGGCCGCAGATCTTAAGCCCGCCCTGCGCTTCTTCATCACGGCGAGCGCTCTACAAGGCGCCCCGGGCCGAGCCAACGTGGTGGGCTTCACAGACACACGCGAGATGCTGAAGCCGTTCCACGCGCTCAAGTACCTCGCCGATAACTTCCTCGAGGAGTACGACTTCTTCTTCCTCGTGTCGGACTCCACTTTCGTGAACGCGCGTCGTCTGAACCGGCTCGTGGCCAGCCTCAGCGTGAGCCAGGACCTGTACATGGGAGCCGTCTCCGGCGACGACACTCACTACTGCACGCTGGAGGCCGGCATCCTCATGTCCAACTCTGTGCTGCGAGCCGTGCACGAGGAGCTGGACTGGTGCGTCAGGAACTCCTACTCCCCGCACCACCACGAGAACCTGGGCCGCTGCGTGCTGCACGCGGCCGGCCTCCGGTGCGTCGCCGGCCTCCAGGCCGTCTCTTACGACACGGCCCACCTCCGCCCCGCTCACCCGGACGGCCCCGCCAGTTTGCACCCCGCCTTGGCGGACGCAGTAACCGTCCACCCGGCGCTGACCCCCGAGGACTTCTACCGCCTGCACGCCTACGTGTCCAGGGTGAACCTGGAGCGTGTTGGGGAGGACGAGGCGCGGACTCGAGCGGAGGCGGCGCTCAGCTCCCGTCACCATCCCCGGGGGTACAGGAACGTGTCGTGGCCAACCGCCCTACGAGCGGACGCAGGTCTAGCGCCGCCACCCCCACCCACCAGGTCCGAGTTCGACCTCCTCCGCTGGACGCGGTTTAATCTCACACACGCCCTCCAGCTGGACGACCACCGCGCCGTCTCCAAGCTGAGCGCATCCTACAAGCAAGCCGTGGCCCTGATCGTAGAGGAGGCACGGGCGTGGGTGGAGCGGAGATGGGGCGGCGAGGAAGGCGGGGCGCTTTCGGTGAGCCTCGAGGAAGGAGCGTGGTGCTGGGAGCCGCCCCGGGCGCTCCGGTACCGCCTCTTGCTGAGAGTGACCGCGGAGGGAGGCGGGCGTCTGCTGCAAGTGGAGGCGGCGCGAGCGCTGGGAGCGGCCCGCCTCGCACCCGCAGCCTACGTCACGGAGAGCGCCCGCGTCCACCTCGTGCTGCCAGCCCCCGACCAGCGCTCACACCTCACCGCTTTCCTGGAGCGGTACGAGACGGTCTGCCTCCAGAGAGACGACAACACGGCTCTGTATGTGGTCGTGATACCGGCCAGTGACGGAGGACATCTGACAGCAGAAGAGCGAGCTCATCTGGAGGAGGTCAAGGAGATGGTGAGGGCGGTCGGAGAGAAACACCGCGCGGGACAACACATGGACGTTATCGTGTCCAGCATCGGGCGCGGCGCGGGGACGGGGGGTGGTGTCTCCGGGAGTGGGGAGAGAGCGCGGGAGGACGTACGGCTCGCCCTGAGGGCGGCACTCGTTCGGGCCGCGAAGGATGCGTTGCTGTTGGTGGCCGACCATAGCATGGAGTTCACCGAAGACTTCCTCAACAGGGTCCGCATGAACACGATCGCGGGCTCGCAGTGGTTCAGTCCGCTGGCCTTCGCTCGCTTCGCGCAGTACGCTCACCCTCGCTTCGTGGAGGCGGACGGGTCGCGGCCGACTCTCCACACGGGCCGCTTCTCTCACACCGAGCTGCTCTCCGTGTACAAGGGCGA
CTACTCGGACGCTCTCCGCAGCTGGCTGGAGGCGGGAGGCTCCGAGGAGGCGTCACCGTCCGCGGTCCTCGCCGCTAGCCCCCTACGCGTGCTGCGCGCCCCTGAGCCGGCCCTGCTACTCCCGCCCCGGCCCCGCCCCTGCACACCCTCCTCCCCCTCCGAGGAGAGGGCGTGTCTGGTCCGTGAGCGCGAGCGTGGTTTCTCTGACCTGTTGCTGGGCGCTCGTCAGTCGCTCGCCAAGTTGCTGCTGCAGACTCAGGCGGAGCTCGAGTGA

Protein sequence:

>DPOGS202474-PA
MLSRYVVSQVKHNSYFLVGLGIGLWLALATVPLEEDVVSCEDTTAAALDPGLDEFQPQREERPPGAVGPAGRTVTRPRYYSTELGMRAALLAGVLSSEAALESRAAAFNQTAADLKPALRFFITASALQGAPGRANVVGFTDTREMLKPFHALKYLADNFLEEYDFFFLVSDSTFVNARRLNRLVASLSVSQDLYMGAVSGDDTHYCTLEAGILMSNSVLRAVHEELDWCVRNSYSPHHHENLGRCVLHAAGLRCVAGLQAVSYDTAHLRPAHPDGPASLHPALADAVTVHPALTPEDFYRLHAYVSRVNLERVGEDEARTRAEAALSSRHHPRGYRNVSWPTALRADAGLAPPPPPTRSEFDLLRWTRFNLTHALQLDDHRAVSKLSASYKQAVALIVEEARAWVERRWGGEEGGALSVSLEEGAWCWEPPRALRYRLLLRVTAEGGGRLLQVEAARALGAARLAPAAYVTESARVHLVLPAPDQRSHLTAFLERYETVCLQRDDNTALYVVVIPASDGGHLTAEERAHLEEVKEMVRAVGEKHRAGQHMDVIVSSIGRGAGTGGGVSGSGERAREDVRLALRAALVRAAKDALLLVADHSMEFTEDFLNRVRMNTIAGSQWFSPLAFARFAQYAHPRFVEADGSRPTLHTGRFSHTELLSVYKGDYSDALRSWLEAGGSEEASPSAVLAASPLRVLRAPEPALLLPPRPRPCTPSSPSEERACLVRERERGFSDLLLGARQSLAKLLLQTQAELE-
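
Length check:

The lengths reported in this record can be cross-checked with simple arithmetic. The sketch below is illustrative only and not part of the original record. The helper names are hypothetical, and it assumes that the transcript consists of the coding sequence plus one stop codon, and that the genomic coordinates are 1-based and inclusive.

```python
def expected_cds_length(protein_aa: int, has_stop: bool = True) -> int:
    """Nucleotides needed to encode protein_aa residues, plus an optional stop codon."""
    return 3 * (protein_aa + (1 if has_stop else 0))


def genomic_span(start: int, end: int) -> int:
    """Length of a coordinate range, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Values copied from the record for DPOGS202474.
transcript_bp = 2274
protein_aa = 757
span_bp = genomic_span(21125, 24175)

# 757 codons plus the terminal TGA stop codon account for the full transcript.
cds_matches = expected_cds_length(protein_aa) == transcript_bp

# Genomic bases not present in the transcript (for example introns),
# valid only under the coordinate assumption stated above.
non_transcript_bp = span_bp - transcript_bp

print(cds_matches, span_bp, non_transcript_bp)
```

Under these assumptions the transcript length matches the protein length exactly, and the genomic span is longer than the transcript, which would be consistent with a gene model containing one or more introns.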