Monarch geneset OGS2.0

DPOGS209383
Transcript: DPOGS209383-TA (981 bp)
Protein: DPOGS209383-PA (326 aa)
Genomic position: DPSCF300118 + 248540-249937
RNAseq coverage: 509x (Rank: top 25%)
Annotation
Heliconius: HMEL013114; 4e-106; 66.04%
Bombyx: BGIBMGA005525-TA; 1e-101; 60.53%
Drosophila: beta4GalNAcTB-PA; 2e-62; 42.49%
EBI UniRef50: UniRef50_E3XGC3; 6e-66; 39.37%; Putative uncharacterized protein n=1 Tax=Anopheles darlingi RepID=E3XGC3_ANODA
NCBI RefSeq: XP_001689009.1; 6e-68; 39.38%; AGAP008285-PA [Anopheles gambiae str. PEST]
NCBI nr blastp: gi|158296844; 1e-66; 39.38%; AGAP008285-PA [Anopheles gambiae str. PEST]
NCBI nr blastx: gi|158296844; 1e-66; 39.38%; AGAP008285-PA [Anopheles gambiae str. PEST]
Group
Gene Ontology: GO:0016757; 6e-98; transferase activity, transferring glycosyl groups
Gene Ontology: GO:0005975; 6e-98; carbohydrate metabolic process
KEGG pathway: dre:561756; 9e-48
    K07968 (B4GALT3) maps to:
    Glycosphingolipid biosynthesis - lacto and neolacto series
    Glycosaminoglycan biosynthesis - keratan sulfate
    N-Glycan biosynthesis
InterPro domain: [101-326] IPR003859; 6e-98; Galactosyltransferase, metazoa
Orthology group: MCL17791 Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209383-TA
ATGCCGAAATGGTTCTCATGTCAGAGGCTTCTGCATCTACAGCTCAGCACGTCTCTGTTCGTCACTCTAGTTTTAATTGCTGTAGTCCAATTTTTTGCTTGCTACACCACCTACAACTACAATCACCTGACGGCCGATAATCTTGGCAGCAGCCTTTATCATGTCACAAATCCCAAACTAGTAATTAACAAATCGAAACCAGAATGCAAGTATGAAGATCTACTCCTAAGTTCCAAGTCAACGGCTGTGTGGGACGTACCAAAGAATTACGATGACTTTTTACCAACTGGAATTCTGAATGGAAGCTACTTCCCCGATGTCTGTAATCCTTTATACAGTGTTGCCATTCTAGTAACATACAGAAACCGGCAGAGCCAATTAGACATATTTATACCATACATGCACAATTTTCTACGTAAACAAAGGATACATTACAAAATATATGTAATAGAGCAGCAGGACAATAAGCCTTGGAACAAGGGTATGTTATACAACATAGGAGCAAAACAGGCCATAGCAGACAAGTTCCCATGTCTCATCCTGCATGATGTGGACCTCTTGCCGCTGGATGAGGCTAACCTGTACGCCTGCCTCAAACAACCCAGACACATGAGTGCTAGCATTGACAAATTCCGATATGTGCTCATCTATAGCAGTCTAGTGGGCGGAGTGCTCGCAATAACATCAGAGCAATATATGGAAGTCAACGGCTTCTCCAACAAATATCAAGGCTGGGGCGGAGAGGATGATGATTTTGCAAATAGACTTATGATGTATGACCTCGAGATGATGAGGTTGCCACCGACCCAGTCCAGGTACACCATGTTGCGGCACAGACAGGAAAAGAAGAATAAGAATCGGCACAGAATAATGTCTGCTAATAAGAATAAGATACACTTGGATGGAGTCAGAGCACTCCCGCACTACACGGCCAGCGTCCGAGACCACAGGCTGTACAGCATGGTGAGCGTCAGGTTATGA

Protein sequence:

>DPOGS209383-PA
MPKWFSCQRLLHLQLSTSLFVTLVLIAVVQFFACYTTYNYNHLTADNLGSSLYHVTNPKLVINKSKPECKYEDLLLSSKSTAVWDVPKNYDDFLPTGILNGSYFPDVCNPLYSVAILVTYRNRQSQLDIFIPYMHNFLRKQRIHYKIYVIEQQDNKPWNKGMLYNIGAKQAIADKFPCLILHDVDLLPLDEANLYACLKQPRHMSASIDKFRYVLIYSSLVGGVLAITSEQYMEVNGFSNKYQGWGGEDDDFANRLMMYDLEMMRLPPTQSRYTMLRHRQEKKNKNRHRIMSANKNKIHLDGVRALPHYTASVRDHRLYSMVSVRL-
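
The lengths and coordinates in this record can be cross-checked with simple arithmetic. The sketch below is only an illustration, not part of the OGS2.0 annotation pipeline. It assumes that the transcript is a coding sequence that includes its stop codon (the nucleotide sequence starts with ATG and ends with TGA) and that the genomic coordinates are 1-based and inclusive. The function names are hypothetical.

```python
def expected_cds_length(protein_aa: int, include_stop: bool = True) -> int:
    """Coding-sequence length in bp implied by a protein length in aa."""
    codons = protein_aa + (1 if include_stop else 0)
    return 3 * codons


def genomic_span(start: int, end: int) -> int:
    """Span in bp, assuming 1-based inclusive coordinates (an assumption)."""
    return end - start + 1


# Values copied from the record above.
transcript_bp = 981
protein_aa = 326
start, end = 248540, 249937

cds_bp = expected_cds_length(protein_aa)   # 3 * (326 + 1)
span_bp = genomic_span(start, end)
non_coding_bp = span_bp - transcript_bp    # bp in the span not in the transcript

print(f"CDS length implied by protein: {cds_bp} bp")
print(f"Matches transcript length:     {cds_bp == transcript_bp}")
print(f"Genomic span:                  {span_bp} bp")
print(f"Span minus transcript:         {non_coding_bp} bp")
```

Under these assumptions the protein length accounts for the full 981 bp transcript, and the genomic span is longer than the transcript. That difference would be consistent with intronic sequence, although the record itself does not list the exon structure.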