DPGLEAN17520 in OGS1.0

New model in OGS2.0  DPOGS211665
Genomic Position  scaffold2327:+ 776-10031
CDS Length  2046
Paired RNAseq reads  1065
Single RNAseq reads  2509
Migratory profiles  Query via corresponding ESTs
Best Bombyx hit  BGIBMGA001365 (5e-125)
Best Drosophila hit  GalNAc-T1, isoform B (4e-87)
Best Human hit  polypeptide N-acetylgalactosaminyltransferase 1 (7e-55)
Best NR hit (blastp)  GE11714 [Drosophila yakuba] (4e-91)
Best NR hit (blastx)  GG20529 [Drosophila erecta] (9e-89)
Gene Ontology terms
GO:0004653 polypeptide N-acetylgalactosaminyltransferase activity
GO:0009312 oligosaccharide biosynthetic process
GO:0005795 Golgi stack
InterPro families
IPR000772 Ricin B lectin
IPR008997 Ricin B-related lectin
Orthology group  ND

Nucleotide sequence:

ATGGGACACTTCACATGGATCGAGGTTCCAGAGAGGGAGAAAAGGAGACGCGGATCAGAC
ATAGCGCCGACGTGGTCACCGACTATGGCGGGTGGTCTGTTCGCCATCAACCGACAGTAC
TACTGGGAGCTGGGAGCGTATGATGAGCAGATGGCTGGGTGGGGGGGCGAGAACCTGGAG
ATGTCGTTCCGGATATGGCAGTGTGGTGGCACGCTGGAGACGGTGCCGTGTTCTCGCGTG
GGTCACGTGTTCCGAGCCTTCCATCCTTATGGGCTGCCAGCTCACACAGACACACACGGC
ATCAACACGGCTCGCATGGCCGAGGTGTGGATGGACGAGTACGCTGAGCTGTTCTACCTG
AACCGACCCGACCTCAGGAAAAGTCCCAAGATCGGTGACGTCACGCACCGTAAGATCCTC
CGGGAAAAGCTGAAGTGTAAGAGCTTCCAGTGGTACCTGGACAACATCTACAAGGAGAAG
TTCGTGCCTGTCAGGGATGTCTTTGGATACGGGAGGTTCATGAATCCGTCCTCGGCGATG
TGTCTCGACACTCTTCAAAGGGAAGGTGAAGCGACAGCGTTGGGTCTGTATCCGTGTCAC
AGTCGCCTGGAGCCCACGCAGCATCTAGCGCTGTCCCTCGCCGGGGAACTCCGGGACGAA
GAGAAGTGCGCCGAAGTTCAATACGAGAGAAGTCCAGTGGGTTCCAACGAGAACGTCAGC
AGGAGAGTGTTGATGGTCACCTGTCACGGAAAACATCGAGGCCAGCACTGGCGATATCTG
CCGACACAACAGATCCAGCATACGGAGAGCGGCCTCTGCCTGCACAGTACAGGCATCTCG
GGGTCTGACGCTCTGGTGATGCGATGCAGAGCTGGCGGCGCGCAGGTGTGGGTCATCGAT
TACAGCGAGATCAATGATTTTAGAATGAACGATAACGAGGTGCCGAGTGAGCAAGAAATG
AAACTAAAAAAGCTGAGAGGCCAGCGCCGTATCTCGCGATCCCTCCTGTCGTACGAGGAC
ACGACCACCAGGCGACACCACAGGAAGAAGAAGCACAAGAAGAAGAACAAGTTCATCCTC
AAGCTGGTGAGGAACGAGACGGAACACCTGGAGGTGGACGTGTACTGCAAACACGCGCGG
CTCTACCCCAACAACAGCTTCGTGCGGGACCTGGTCACCATCCTCAACGACAAAGACATC
AAGGTCATCAACAACGGACGTGTCTTCGAACGGAACAAGGTGTCTGATGTCGTGAACCGG
GAGCACTCCGCCGTGACACTCACCACCGACACCGTCAACACCCCCGACACCCCCGATACC
CGCGACACTCCCGACACCTTCAACCCCTACACCGCCGACACCCCGTCGGACTACCCTCAG
ACGAAACCTCTGAAGAAACGGAAGCGGAAGAAGCACCGACCGGTGGCCGCGAGCCCGGAG
CTCGGAGACGCCGCCAGCAACAAGGAGGAGGAGCAGGCTGGAGACCAGGACGACCTGAAC
AGGCCCGGCTCCAAGAAGATCATCATAGAAGACTTCGTGGAGATCAAACCCATGGGGGCC
TTGGCCAAGCACCTGAAGAAACGGAAACGGGCGAAAAAAACCCGGAGCACGACCGGCGCG
CCGCGGGTGGGCGACTGGGAGGACTCCATGAGGATGGAGAGTTCAGCTGACGACTCGCCC
ATGGTGGAGGCAGACGGCCGACCCGACACAGGGCGACGCGCCCCCAGCAGCTCGCGGGCA
CCTCGGCGCTCGGACCTGCTGCCTCAGAACAGAGCGCTCAACGCACAGGCACAGGAACCG
GTGAAGCTCATCATGAAGAGCAACGTCACCCTCAAGCTGGGAGACGAGTTCTTCACCTGG
ACCAAGAGGGGCCGCACCGCGGACGATGTGGCCGACATCCTCGGCGAGCTCATTATACCG
GACACGGCGAAAATCAGTCTGGAGAACGAGACGAAGGAGAAGGAAACGGAAGGATTTAAG
GGGAACAAGGAGGAGTCGGGGAGGAGGAGAGACTCTAGCTCCTCGGAGAGTGGACAGGCA
GACTGA

Protein sequence:

MGHFTWIEVPEREKRRRGSDIAPTWSPTMAGGLFAINRQYYWELGAYDEQMAGWGGENLE
MSFRIWQCGGTLETVPCSRVGHVFRAFHPYGLPAHTDTHGINTARMAEVWMDEYAELFYL
NRPDLRKSPKIGDVTHRKILREKLKCKSFQWYLDNIYKEKFVPVRDVFGYGRFMNPSSAM
CLDTLQREGEATALGLYPCHSRLEPTQHLALSLAGELRDEEKCAEVQYERSPVGSNENVS
RRVLMVTCHGKHRGQHWRYLPTQQIQHTESGLCLHSTGISGSDALVMRCRAGGAQVWVID
YSEINDFRMNDNEVPSEQEMKLKKLRGQRRISRSLLSYEDTTTRRHHRKKKHKKKNKFIL
KLVRNETEHLEVDVYCKHARLYPNNSFVRDLVTILNDKDIKVINNGRVFERNKVSDVVNR
EHSAVTLTTDTVNTPDTPDTRDTPDTFNPYTADTPSDYPQTKPLKKRKRKKHRPVAASPE
LGDAASNKEEEQAGDQDDLNRPGSKKIIIEDFVEIKPMGALAKHLKKRKRAKKTRSTTGA
PRVGDWEDSMRMESSADDSPMVEADGRPDTGRRAPSSSRAPRRSDLLPQNRALNAQAQEP
VKLIMKSNVTLKLGDEFFTWTKRGRTADDVADILGELIIPDTAKISLENETKEKETEGFK
GNKEESGRRRDSSSSESGQAD
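
The nucleotide and protein sequences above can be checked against each other by translating the CDS codon by codon. The following is a minimal sketch, assuming the standard genetic code applies to this nuclear gene; the names `translate`, `first_line`, and `last_line` are illustrative and not part of the record. For brevity it translates only the first and last lines of the nucleotide sequence rather than the full 2046 bases.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence; '*' marks a stop codon.

    Raises ValueError if the length is not a multiple of 3, and KeyError
    on any codon containing a character other than A, C, G, or T.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First and last lines of the nucleotide sequence in this record.
first_line = "ATGGGACACTTCACATGGATCGAGGTTCCAGAGAGGGAGAAAAGGAGACGCGGATCAGAC"
last_line = "GACTGA"

print(translate(first_line))  # MGHFTWIEVPEREKRRRGSD
print(translate(last_line))   # D*

# A 2046-base CDS holds 682 codons; dropping the stop leaves 681 residues.
cds_length = 2046
expected_protein_length = cds_length // 3 - 1
print(expected_protein_length)  # 681
```

The first 20 translated residues match the start of the protein sequence, the final codon TGA is a stop, and the expected length of 681 residues agrees with the protein listed above (eleven lines of 60 residues plus a final line of 21).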