DPGLEAN10014 in OGS1.0

New model in OGS2.0  DPOGS204937
Genomic Position  scaffold254:- 733-5086
CDS Length  1602
Paired RNAseq reads  1235
Single RNAseq reads  3396
Migratory profiles  Query via corresponding ESTs
Best Bombyx hit  BGIBMGA005279 (0.0)
Best Drosophila hit  CG30463, isoform D (0.0)
Best Human hit  polypeptide N-acetylgalactosaminyltransferase 1 (7e-120)
Best NR hit (blastp)  UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase, putative [Pediculus humanus corporis] (0.0)
Best NR hit (blastx)  UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase, putative [Pediculus humanus corporis] (0.0)
GeneOntology terms  GO:0004653 polypeptide N-acetylgalactosaminyltransferase activity
InterPro families
IPR000772 Ricin B lectin
IPR008997 Ricin B-related lectin
IPR001173 Glycosyl transferase, family 2
Orthology group  MCL16904

Nucleotide sequence:

ATGAGTGAGGATGCAAAACTAGCGGTGTCCGAGGGATGGAAAAAAAACGCCTTCAATCAG
TACGCCAGCGACTTAATATCGATTAGACGGACACTACCCGACCCTAGGGACGAGTGGTGT
AAGCAACCTGGTCGCTACTTAGAAGATCTTCCGCAAACATCTGTTGTTATATGTTTTCAC
AATGAGGCGTGGTCAGTGCTTCTCAGAACTGTTCATTCAGTTATAGACAGATCTCCTGCA
CATTTAATAAAAGAGATTATTCTCGTCGACGACTTTTCTGATATGCCACACTTGATGCAA
CAACTCGATGATTATATGTCCTCGTTGCCGAAGGTTAGGATAGTGAGAGCAACTCAACGC
GAGGGTCTCATTAGAGCGAGGCTGCTAGGTGCTAAGTACGTCACAGCGCCAGTACTAACA
TATCTCGATAGCCATTGCGAATGTACTGAGGGTTGGCTAGAACCACTTTTAGATAGAATT
GCTCGCAACAAAACAAATGTAGTGTGTCCTGTTATCGACGTAATTGACGACAATACACTC
GAGTATCACTACCGAGATTCAACTTCAGTAAACGTTGGTGGTTTTGACTGGAATCTACAA
TTCAATTGGCACCCGGTACCCGCAAGGGAGCGAGCTCGACACAAACATACCGCTGAACCA
GTATGGTCTCCAACTATGGCTGGTGGTCTCTTTGCTATAGATAAAGAGTTTTTTGAAAGA
CTTGGAACTTACGACAGTGGATTTGACATATGGGGTGGAGAGAATTTGGAACTGTCATTC
AAAACGTGGATGTGCGGTGGCACTCTCGAAATAGTTCCTTGCTCACATGTCGGTCATATA
TTTAGAAAACGATCGCCATACAAGTGGAGGACCGGAGTTAATGTTCTCAAGAAAAATTCC
GTCCGATTAGCTGAAGTGTGGTTGGACGATTATTCAAAATATTATTATCAGCGGGTTGGC
AACGACAAGGGTGACTACGGTGATATTAGTGGTAGGAAGGAATTAAGAGAAAAACTTAAG
TGTAAATCATTCGATTGGTACTTAAAGAACATTTACCCAGAACTGTTCATACCGGGAGAA
TCCGTAGCCCACGGAGAGATTCGAAATATCGGCTTCGAAAGGACATGTCTGGACTCTCCG
ACGCGGAAGTCCGATCATCATAAGCCAGTAGGACTATACCCGTGTCATCGGCAAGGAGGA
AATCAGATCGCGAATCCCTCTTCGGATATGTGCGTGGACTCGGCTGCTGGACCAGAAGAC
ATGAAGAAGCCAGTCAACCCTTGGCCTTGTCATGGAGAATATGGCAATCAGTACTGGATG
TATTCGAAGAATGGCGAGATCCGTCGCGATGAGACTTGCCTCGACTATTCGGGTCACGAT
GTTGTTTTGTACCCCTGTCATGGGGCCAAGGGTAATCAATTATGGCTGTATGACCCCACT
ACGAAGCTAATAAAACATGGCTCAAGTGAAAAATGTATGGCGATATCGCGGAAGAAGGAC
AAGATTGTAATGGAAACGTGCAACGAAAGGGAGAATAGGCAACAGTGGAATATGGAAAAC
TTTAATGCTGACAGACTCAGTCCCGAACTGACGGCTGAGTAG

Protein sequence:

MSEDAKLAVSEGWKKNAFNQYASDLISIRRTLPDPRDEWCKQPGRYLEDLPQTSVVICFH
NEAWSVLLRTVHSVIDRSPAHLIKEIILVDDFSDMPHLMQQLDDYMSSLPKVRIVRATQR
EGLIRARLLGAKYVTAPVLTYLDSHCECTEGWLEPLLDRIARNKTNVVCPVIDVIDDNTL
EYHYRDSTSVNVGGFDWNLQFNWHPVPARERARHKHTAEPVWSPTMAGGLFAIDKEFFER
LGTYDSGFDIWGGENLELSFKTWMCGGTLEIVPCSHVGHIFRKRSPYKWRTGVNVLKKNS
VRLAEVWLDDYSKYYYQRVGNDKGDYGDISGRKELREKLKCKSFDWYLKNIYPELFIPGE
SVAHGEIRNIGFERTCLDSPTRKSDHHKPVGLYPCHRQGGNQIANPSSDMCVDSAAGPED
MKKPVNPWPCHGEYGNQYWMYSKNGEIRRDETCLDYSGHDVVLYPCHGAKGNQLWLYDPT
TKLIKHGSSEKCMAISRKKDKIVMETCNERENRQQWNMENFNADRLSPELTAE
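
The CDS length, nucleotide sequence, and protein sequence in this record can be cross-checked against each other. The following is a minimal sketch, assuming the standard genetic code applies to this nuclear gene; the names `CODON_TABLE`, `translate`, and `expected_protein_length` are hypothetical helpers introduced here, not part of the database. Only the first and last lines of the nucleotide sequence are used as examples.

```python
from itertools import product

# Standard genetic code (assumption: NCBI translation table 1),
# laid out in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate an in-frame DNA string; stop codons are written as '*'."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def expected_protein_length(cds_length: int) -> int:
    """Residue count for a CDS whose length includes one stop codon."""
    return cds_length // 3 - 1


# First and last lines of the nucleotide sequence in this record.
first_line = "ATGAGTGAGGATGCAAAACTAGCGGTGTCCGAGGGATGGAAAAAAAACGCCTTCAATCAG"
last_line = "TTTAATGCTGACAGACTCAGTCCCGAACTGACGGCTGAGTAG"

print(translate(first_line))          # start of the protein sequence
print(translate(last_line))           # end of the protein, then the stop
print(expected_protein_length(1602))  # residues implied by CDS Length
```

Because every full nucleotide line holds 60 bases (20 codons), each line stays in frame and can be translated on its own. Pasting the complete nucleotide sequence into `translate` and stripping the trailing `*` should reproduce the protein sequence above, provided the standard-code assumption holds.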