DPGLEAN12975 in OGS1.0

New model in OGS2.0  DPOGS210595
Genomic Position  scaffold554:+ 50024-57003
CDS Length  3330
Paired RNAseq reads  4136
Single RNAseq reads  9357
Migratory profiles  Query via corresponding ESTs
Best Bombyx hit  BGIBMGA013544 (7e-60)
Best Drosophila hit  CG1371 (9e-146)
Best Human hit  nodal modulator 2 isoform 1 (3e-124)
Best NR hit (blastp)  PREDICTED: similar to conserved hypothetical protein [Nasonia vitripennis] (0.0)
Best NR hit (blastx)  GK20961 [Drosophila willistoni] (3e-165)
GeneOntology terms  GO:0030246 carbohydrate binding
InterPro families
IPR013784 Carbohydrate-binding-like fold
IPR008969 Carboxypeptidase-like, regulatory domain
IPR008970 Collagen-binding surface protein Cna, B-type domain
IPR013783 Immunoglobulin-like fold
IPR014766 Carboxypeptidase, regulatory domain
Orthology group  MCL11556

Nucleotide sequence:

ATGTTCGAGGGAAATAGTTTTTATGTTCATTTAATTTTCTTATTATCATTAATATCTTCA
AGTTACAGTAATGATATCTTAGGATGTGGTGGTTTCGTAAAAAGTCATGCCAGCATAGAC
TTTTCTAAAATCGAAATCGGCTTATATACTAGAGACGGTAGTCTCAAAGAGAAGACTGAG
TGCGCACCCACAAACGGTTACTATTTCTTACCTCTATACGAAAAGGGTGAATATGTTTTA
AAAGTTCATCCGCCCGCTGGCTGGAGTTTTGAGCCATCGCAGGTAGAATTAGACATAGAC
GGTGTGACCGATCAGTGTTCGATCGGCCAAGACATAAACTTCGCTTTCAACGGATTCGGT
ATAACCGGCAAGGTGATCACCGCCGGTCAGGTCAGCGGCCCCAGTGGTATCAATATACAG
CTTGTGAATGAGAAAGGAGAGACCAGAAACACAGTTACAACATCTGGCGGAGATTTCCAT
TTCACCCCTGTGATACCTGGAAAGTATGTTGTTAAAGCGTCTCACCCTCGATGGAAGTTG
GAGCCTGCACATACTGTTGTGCAAGTGAAGGAAGGTAACACCGCTTTGCCTGTGGGGGTT
TTAGCCGTTAAAGGCTATGATGTTTCGGGTTCCGCGACATCATTCGGCAGCCCCTTGGGT
GGAGTTCATGTACTACTTTACTCCAAAGAGGAGAAACCTAAGTTCCGCGTGGAGGGTTGC
AAGACTGCACTTCTTCAAGGTGTTCCGGATGCTCCTATTTGTTATTCAGTCACTGATGCT
AATGGGGAGTTTAAGTTTGGTCTCGTGCCAGCTGGAGAGTACAAACTACTAGCTTTGGCC
AAGACACCGGGGCAGACCTTCCTCACATACAACATCAAGCCTGATTCGGTGCCGTTCAGT
GTACTTCATGATAGCTTGTATATTAGAAATGCTTTTGAGGTTATGGGATTCTCAATCGTG
GGTTCCGCTCTGTCAGCTCCGGGTGGTAGTGGTATTGCAGGAGCTCAGGTGCTGTTGGCG
GGACAAGCTGTCACCACCACCGACAAGAAGGGGCACTTCACACTCAGTGGACTGAAGCCA
GGGGAATACTCACTGACCTTACAACATGAGCACTGCAGCTGGGAGGAGAAGCAGCTGTCT
GTGAGTGCGAGCGGTGTGGGGAGTCCCTTGACGGTGGTGGCGTCACGCTGGAAGGTGTGC
GGGTCGTTGACCCCACCCGAATCTCGCATCGTGCAGCTGCGTGGACCGAAGGACGAGGAC
CTCACAACTAAAGCTGATGGTTCCTGGTGTTCCCTGCTGCCCCCGGGCTCGTACTCCGCG
CGCGTGTCCGTCACGGAGCAGGAGCAGAGGGACGGCCTCCAGTTCTATCCGGAGGTGCAG
CACGTGTCCGTGGGCGGGGCGCCCGTCGGCGGGGTCTCCTTCAGCCAAGTGCGAGCGCGG
GTGAGAGGCTCCGTGAACTGCGCCCCGTACTGCCGCGGCCTAAGAGTGGCGCTGCGCCCC
CTCACAGCCGACGGGACTTACGCGGGCCCGCCACGCTACGCGAACATCGTCGACGGAGCG
TACACGTTCGAGGAGGTGGTCCCTGGCAGCGTGGAGGTGTCAGTGGTGGAGGGCGGCGCG
GGCGAGGCGCGGCTGTGCTGGAGGCAGGCCGCGCACAACGTGGTGGTGGCGCAGGACCTG
CCGCCCGTCACCGAGTTCACACTCACCGGCCTCGGCCTCGTCATCACCGCCTCGCATGAC
ATGGAGGTGGAGTACACGAGCGTCCACTCCTCGGGCGTGGTGAAGGTGTCTGCGGGCCGC
AGCCTGGTGTGTGTGCCGCCCGCCCCTCGCTACACGCTCACCGCCCGCGGCTGTCACCGC
GTCTCGCCGCCCACCGTGGACGTCGACATGCAGGGAACGGACATGCCGAGCGCGTCTTTC
AAGGCGACGGCGCACGCCTCCACCATCACGATCTCGTCTCCGGAGCGCGCCACGGACGTG
AGGTTGCACGTCACCACGGACGGCGGCCCCGCCACCGTGGACCTGCAGCCCGAGGCTCAC
GGCGACGGGTTCCTCTACACCCACACCATGTACCTGGCCGAGGGAGAGGTGGCGTCCGTG
CTGATGGAGTCGTCGACCCTGCTGTCGGTCCCGGGCGGGCGGCAGGATGTGGTGGGGGCG
GCGAGCTGCTCCAGGGCGCTCGCCCTCAGGGCGGTTCGAGCCAGGAAAGTCACGGGCCGA
GTCGTTCCGCCAGTAGAAGGTGTCACCATCACTCTGCAGGGAGGTGACGTGAAGCTGTCT
CAGGTGACCAAAGCCGACGGCCTCTACAGCTTCGGTCCCCTGGACGCGTCCGTGTCGTAC
AGCGTCACGGCCGAGAAGGAGTCGTACGTGTTCAGTGAGGTGGAGCCCTCGGGAGACGTG
CGCGCTCACCGCCTGGCCGAGATACAAGTACAGCTCGTCGACGACAGCAACAACCAGCCG
CTAGAGGGGGCGCTGGTGTCCATCTCCGGGGGCTCGTTCCGTCTGAACGCGCTGTCGGCG
GCGGGGCGGGTGGCGGCGCGCTCGCTGGCCCCGGCCTCGTACTACGTGAAGCCACACATG
AAGGAGTACCGCTTCCAGCCGCCTCACACGCTGCTGGACGTGGCGGACGGACAGACACAC
ACGCTCACCTTCAGAGGTGTTCGCGTGGCGTGGTCAGCCGTGGGGCGCGCCGTGTGTGTG
GGCGGCTCGGGGGTCCCCGGGCTGGCCCTCCGCGCTGTGGGGGACTCCGACTGTCACACG
CAGGACGCCGTCTGTGATCAGGACGGATACTTCCGCATCCGCGGCCTGCTGCCCGGTTGT
ACGTATTCCATCCAGCTGAAGGAGTCCTCGGAGCCGGCGCGTCTGGCGGACACGCCGCTC
GTCATAAAGATGACGGAGAGTGACGTGCTGGACCTGCGGGTGATCGTGATCCGGCCCCAC
CAGGTGTCGGACACGCTGGTGCTGGTGCGCTGCTCCAACCCCGACCACTACAAGACCCTC
CGCCTGACCCTCAGCCGCGAGTCCTCCTCGCCCGTGTTCTCCACCAAGCTGGACCCGGCG
GGCTACTCCCAGGTCAACAACCCCGGCCTGCTGTACCCGCTGCCTCGCCTGCCGGCCGAT
AATAACTCTTATGTGGTGTCGTTGGAGTCTACCCTGTCCAAGGTGACTCACTCCTATGAA
GAGCCGGTGCACTACTTCGTGTCGGACGGACGCTTCCGCTACTTCGAGATCAATTTTGAT
CCCAAGGCGAGTCTCTATATATATTATGGTATCACTAACTGTCGTCACACTCGAGCCATC
CATCTCTCAACAAATATAATGAGGAAGTAG

Protein sequence:

MFEGNSFYVHLIFLLSLISSSYSNDILGCGGFVKSHASIDFSKIEIGLYTRDGSLKEKTE
CAPTNGYYFLPLYEKGEYVLKVHPPAGWSFEPSQVELDIDGVTDQCSIGQDINFAFNGFG
ITGKVITAGQVSGPSGINIQLVNEKGETRNTVTTSGGDFHFTPVIPGKYVVKASHPRWKL
EPAHTVVQVKEGNTALPVGVLAVKGYDVSGSATSFGSPLGGVHVLLYSKEEKPKFRVEGC
KTALLQGVPDAPICYSVTDANGEFKFGLVPAGEYKLLALAKTPGQTFLTYNIKPDSVPFS
VLHDSLYIRNAFEVMGFSIVGSALSAPGGSGIAGAQVLLAGQAVTTTDKKGHFTLSGLKP
GEYSLTLQHEHCSWEEKQLSVSASGVGSPLTVVASRWKVCGSLTPPESRIVQLRGPKDED
LTTKADGSWCSLLPPGSYSARVSVTEQEQRDGLQFYPEVQHVSVGGAPVGGVSFSQVRAR
VRGSVNCAPYCRGLRVALRPLTADGTYAGPPRYANIVDGAYTFEEVVPGSVEVSVVEGGA
GEARLCWRQAAHNVVVAQDLPPVTEFTLTGLGLVITASHDMEVEYTSVHSSGVVKVSAGR
SLVCVPPAPRYTLTARGCHRVSPPTVDVDMQGTDMPSASFKATAHASTITISSPERATDV
RLHVTTDGGPATVDLQPEAHGDGFLYTHTMYLAEGEVASVLMESSTLLSVPGGRQDVVGA
ASCSRALALRAVRARKVTGRVVPPVEGVTITLQGGDVKLSQVTKADGLYSFGPLDASVSY
SVTAEKESYVFSEVEPSGDVRAHRLAEIQVQLVDDSNNQPLEGALVSISGGSFRLNALSA
AGRVAARSLAPASYYVKPHMKEYRFQPPHTLLDVADGQTHTLTFRGVRVAWSAVGRAVCV
GGSGVPGLALRAVGDSDCHTQDAVCDQDGYFRIRGLLPGCTYSIQLKESSEPARLADTPL
VIKMTESDVLDLRVIVIRPHQVSDTLVLVRCSNPDHYKTLRLTLSRESSSPVFSTKLDPA
GYSQVNNPGLLYPLPRLPADNNSYVVSLESTLSKVTHSYEEPVHYFVSDGRFRYFEINFD
PKASLYIYYGITNCRHTRAIHLSTNIMRK
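
Consistency check (sketch):

The nucleotide and protein listings above can be cross-checked by translating the CDS and comparing the result with the protein sequence. The following is a minimal sketch, not part of the original record. It assumes the standard genetic code and an unambiguous sequence (only A, C, G, T); the names `translate` and `check_record` are hypothetical helpers introduced here for illustration.

```python
from itertools import product

# Standard genetic code, laid out in TCAG order (assumption: the record
# uses the standard nuclear code, which is usual for insect nuclear genes).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq):
    """Remove line breaks and spaces from a pasted listing and upper-case it."""
    return "".join(seq.split()).upper()


def translate(cds):
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    cds = clean(cds)
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def check_record(cds, protein):
    """Return True if the CDS ends in a stop codon and encodes the protein."""
    translated = translate(cds)
    return translated.endswith("*") and translated[:-1] == clean(protein)


# The first seven codons of the nucleotide listing above.
print(translate("ATGTTCGAGGGAAATAGTTTT"))  # MFEGNSF
```

Passing the full nucleotide and protein listings to `check_record` performs the whole comparison. As a quick arithmetic check, a CDS length of 3330 corresponds to 1110 codons, that is, 1109 residues plus one stop codon.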