DPGLEAN01154 in OGS1.0

New model in OGS2.0  DPOGS207250
Genomic Position  scaffold614:- 85930-101923
CDS Length  2841
Paired RNAseq reads  346
Single RNAseq reads  962
Migratory profiles  Query via corresponding ESTs
Best Bombyx hit  BGIBMGA012067 (2e-66)
Best Drosophila hit  multiplexin, isoform E (2e-77)
Best Human hit  collagen alpha-1(XV) chain precursor (1e-45)
Best NR hit (blastp)  PREDICTED: similar to collagen alpha 1(xviii) chain [Tribolium castaneum] (7e-125)
Best NR hit (blastx)  collagen alpha 1(xviii) chain [Aedes aegypti] (2e-107)
Gene Ontology terms
GO:0005581 collagen
GO:0007155 cell adhesion
GO:0005198 structural molecule activity
GO:0031012 extracellular matrix
GO:0005488 binding
GO:0008045 motor axon guidance
InterPro families
IPR016186 C-type lectin-like
IPR016187 C-type lectin fold
IPR010515 Collagenase NC10/endostatin
IPR008160 Collagen triple helix repeat
Orthology group  MCL12955

Nucleotide sequence:

ATGAAATGGATTTGGTTACGTGTTGTAATACTTCTCGTGGTACAGGGTGCCTTTCAAGAG
CTCAAGCTTTATGGCTCTCCATCGCAAGCGGAAGTGCAATGTGTTAATACATACGAGGAA
ATAGGTAAAAGCAGCGAGGGAGATGAAATCGATATTGACAATTTTTTGGTCGACCAGGAG
GAAGGAGATTCAGAGGGTTCAGGGCGGTATGGGACCATACCACCGTTTCCACCACCACCA
CCAGGAATGGATGGGTATCCTTTGCGGCTTCGAGGAGAAAAGGGAGAAAGAGGACCAAGG
GGTCCGCCTGGGGAATCTATACGTGGCCCTCCTGGGCCACCAGGCCCTCCCGGTCCTCCT
GGCGTAAACACCGTCTCGGAAACTTCAGGGTCCGGAGATGACCAAATTTTTGGAGAAAAT
TACGCATCGCTCGGTCATTGTGGATGCAATTCAAGTGTTTTATTATCACTTTTGGAAATT
GCTCCCGAACTTCAAGGCCCTCCAGGTCCTCCTGGTATAACGGGGGCTGACGGATTAACA
GGTGCTCCTGGAATACCTGGACAACCTGGCATGCCTGGAGAGAGAGGTTCAATCGGCCAA
CGAGGCGAAAAGGGTGATAGAGGTGACAGTGGACCACGAGGATCAGAGGGTCAACCAGGT
CCTAAGGGAGAGCCAGGTGTAGATGGAAGACCTGGAAGTCCGGGACCACCAGGTCCACCA
GGAACACCTGGCTCTTCTGATTATAACAACTTTGATGAATCATTACTGGGTTCATACGGT
GGAGCAATTGGAAGACCAGGTGCCCCGGGGCCTAAGGGAGATGCAGGTCAACCTGGACCT
ATAGGACTACAAGGAGAACGGGGATTCCCTGGACCCAAAGGTGAAAGAGGGCAAATAGGA
CAAACAGGAGCCAAAGGTGACCGTGGACATCCTGGCCACAAGGGAGATAGAGGAGTAAAG
GGAGATCGAGGTAATCCAGGACTAGATGGGCGTTCTGGTCTACCGGGAGCCAATGGACGT
TTTGGAGAAAAAGGAGAAAAGGGAGAACGAGGCATACCTGGTCCACCGGGACCGCCATCC
CTACCCATAGGAGTTGTTGCTTCGGAAGAACCGGAATTTCTGGCGACAGGTTTACGACAC
TTAGGACCAGCTGAAAAAGGAGAAAAGGGAGAGAAGGGGAGTCGAGGAAATGACGGAACA
TCAGGTTTTCCTGGAAAAGATGGTAAGCCAGGCGAAAGGGGTGATATAGGTCCCTCTGGT
TTACCAGGTATTGCAGGTCCTCCAGGGAGCCCAGGTCTAAAGGGTGACAGAGGAGAAAGA
GGTCCTCCAGGCCCTGTCAGTTTAACATCTGCCGGCTCTGATATTCTAACAATCAAAGGT
GAAAAAGGTGAACCAGGCTTAAGGGGCCGAAGAGGACGACCTGGTCCACCTGGGCCGCGA
GGAGCCCAAGGACTTCAAGGCTTAGTTGGTCCAACCGGAAAACCGGGCGAAAAAGGTGAC
ATTGGTTTACCTGGTTGGATGGGACGACCAGGAACATTAGGACCACCGGGAATTCCAGGC
CCAGTAGGACCAAAGGGAGAAAAAGGGGACCCTGGAGTGAATATATTAGATGTCTCAATG
GGAGAAAAAGGAGACCGAGGATTAGAGGGGATATCTGGTCCGAAAGGAGAGCAGGGTCCT
ATTGGACCTCCAGGTCCACCTGGTCCCGGTTCTAGATCAGAAGCAGTACAATATATTCCT
GGACCACCAGGGCCTCCAGGACCACCAGGGCAACCTGGAACTCCTGGAATATCTATTGTC
GGACCCAAGGGTGAACCTGGAGTTAGCTACCTAGAAGAATATCCTGTGCATGGAAGCACG
AAATACTTTGGTAGACCAGCCTCTCCAGAATATCGACCTCATCAAGACGAAATGAACGCC
AACAAGAATGTACCAGGCGCTCTGGTATTCCACACTACGGAAGAGATGCTACGGCTTGCG
TCAACAAGTCATCTTGGAGCACTTGCGTATGTGATTGAAGAACAATCCCTTTTTGTAAAG
GTTAACTCAGGCTGGCAGTACGTTTTGTTAGGTTCCCTAGTGACGCAATCAGCTCTCCAT
ACAACAACAACGTCTGCTCCGGCACCACCACCACTACTGCCGGCTGCAAGCCTTGTGCAT
GCACCTTTATCAAACATGGTGGATACGCCTCTAGCTCCCATGGGACCTAGTCTCCGTCTA
GCCGCTCTGAATGAGCCGCTGTCCGGCGACATGCATGGCATACGCCGTGCTGACTATGCC
TGCTACCGACAAGCTCGTCGAGCTGGCCTGAAAGGAACATTCAGGGCCTTTCTTACAAGC
AGAATACAAAACTTAGATTCCACAGTGCGATATGCTGATAGGCATTTGCCAGTTATCAAC
ACTCAGGGTGACGTCCTATTCCAATCATTCTCAGATATTTTTGATGGAAATGGTGGTGTG
ATAGCTGGATCCCCAAGGATATACAGCTTTAGCGGAAAGAATATAATGCTTGATTCAAAC
TGGCCTCAAAAGCTCATCTGGCACGGATCTCATGCGAGCGGAGAACGAGCTCTGGAGACT
TTCTGTGAGGAGTGGCAGAGCGCTGATCCCTCATCCCGTGGCATGGCCGCCTCGTTACAT
TCACACCGACTTTTGTCTCAGGAGAGATATTCCTGTAATAACCACTTTGCAGTATTATGT
ATTGAAGCTACTTCGCACTTGAGTGTTCGAAGAAAACGAGAGATAGCAAGGTACAACATG
TCTTCGGTGAATGACGAGTATCATCCGTACAACGCTGAAGAATATCAAGACTTGTTAAAT
GAGATATTCGGACAACCATAA

Protein sequence:

MKWIWLRVVILLVVQGAFQELKLYGSPSQAEVQCVNTYEEIGKSSEGDEIDIDNFLVDQE
EGDSEGSGRYGTIPPFPPPPPGMDGYPLRLRGEKGERGPRGPPGESIRGPPGPPGPPGPP
GVNTVSETSGSGDDQIFGENYASLGHCGCNSSVLLSLLEIAPELQGPPGPPGITGADGLT
GAPGIPGQPGMPGERGSIGQRGEKGDRGDSGPRGSEGQPGPKGEPGVDGRPGSPGPPGPP
GTPGSSDYNNFDESLLGSYGGAIGRPGAPGPKGDAGQPGPIGLQGERGFPGPKGERGQIG
QTGAKGDRGHPGHKGDRGVKGDRGNPGLDGRSGLPGANGRFGEKGEKGERGIPGPPGPPS
LPIGVVASEEPEFLATGLRHLGPAEKGEKGEKGSRGNDGTSGFPGKDGKPGERGDIGPSG
LPGIAGPPGSPGLKGDRGERGPPGPVSLTSAGSDILTIKGEKGEPGLRGRRGRPGPPGPR
GAQGLQGLVGPTGKPGEKGDIGLPGWMGRPGTLGPPGIPGPVGPKGEKGDPGVNILDVSM
GEKGDRGLEGISGPKGEQGPIGPPGPPGPGSRSEAVQYIPGPPGPPGPPGQPGTPGISIV
GPKGEPGVSYLEEYPVHGSTKYFGRPASPEYRPHQDEMNANKNVPGALVFHTTEEMLRLA
STSHLGALAYVIEEQSLFVKVNSGWQYVLLGSLVTQSALHTTTTSAPAPPPLLPAASLVH
APLSNMVDTPLAPMGPSLRLAALNEPLSGDMHGIRRADYACYRQARRAGLKGTFRAFLTS
RIQNLDSTVRYADRHLPVINTQGDVLFQSFSDIFDGNGGVIAGSPRIYSFSGKNIMLDSN
WPQKLIWHGSHASGERALETFCEEWQSADPSSRGMAASLHSHRLLSQERYSCNNHFAVLC
IEATSHLSVRRKREIARYNMSSVNDEYHPYNAEEYQDLLNEIFGQP