DPGLEAN01805 in OGS1.0

New model in OGS2.0  DPOGS210743
Genomic Position  scaffold2621:- 3397-21173
CDS Length  4614
Paired RNAseq reads  484
Single RNAseq reads  1060
Migratory profiles  Query via corresponding ESTs
Best Bombyx hit  BGIBMGA006274 (1e-06)
Best Drosophila hit  collagen type IV, isoform A (4e-10)
Best Human hit  collagen alpha-2(I) chain precursor (6e-47)
Best NR hit (blastp)  collagen alpha-2 precursor, putative [Pediculus humanus corporis] (0.0)
Best NR hit (blastx)  collagen alpha-2 precursor, putative [Pediculus humanus corporis] (3e-117)
GeneOntology terms
GO:0005576 extracellular region
GO:0005578 proteinaceous extracellular matrix
GO:0005201 extracellular matrix structural constituent
GO:0005581 collagen
GO:0005515 protein binding
GO:0005584 collagen type I
GO:0048407 platelet-derived growth factor binding
GO:0046332 SMAD binding
GO:0070208 protein heterotrimerization
InterPro families
  
IPR000885 Fibrillar collagen, C-terminal
IPR008160 Collagen triple helix repeat
Orthology group  MCL10982
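
The Genomic Position field above packs scaffold, strand, and coordinates into a single string. The following is a minimal sketch of how such a string might be split into its parts. The names `GenomicPosition` and `parse_position` are hypothetical helpers, not part of any database API. The span calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re
from typing import NamedTuple


class GenomicPosition(NamedTuple):
    """Hypothetical container for a 'scaffold:strand start-end' field."""
    scaffold: str
    strand: str
    start: int
    end: int

    @property
    def span(self) -> int:
        # Assumption: coordinates are 1-based and inclusive.
        return self.end - self.start + 1


def parse_position(text: str) -> GenomicPosition:
    """Split a string such as 'scaffold2621:- 3397-21173' into its fields."""
    match = re.fullmatch(r"\s*(\S+):([+-])\s*(\d+)-(\d+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised genomic position: {text!r}")
    scaffold, strand, start, end = match.groups()
    return GenomicPosition(scaffold, strand, int(start), int(end))


pos = parse_position("scaffold2621:- 3397-21173")
print(pos.scaffold, pos.strand, pos.start, pos.end, pos.span)
```

Under the inclusive-coordinate assumption the gene model spans 17777 bp on the minus strand, considerably more than the 4614 nt CDS, as would be expected if the model contains introns.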

Nucleotide sequence:

ATGAATGTCAGAACCGAGCGGAGAATTGGTTCAGTGGAGTATCTACATCAGTTGAGACAA
TTGGTGTGTCCTTCGAAGGGCGATCTATTCGGGCGCTATTGTGGCGCGCGGGCGGGCCGC
GCTACTTACGCACAACACGTAGGTAAAATGTGGCAAGTGTCAGTAGTGTGGCTGGCGGTG
CTGGCGCTGCCAGTGGCTTGCAAGGACCTGGTAGAGGTCCGAGAGGAACTGCCCCAAGCT
CAAGAACCAGATCTCGTGAATTATGATGATTATAATCCGTTTGGAGAAGATGAAAACGCT
ATACTTGCAGAAAGAGACTTCACTCAATCGACCACTGCAACTCCGACTCGACGTCCTCCC
GCGCCGCCCACGAGAGCGCAATCCACCAGATGTTTCGACATAGGACGCTCGTTCTCTGAA
GGGGAAACCTGGAAACGAGACAACTGCACGCTCTGTCAGTGTTATAACACTCGTGTCAAT
TGCTCTGTGCTGCCATCTTGTCTTTTAAGCCCGGGCACGATCCGTCCAACACCATTGGTA
CCGCCGCAATCTCCACCGGTTCGTGGCATGGCTGGGGAGGCTGGGGCCGACGGACCCCCC
GGACCTCCAGGGACCCCTGGCGTTAATGGAGTCCCTGGAGCACCTGGCATTCCAGGACCG
GTCGGACCAATGCCTGATGTAAACGCCTACCTGGCTCAGTTAGCAGCTGCCGCCGGTGGT
GACAAAGGTCCTGCCTTCGATCCATATCACTACATGCAAGCACCTGTGGGAGTTCCAGGG
GTTAGGGGAATTCCAGGGCCCCAGGGTCCTCCGGGACCTCAAGGATTTCAAGGCCCCCGA
GGTGAACCCGGTGAGCCTGGATTGCCTGGTCCACCAGGAGTTGCTGGTGAGAGAGGTCCT
CAAGGGCTACCCGGAAAAGATGGATCACCGGGAGAAGATGGGGAGCCTGGTCCAGCCGGT
GCTATCGGCCAGCCAGGCCAGAGAGGAAGCCCTGGTATACCTGGTATTCAAGGATTGAAG
GGACATCGTGGCTTTGACGGCAAGGACGGTGCTAAGGGCGAACAAGGTTTGCCTGGAGAA
AAGGGCCCTACTGGCCTCCCTGGACCTATGGGCCCAAATGGACCCGCGGGTCCTGTCGGA
CCGAGAGGCGAAAGGGGCCGTGAAGGCCCTGCTGGAGTTCCGGGTATAAGAGGAGTTGAC
GGAGCTCCAGGATCACCGGGACTAATGGGTAATATTGGAAAGCCTGGCGCACCTGGTTTT
CCTGGATCCCCCGGCTTGAAGGGCGAACAAGGAGCAGTCGGCCCAAAAGGCAGCCAAGGA
ATGCAAGGTCCACGAGGGGAGCCAGGACGTAATGGACAGTCAGGGGAAGTGGGTCCCCCG
GGTCTAGCTGGTAGAGATGGCACACCGGGAGAAAAAGGGGTACAAGGACAAACAGGCGCC
GCCGGACCTCAAGGATTTCCGGGACCTCGTGGTACGCCTGGAGTTGCCGGTGATCCTGGG
TTGCCAGGGACGAAGGGAATGCCAGGCCAGCCGGGGGATAGAGGACCTAAGGGTGAAATT
GGCTTAAGAGGTGAAATTGGCGCACCAGGTCCGCGTGGTATGCCAGGTGGCATTGGACCT
GAGGGTAAAAGGGGAAAAAGAGGTTTGAGAGGACCACCGGGAAACTTAGGACCGCAAGGA
GATAGAGGGTTACAAGGATTAAGAGGTTTGAATGGCGCTGACGGTCCAATGGGTCCAAAA
GGTCAAACCGGGGAAAGAGGCGCTGTTGGTCTCCCAGGTCCAAAAGGCAGTACCGGCGAT
ATTGGCAGACCTGGACCTCAAGGACTTCTCGGCCCCAGAGGGTTTATGGGCAGACCAGGC
ATTCCAGGTAAATCTGGTCAACCTGGCGAAAGAGGTATACCTGGTGCAGATGGTAGGCCT
GGTGAACAGGGACTTCAAGGACTTCAAGGTCCACCTGGCTTAATTGGAAGTCCTGGCGAA
CGTGGATTGCAGGGTGAACATGGTAAAGATGGAGATGTTGGTCAACCTGGAGCACCTGGT
GCTAAAGGAGACGCCGGAAGAGATGGAAGTCCAGGGAGTCAAGGACCGCAAGGCCCCGTT
GGTGCAACTGGGGAACGCGGTCCCATAGGGCCTGCTGGACCGACTGGTTTTCCTGGTTTA
CCTGGAGCACCAGGATCACCAGGTCCTGCAGGTAAAGACGGAGAACCAGGTGTGTCTGGA
CCTGCCGGACCACCAGGAGCTACAGGACAAAGAGGTGAAAGAGGTTTCTCAGGAGAAAGA
GGCAGTCCAGGTCTACCAGGTGTTGCAGGAGAAAAGGGTGAAGCAGGTGCTCAGGGTCTT
GATGGTCCACCGGGTGCGGAAGGACCTCGTGGAAGTAAGGGACATCCAGGACCAATGGGC
ACGATGGGATTACCAGGACTACGTGGTATGTCCGGTTTGCCAGGTGAAAAGGGGGAGAGA
GGATCTTCAGGACCACAGGGGCCAGAGGGACCAGCGGGTCGCCAAGGAGAGCAAGGACCT
CAAGGTCCTATAGGTCCTGCTGGACCACCTGGAGAGCCGGCCGAAAGAGGTGAACCTGGA
ACACCAGGGATGCCAGGAGAATCAGGGGCACCGGGGTCTACAGGAGAGCGGGGTCACCCT
GGTCCTCAGGGCAATAATGGTTTACCTGGTCCTCCAGGATTAACAGGCATGCCCGGCTTG
AAAGGGGATAGAGGCTATGCAGGACCCAAAGGTCAACAAGGAGCTCAAGGAAGTCCAGGG
ATTTCTGGTGAACCGGGGCAACGAGGGCTTCCAGGGCAACCTGGAGCTAAGGGTGCTCAT
GGAGAGCAAGGACATAAAGGTGAAATTGGACGTGCTGGCTTACCGGGAAGACCAGGGGAC
ATGGGGCCGCAGGGTCCTCAAGGAAGCCCGGGTCCTACTGGGGCTCCTGGGTTACCAGGT
GCCAAAGGGTCGACTGGAGATACTGGACGGTCCGGACCACCTGGACCTCAAGGGTTAATT
GGACCACAAGGACCAGAAGGTCCTAAAGGCGAAAGAGGTGCTGAAGGAGAAACTGGGCCA
CAGGGACAGCAGGGTATCCCTGGACAAGCTGGAGAAAGAGGACCAAGTGGATTACCAGGT
TTAACTGGGGCACCAGGACCACAAGGTCTGAGAGGAGTGCAGGGAGAAAGTGGAGTTCCG
GGAAAACCAGGAGCAGACGGTGCCCCAGGTCCTATAGGCGCTCCTGGACCACAAGGGATG
ACTGGTCCAATGGGGGAACCTGGACCAGAAGGTCGTCCGGGCAAGCTGGGTCAGCCTGGT
ATTCCAGGACGGCAGGGTGAAAAAGGTCCTATGGGACAACCAGGACAACCAGGTCCACCA
GGTTCTCCAGGCGTGCAGGGACCTCCAGGTTCCTCCGGACCTCCTGGAGCAACTGGAGAA
AGAGGACCTAGAGGAGAAAGTGGTTCTCCTGGCATAGAAGGGCCTCAAGGTCCATCTGGC
AAGCAGGGCCCGCCTGGTATGGATGGTATTAAGGGTGAGCGAGGAGAAAACGGCGCTGAT
GGACCTAAAGGTCACGCGGGATTGCCAGGACTCCCGGGTTTGATAGGTACTCCAGGAAGA
CAAGGTGACAGAGGTTTACCAGGTGCCATAGGACCGCCAGGAAAGGATGGAGACGCTGGA
CCAAGAGGACCACCTGGCCGTGATGGTAGTCCGGGTCCCCAAGGGCCATTAGGTCCCCCG
GGAGGTCGTGGTCCTCCAGGAGAGCCTGGACGTCATGGAACACCTGGACCCGCTGGACCT
CCTGGACCACCAGGACCTCCTGGAGAAGGGTTGGCATATGATGCTGCTGCAATTGCTGCT
ATGCTTCAACAAGGAACGATGAAAGGTCCAGATCCGATGGGAGATGATCCGAACATAATG
CCACCAAGGTTCTTCAAAGAAGATATGTCCCCAGAAGAGAGGAAAAGTATTGTAATGAAA
GCGTACGAAAGACTCCAGGTTTCTCTGGACAAATTCTTAAAACCTGATGGCTCTAAGGAA
GCACCCGCCAAGACATGTGGGGATATCAAATACCATCATCCTCATTTCGAAAGTGGTCAG
TACTGGATAGATCCCAACGGTGGTGATATCAAGGATGCGATCTTGGTGCACTGCGATTTA
AGTACTGGCGCTAGCTGTGTATTTCCGAAGCCCATGATGTCGGAGGAACTCGTCCACTCC
GAAAGAAACGAGGCGTGGTTAAGTGAGATGGATAACGGGTTCGCTATATCTTACAAAGCG
GAGCACAGTCAGCTGACTTACCTACAACTGCTATCAGTGAAAGCGGTACAGAATGTCACA
CTTCATTGTCGAAATATTGTTGGCTACTATGACCCAGCCACTAGGAACTACAAACATGGC
CTGAAACTACTGGCTTATAATGATGCCGAGATTCTTCCTAAGGCCAATAACAGACTGCGA
TATAAGGCTTTAATAGATGAATGTCAGTTTAAATCTCAAGATTGGTCCAAGACGATAGTC
CAGTATGAGACGGACAAGCCTGGGCGGTTGCCAGTTCTGGATGTGGCTGTAAGAGACCTG
CCCAGAACCGACCAGGCCTTCAGGATTGAACTGGGACTGGCGTGCTTTACTTAA

Protein sequence:

MNVRTERRIGSVEYLHQLRQLVCPSKGDLFGRYCGARAGRATYAQHVGKMWQVSVVWLAV
LALPVACKDLVEVREELPQAQEPDLVNYDDYNPFGEDENAILAERDFTQSTTATPTRRPP
APPTRAQSTRCFDIGRSFSEGETWKRDNCTLCQCYNTRVNCSVLPSCLLSPGTIRPTPLV
PPQSPPVRGMAGEAGADGPPGPPGTPGVNGVPGAPGIPGPVGPMPDVNAYLAQLAAAAGG
DKGPAFDPYHYMQAPVGVPGVRGIPGPQGPPGPQGFQGPRGEPGEPGLPGPPGVAGERGP
QGLPGKDGSPGEDGEPGPAGAIGQPGQRGSPGIPGIQGLKGHRGFDGKDGAKGEQGLPGE
KGPTGLPGPMGPNGPAGPVGPRGERGREGPAGVPGIRGVDGAPGSPGLMGNIGKPGAPGF
PGSPGLKGEQGAVGPKGSQGMQGPRGEPGRNGQSGEVGPPGLAGRDGTPGEKGVQGQTGA
AGPQGFPGPRGTPGVAGDPGLPGTKGMPGQPGDRGPKGEIGLRGEIGAPGPRGMPGGIGP
EGKRGKRGLRGPPGNLGPQGDRGLQGLRGLNGADGPMGPKGQTGERGAVGLPGPKGSTGD
IGRPGPQGLLGPRGFMGRPGIPGKSGQPGERGIPGADGRPGEQGLQGLQGPPGLIGSPGE
RGLQGEHGKDGDVGQPGAPGAKGDAGRDGSPGSQGPQGPVGATGERGPIGPAGPTGFPGL
PGAPGSPGPAGKDGEPGVSGPAGPPGATGQRGERGFSGERGSPGLPGVAGEKGEAGAQGL
DGPPGAEGPRGSKGHPGPMGTMGLPGLRGMSGLPGEKGERGSSGPQGPEGPAGRQGEQGP
QGPIGPAGPPGEPAERGEPGTPGMPGESGAPGSTGERGHPGPQGNNGLPGPPGLTGMPGL
KGDRGYAGPKGQQGAQGSPGISGEPGQRGLPGQPGAKGAHGEQGHKGEIGRAGLPGRPGD
MGPQGPQGSPGPTGAPGLPGAKGSTGDTGRSGPPGPQGLIGPQGPEGPKGERGAEGETGP
QGQQGIPGQAGERGPSGLPGLTGAPGPQGLRGVQGESGVPGKPGADGAPGPIGAPGPQGM
TGPMGEPGPEGRPGKLGQPGIPGRQGEKGPMGQPGQPGPPGSPGVQGPPGSSGPPGATGE
RGPRGESGSPGIEGPQGPSGKQGPPGMDGIKGERGENGADGPKGHAGLPGLPGLIGTPGR
QGDRGLPGAIGPPGKDGDAGPRGPPGRDGSPGPQGPLGPPGGRGPPGEPGRHGTPGPAGP
PGPPGPPGEGLAYDAAAIAAMLQQGTMKGPDPMGDDPNIMPPRFFKEDMSPEERKSIVMK
AYERLQVSLDKFLKPDGSKEAPAKTCGDIKYHHPHFESGQYWIDPNGGDIKDAILVHCDL
STGASCVFPKPMMSEELVHSERNEAWLSEMDNGFAISYKAEHSQLTYLQLLSVKAVQNVT
LHCRNIVGYYDPATRNYKHGLKLLAYNDAEILPKANNRLRYKALIDECQFKSQDWSKTIV
QYETDKPGRLPVLDVAVRDLPRTDQAFRIELGLACFT
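
The nucleotide and protein sequences above can be checked against each other by translating the CDS. The sketch below assumes the standard genetic code (NCBI translation table 1) and uses only the first and last lines of the nucleotide sequence as input. The helper `translate` is a hypothetical name introduced here, not a function from the source database. A CDS length of 4614 nt corresponds to 1538 codons, i.e. 1537 residues plus the terminal stop codon.

```python
# Standard genetic code (assumed: NCBI translation table 1),
# built in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence, dropping one trailing stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    return protein[:-1] if protein.endswith("*") else protein


# First 60 nt and final 54 nt of the CDS listed under "Nucleotide sequence".
cds_start = "ATGAATGTCAGAACCGAGCGGAGAATTGGTTCAGTGGAGTATCTACATCAGTTGAGACAA"
cds_end = "CCCAGAACCGACCAGGCCTTCAGGATTGAACTGGGACTGGCGTGCTTTACTTAA"

print(translate(cds_start))
print(translate(cds_end))
```

The two printed fragments should match the beginning and the end of the protein sequence above. Passing the full 4614 nt CDS as a single string would extend the same check to the whole record.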