New model in OGS2.0 | DPOGS210743  |
---|---|
Genomic Position | scaffold2621:3397-21173 (- strand) |
CDS Length | 4614 |
Paired RNAseq reads   | 484 |
Single RNAseq reads   | 1060 |
Migratory profiles | Query via corresponding ESTs |
Best Bombyx hit | BGIBMGA006274 (1e-06) |
Best Drosophila hit   | collagen type IV, isoform A (4e-10) |
Best Human hit | collagen alpha-2(I) chain precursor (6e-47) |
Best NR hit (blastp)   | collagen alpha-2 precursor, putative [Pediculus humanus corporis] (0.0) |
Best NR hit (blastx)   | collagen alpha-2 precursor, putative [Pediculus humanus corporis] (3e-117) |
GeneOntology terms    | GO:0005576 extracellular region; GO:0005578 proteinaceous extracellular matrix; GO:0005201 extracellular matrix structural constituent; GO:0005581 collagen; GO:0005515 protein binding; GO:0005584 collagen type I; GO:0048407 platelet-derived growth factor binding; GO:0046332 SMAD binding; GO:0070208 protein heterotrimerization |
InterPro families    | IPR000885 Fibrillar collagen, C-terminal; IPR008160 Collagen triple helix repeat |
Orthology group | MCL10982 |
Nucleotide sequence:
ATGAATGTCAGAACCGAGCGGAGAATTGGTTCAGTGGAGTATCTACATCAGTTGAGACAA
TTGGTGTGTCCTTCGAAGGGCGATCTATTCGGGCGCTATTGTGGCGCGCGGGCGGGCCGC
GCTACTTACGCACAACACGTAGGTAAAATGTGGCAAGTGTCAGTAGTGTGGCTGGCGGTG
CTGGCGCTGCCAGTGGCTTGCAAGGACCTGGTAGAGGTCCGAGAGGAACTGCCCCAAGCT
CAAGAACCAGATCTCGTGAATTATGATGATTATAATCCGTTTGGAGAAGATGAAAACGCT
ATACTTGCAGAAAGAGACTTCACTCAATCGACCACTGCAACTCCGACTCGACGTCCTCCC
GCGCCGCCCACGAGAGCGCAATCCACCAGATGTTTCGACATAGGACGCTCGTTCTCTGAA
GGGGAAACCTGGAAACGAGACAACTGCACGCTCTGTCAGTGTTATAACACTCGTGTCAAT
TGCTCTGTGCTGCCATCTTGTCTTTTAAGCCCGGGCACGATCCGTCCAACACCATTGGTA
CCGCCGCAATCTCCACCGGTTCGTGGCATGGCTGGGGAGGCTGGGGCCGACGGACCCCCC
GGACCTCCAGGGACCCCTGGCGTTAATGGAGTCCCTGGAGCACCTGGCATTCCAGGACCG
GTCGGACCAATGCCTGATGTAAACGCCTACCTGGCTCAGTTAGCAGCTGCCGCCGGTGGT
GACAAAGGTCCTGCCTTCGATCCATATCACTACATGCAAGCACCTGTGGGAGTTCCAGGG
GTTAGGGGAATTCCAGGGCCCCAGGGTCCTCCGGGACCTCAAGGATTTCAAGGCCCCCGA
GGTGAACCCGGTGAGCCTGGATTGCCTGGTCCACCAGGAGTTGCTGGTGAGAGAGGTCCT
CAAGGGCTACCCGGAAAAGATGGATCACCGGGAGAAGATGGGGAGCCTGGTCCAGCCGGT
GCTATCGGCCAGCCAGGCCAGAGAGGAAGCCCTGGTATACCTGGTATTCAAGGATTGAAG
GGACATCGTGGCTTTGACGGCAAGGACGGTGCTAAGGGCGAACAAGGTTTGCCTGGAGAA
AAGGGCCCTACTGGCCTCCCTGGACCTATGGGCCCAAATGGACCCGCGGGTCCTGTCGGA
CCGAGAGGCGAAAGGGGCCGTGAAGGCCCTGCTGGAGTTCCGGGTATAAGAGGAGTTGAC
GGAGCTCCAGGATCACCGGGACTAATGGGTAATATTGGAAAGCCTGGCGCACCTGGTTTT
CCTGGATCCCCCGGCTTGAAGGGCGAACAAGGAGCAGTCGGCCCAAAAGGCAGCCAAGGA
ATGCAAGGTCCACGAGGGGAGCCAGGACGTAATGGACAGTCAGGGGAAGTGGGTCCCCCG
GGTCTAGCTGGTAGAGATGGCACACCGGGAGAAAAAGGGGTACAAGGACAAACAGGCGCC
GCCGGACCTCAAGGATTTCCGGGACCTCGTGGTACGCCTGGAGTTGCCGGTGATCCTGGG
TTGCCAGGGACGAAGGGAATGCCAGGCCAGCCGGGGGATAGAGGACCTAAGGGTGAAATT
GGCTTAAGAGGTGAAATTGGCGCACCAGGTCCGCGTGGTATGCCAGGTGGCATTGGACCT
GAGGGTAAAAGGGGAAAAAGAGGTTTGAGAGGACCACCGGGAAACTTAGGACCGCAAGGA
GATAGAGGGTTACAAGGATTAAGAGGTTTGAATGGCGCTGACGGTCCAATGGGTCCAAAA
GGTCAAACCGGGGAAAGAGGCGCTGTTGGTCTCCCAGGTCCAAAAGGCAGTACCGGCGAT
ATTGGCAGACCTGGACCTCAAGGACTTCTCGGCCCCAGAGGGTTTATGGGCAGACCAGGC
ATTCCAGGTAAATCTGGTCAACCTGGCGAAAGAGGTATACCTGGTGCAGATGGTAGGCCT
GGTGAACAGGGACTTCAAGGACTTCAAGGTCCACCTGGCTTAATTGGAAGTCCTGGCGAA
CGTGGATTGCAGGGTGAACATGGTAAAGATGGAGATGTTGGTCAACCTGGAGCACCTGGT
GCTAAAGGAGACGCCGGAAGAGATGGAAGTCCAGGGAGTCAAGGACCGCAAGGCCCCGTT
GGTGCAACTGGGGAACGCGGTCCCATAGGGCCTGCTGGACCGACTGGTTTTCCTGGTTTA
CCTGGAGCACCAGGATCACCAGGTCCTGCAGGTAAAGACGGAGAACCAGGTGTGTCTGGA
CCTGCCGGACCACCAGGAGCTACAGGACAAAGAGGTGAAAGAGGTTTCTCAGGAGAAAGA
GGCAGTCCAGGTCTACCAGGTGTTGCAGGAGAAAAGGGTGAAGCAGGTGCTCAGGGTCTT
GATGGTCCACCGGGTGCGGAAGGACCTCGTGGAAGTAAGGGACATCCAGGACCAATGGGC
ACGATGGGATTACCAGGACTACGTGGTATGTCCGGTTTGCCAGGTGAAAAGGGGGAGAGA
GGATCTTCAGGACCACAGGGGCCAGAGGGACCAGCGGGTCGCCAAGGAGAGCAAGGACCT
CAAGGTCCTATAGGTCCTGCTGGACCACCTGGAGAGCCGGCCGAAAGAGGTGAACCTGGA
ACACCAGGGATGCCAGGAGAATCAGGGGCACCGGGGTCTACAGGAGAGCGGGGTCACCCT
GGTCCTCAGGGCAATAATGGTTTACCTGGTCCTCCAGGATTAACAGGCATGCCCGGCTTG
AAAGGGGATAGAGGCTATGCAGGACCCAAAGGTCAACAAGGAGCTCAAGGAAGTCCAGGG
ATTTCTGGTGAACCGGGGCAACGAGGGCTTCCAGGGCAACCTGGAGCTAAGGGTGCTCAT
GGAGAGCAAGGACATAAAGGTGAAATTGGACGTGCTGGCTTACCGGGAAGACCAGGGGAC
ATGGGGCCGCAGGGTCCTCAAGGAAGCCCGGGTCCTACTGGGGCTCCTGGGTTACCAGGT
GCCAAAGGGTCGACTGGAGATACTGGACGGTCCGGACCACCTGGACCTCAAGGGTTAATT
GGACCACAAGGACCAGAAGGTCCTAAAGGCGAAAGAGGTGCTGAAGGAGAAACTGGGCCA
CAGGGACAGCAGGGTATCCCTGGACAAGCTGGAGAAAGAGGACCAAGTGGATTACCAGGT
TTAACTGGGGCACCAGGACCACAAGGTCTGAGAGGAGTGCAGGGAGAAAGTGGAGTTCCG
GGAAAACCAGGAGCAGACGGTGCCCCAGGTCCTATAGGCGCTCCTGGACCACAAGGGATG
ACTGGTCCAATGGGGGAACCTGGACCAGAAGGTCGTCCGGGCAAGCTGGGTCAGCCTGGT
ATTCCAGGACGGCAGGGTGAAAAAGGTCCTATGGGACAACCAGGACAACCAGGTCCACCA
GGTTCTCCAGGCGTGCAGGGACCTCCAGGTTCCTCCGGACCTCCTGGAGCAACTGGAGAA
AGAGGACCTAGAGGAGAAAGTGGTTCTCCTGGCATAGAAGGGCCTCAAGGTCCATCTGGC
AAGCAGGGCCCGCCTGGTATGGATGGTATTAAGGGTGAGCGAGGAGAAAACGGCGCTGAT
GGACCTAAAGGTCACGCGGGATTGCCAGGACTCCCGGGTTTGATAGGTACTCCAGGAAGA
CAAGGTGACAGAGGTTTACCAGGTGCCATAGGACCGCCAGGAAAGGATGGAGACGCTGGA
CCAAGAGGACCACCTGGCCGTGATGGTAGTCCGGGTCCCCAAGGGCCATTAGGTCCCCCG
GGAGGTCGTGGTCCTCCAGGAGAGCCTGGACGTCATGGAACACCTGGACCCGCTGGACCT
CCTGGACCACCAGGACCTCCTGGAGAAGGGTTGGCATATGATGCTGCTGCAATTGCTGCT
ATGCTTCAACAAGGAACGATGAAAGGTCCAGATCCGATGGGAGATGATCCGAACATAATG
CCACCAAGGTTCTTCAAAGAAGATATGTCCCCAGAAGAGAGGAAAAGTATTGTAATGAAA
GCGTACGAAAGACTCCAGGTTTCTCTGGACAAATTCTTAAAACCTGATGGCTCTAAGGAA
GCACCCGCCAAGACATGTGGGGATATCAAATACCATCATCCTCATTTCGAAAGTGGTCAG
TACTGGATAGATCCCAACGGTGGTGATATCAAGGATGCGATCTTGGTGCACTGCGATTTA
AGTACTGGCGCTAGCTGTGTATTTCCGAAGCCCATGATGTCGGAGGAACTCGTCCACTCC
GAAAGAAACGAGGCGTGGTTAAGTGAGATGGATAACGGGTTCGCTATATCTTACAAAGCG
GAGCACAGTCAGCTGACTTACCTACAACTGCTATCAGTGAAAGCGGTACAGAATGTCACA
CTTCATTGTCGAAATATTGTTGGCTACTATGACCCAGCCACTAGGAACTACAAACATGGC
CTGAAACTACTGGCTTATAATGATGCCGAGATTCTTCCTAAGGCCAATAACAGACTGCGA
TATAAGGCTTTAATAGATGAATGTCAGTTTAAATCTCAAGATTGGTCCAAGACGATAGTC
CAGTATGAGACGGACAAGCCTGGGCGGTTGCCAGTTCTGGATGTGGCTGTAAGAGACCTG
CCCAGAACCGACCAGGCCTTCAGGATTGAACTGGGACTGGCGTGCTTTACTTAA
Protein sequence:
MNVRTERRIGSVEYLHQLRQLVCPSKGDLFGRYCGARAGRATYAQHVGKMWQVSVVWLAV
LALPVACKDLVEVREELPQAQEPDLVNYDDYNPFGEDENAILAERDFTQSTTATPTRRPP
APPTRAQSTRCFDIGRSFSEGETWKRDNCTLCQCYNTRVNCSVLPSCLLSPGTIRPTPLV
PPQSPPVRGMAGEAGADGPPGPPGTPGVNGVPGAPGIPGPVGPMPDVNAYLAQLAAAAGG
DKGPAFDPYHYMQAPVGVPGVRGIPGPQGPPGPQGFQGPRGEPGEPGLPGPPGVAGERGP
QGLPGKDGSPGEDGEPGPAGAIGQPGQRGSPGIPGIQGLKGHRGFDGKDGAKGEQGLPGE
KGPTGLPGPMGPNGPAGPVGPRGERGREGPAGVPGIRGVDGAPGSPGLMGNIGKPGAPGF
PGSPGLKGEQGAVGPKGSQGMQGPRGEPGRNGQSGEVGPPGLAGRDGTPGEKGVQGQTGA
AGPQGFPGPRGTPGVAGDPGLPGTKGMPGQPGDRGPKGEIGLRGEIGAPGPRGMPGGIGP
EGKRGKRGLRGPPGNLGPQGDRGLQGLRGLNGADGPMGPKGQTGERGAVGLPGPKGSTGD
IGRPGPQGLLGPRGFMGRPGIPGKSGQPGERGIPGADGRPGEQGLQGLQGPPGLIGSPGE
RGLQGEHGKDGDVGQPGAPGAKGDAGRDGSPGSQGPQGPVGATGERGPIGPAGPTGFPGL
PGAPGSPGPAGKDGEPGVSGPAGPPGATGQRGERGFSGERGSPGLPGVAGEKGEAGAQGL
DGPPGAEGPRGSKGHPGPMGTMGLPGLRGMSGLPGEKGERGSSGPQGPEGPAGRQGEQGP
QGPIGPAGPPGEPAERGEPGTPGMPGESGAPGSTGERGHPGPQGNNGLPGPPGLTGMPGL
KGDRGYAGPKGQQGAQGSPGISGEPGQRGLPGQPGAKGAHGEQGHKGEIGRAGLPGRPGD
MGPQGPQGSPGPTGAPGLPGAKGSTGDTGRSGPPGPQGLIGPQGPEGPKGERGAEGETGP
QGQQGIPGQAGERGPSGLPGLTGAPGPQGLRGVQGESGVPGKPGADGAPGPIGAPGPQGM
TGPMGEPGPEGRPGKLGQPGIPGRQGEKGPMGQPGQPGPPGSPGVQGPPGSSGPPGATGE
RGPRGESGSPGIEGPQGPSGKQGPPGMDGIKGERGENGADGPKGHAGLPGLPGLIGTPGR
QGDRGLPGAIGPPGKDGDAGPRGPPGRDGSPGPQGPLGPPGGRGPPGEPGRHGTPGPAGP
PGPPGPPGEGLAYDAAAIAAMLQQGTMKGPDPMGDDPNIMPPRFFKEDMSPEERKSIVMK
AYERLQVSLDKFLKPDGSKEAPAKTCGDIKYHHPHFESGQYWIDPNGGDIKDAILVHCDL
STGASCVFPKPMMSEELVHSERNEAWLSEMDNGFAISYKAEHSQLTYLQLLSVKAVQNVT
LHCRNIVGYYDPATRNYKHGLKLLAYNDAEILPKANNRLRYKALIDECQFKSQDWSKTIV
QYETDKPGRLPVLDVAVRDLPRTDQAFRIELGLACFT
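
Consistency check: the CDS length listed above (4614 nt) should equal three times the protein length plus one stop codon. The following is a minimal sketch of that check, assuming the standard genetic code applies to this nuclear gene; the helper names `translate` and `expected_protein_length` are hypothetical and are not part of the database. Only the first and last few codons of the record are used as sample input.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (assumed to apply to this gene).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop)."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def expected_protein_length(cds_length: int) -> int:
    """Number of residues implied by a CDS that ends in one stop codon."""
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length // 3 - 1


# First 30 nt and last 9 nt of the nucleotide sequence in this record.
print(translate("ATGAATGTCAGAACCGAGCGGAGAATTGGT"))  # MNVRTERRIG
print(translate("TTTACTTAA"))                       # FT*
print(expected_protein_length(4614))                # 1537
```

The translated ends match the start (MNVRTERRIG) and end (FT) of the protein sequence above, and 1537 residues agrees with the length of the listed protein.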