New model in OGS2.0 | DPOGS206549  |
---|---|
Genomic Position | scaffold74:+ 109891-131508 |
CDS Length | 5382 |
Paired RNAseq reads   | 25021 |
Single RNAseq reads   | 56761 |
Migratory profiles | Query via corresponding ESTs |
Best Bombyx hit | BGIBMGA014040 (0.0) |
Best Drosophila hit   | viking (8e-134) |
Best Human hit | collagen alpha-1(IV) chain preproprotein (2e-111) |
Best NR hit (blastp)   | PREDICTED: similar to collagen alpha-2(IV) chain, partial [Acyrthosiphon pisum] (0.0) |
Best NR hit (blastx)   | hypothetical protein TcasGA2_TC014326 [Tribolium castaneum] (0.0) |
GeneOntology terms    | GO:0005581 collagen; GO:0005587 collagen type IV; GO:0005488 binding; GO:0005201 extracellular matrix structural constituent; GO:0005811 lipid particle; GO:0007519 skeletal muscle tissue development |
InterPro families    | IPR008160 Collagen triple helix repeat; IPR001442 Collagen IV, non-collagenous; IPR016187 C-type lectin fold |
Orthology group | MCL10157 |
Nucleotide sequence:
ATGTCGGGCCCGGCGCGTGCGCGCCTCCTCGCTGTACACCTTGTGCCCGTTCTCCTCCTT
CTCTCCAGGATTTCGGCTCCTGTATCTGCTGTTGTATGCAATCAGACCGTATGTGATTGT
GCTGGTCTCAAAGGAGATCGAGGTGACATTGGTTCTCCAGGAATACCCGGACCCCAAGGC
GACTATGGAGAGGATGGACCTGATGGACCCATGGGACCCCCTGGTGAACCAGGGGACTGG
GGCGAGAAGGGTATCTCGGGTGACAAAGGAGAAAGGGGTGTAGATGGCCCATACGGACCG
AGGGGATTCACTGGACCTCAGGGACCCACTGGTTTAGAAGGTGTGAGAGGTATTGCTGGT
CTTGATGGATGCAGTGGTATTGATGGTATTATGGGTCCTCCTGGCCCCCAAGGTATACCT
GGTGATAGAGGGTTACCAGGTCCCTATGGTGAAAAAGGAACACAAGGGTTAGCAGGAGAG
GGTGGTGTTAATTCAAGAGGAGCGAAGGGAGATCAAGGTGATAGTGGACGACCTGGCTTA
CAAGGACCAAGAGGGCCTATTGGATGGCGAGGTGATAGTGGTATGCCGGGAGAAGTTGGT
GATCAGGGACCTATGGGGTACCGAGGAGAACCGGGATATAGAGGAGATCTCGGCGACGAC
GTTGTGGGGCCTCAAGGTGAAAAGGGAGATCAAGGGGAAGTTGGCGAATCAGGGAGGCCA
GCGAAAGTTATCTATATTGATCATCTTCAAGAAAATGTTACGATATTAGCCAAGGGAAAC
AGGGGTGACAAAGGATTTGCGGGTACACAGGGCGTCAAAGGTGTTAAAGGAGACACTGGC
TCTATGGGTCCTCCAGGGCCAAATGGACTTAACGGAGACCAAGGTTACAAAGGCGATCAA
GGAATAGACGGCCCTAGAGGTAAACCTGGACATCGTGGACGCAAAGGGCCTCTTGGACCA
AAAGGTGATAAAGGCTGGCCAGGATATGCTGGGCTAGATGGAGAAGACGGAGAGCCGGGA
GCAAGAGGAGAAGACGGAAGACCTGGAATGCCGGGAGTTCAGGGACCTCAAGGAGAGAAA
GGAATATACGACGAACGACTTAATGAACCACTTCTGCCAGGTTCAAACGGTCCACAAGGT
CCTGTAGGTTATCCAGGACCTCCAGGACCTCCAGGTGCAGATGGAACCAGAGGTTTGCCT
GGAATTCAAGGTCCACCAGGTCTTCCGGGACCGAAAGGAATAGCTGGACGTCAAGGACCG
CCAGGAAGTTCAGAAAAAGGAGAACCAGGAAATGACGGTTTTAAGGGTTTACCTGGACCT
CGAGGTCCTATGGGCTACCCAGGTCCACAGGGAGTATTAGGACCAAAAGGTTTTAAAGGC
TCAGCTGTAAGAGGCCCCGAGGGTGAAGAAGGTACACCGGGATTAGATGGCAGACCTGGT
ATAAGAGGAGACAGAGGGGACTTCGGTTTCATGGGCCTGCCCGGGTATCCAGGTCGAGGT
GTTCACGGTGTTGGTCCACCAGGGGAGGATGGTCCTCCAGGACGTCCTGGAGTCGTTGGT
GATAGTGGAACACCTGGAAGACCTGGATTTAGAGGTCCAAAAGGAGAACGTGGTGACGAC
TGTCCATTCTGTCCATCAGGTTTACCAGGTATGAAGGGAAAGAGAGGAGATGAAGGTTTT
AAAGGTCAAAAAGGATATCCTGGCCCTGAAGGAGATCGTGGCCAGCGTGGGTTAAAAGGA
GAAAGTGGATCACCAGGTTTACCTGGATCAAAAGGTCCGAAGGGCATTACTGGTCCGCCC
GGAATGACTGGCCGTCCTGGTCTACAAGGAGAGAAGGGACGACTTATACAGCCTCCTCTT
TCCTTAATAATAGCTGAACGTGGACCTCGTGGTGAACCAGGTTTTATTGGGGATCCTGGT
CTTCGTGGAGATCCAGGTTTTCCTGGGCTTCGAGGAGAAAATGGCTGGAAAGGTTCCAAA
GGTATGGCTGGTGAGGATGGTTTCCCTGGTCCTGATGGAAGGGATGGATTAAAAGGACGA
GACGGCGTACCAGGAATGCCTGGTGAACATGCCGATGTTCCTATACAATTTCTATTTGGA
CAACGAGGAGATAAAGGGATTAAAGGACAACTCGGAGAACCCGGAGATGATGGTCTAAAG
GGTGATGCCGGAGAAGCCCTTGGTTTCGGAATAAATGCTAAAGGAGAAAAAGGGGAACCG
GGACCGATGGGTCCAGAAGGTTTGCAAGGAATTAAAGGAGATTCTGGTGATATTGGATAC
GAAGGACTTCCAGGAGAACGAGGGGATATTGGTCTACCTGGTGTTTCTAAACAAGGAGAA
AGAGGTGCTAGAGGTTTTCCCGGAGACAAAGGAGATATAGGTCCCTACGGAGAACCTGGA
GGTCCAGGTCTTAGGGGTCCTGTGGGATTTGATGCACTTAAAGGCAAGAAAGGTAGTCGG
GGAGAAGTTGGGTACGCAATTATTTACGGAGAAAGAGGTTTCGATGGTATGGCCGGGGAT
TATGGTGATGTAGGTGAACCTGGTTATGCTGGAAACCCCGGAAGAGCAGGTTTGATGGGA
CCTAAAGGGGAACCAGGTTTACCTGGTGATGTGGGTCCACCTGGACCCGTAGGACCACCA
GGACGAAAAGGAATGTCAGGAAACATTATACAGGGTGCACCTGGTATGCCAGGTCAACCC
GGACGACTGGGTTCTATAGGATTAATCGGTGAACCAGGACTACAGGGCTACAATGGCTTG
CAAGGGGATGTTGGTCCTAAAGGGATGAAAGGAGAAGCTGGTCGAATGGGAAATCGTGGC
TGGACTGGTGAACGTGGTCTTACGGGCAGAAGAGGGCGACCCGGACTTATGGGTCAACCT
GGCCTGAGTGGCGAAACGGGAGACCGAGGTGAAACTGGTCTTCGTGGTTATGATGGTTTA
CCTGGTAAAGAAGGTCCCCTGGGCATAATTGGTCAAAAAGGAATACGTGGTGATATTGGT
TTACCGGGAGCAGACGGTTTAGGTGGACCTCCAGGTCCTAAAGGAGAGAGAGGTTACGAT
GGAGTTGTTGGTGATAAAGGAATGCAGGGAGAAAACGCCTCCATAGGAATGAAAGGCATG
TCTGGAGACATGGGTTTTAATGGAATGCCAGGAAGACCAGGGCAAACTGGTTTAAAAGGT
TTAAGAGGTGACATCGGCAACCCAGGATTAAATTTAAGAGGCCTTAATGGTACAAAAGGA
TTCCGAGGTGATGATGGCATTCCTGGAAGAGTAGGGGAAAAGGGTTTAAAAGGATTCCAA
GGAGATTACGGTTTCGAGGGTATTGCTGGTGAAATAGGAGACGAAGGTTTTCCAGGTTTA
TCTGGTTTACCTGGACGAATAGGATTTGATGGTGCCAAAGGACCTTCAGGGCACAAAGGA
TTGCCAGGTTTACAGGGTCCGAAAGGTGATACAGGATTTGAGGGTGAACCAGGTAGAATG
GGTTCACCAGGATATCCCGGTGACGTAGGCTTGCGAGGCTTGGTTGGTGAAAGGGGTCCA
TCTGGCGCCAAAGGAATGTCAGGAGATATTGGACCCAGTATTTATTTACCAGCCACCAAA
GGGGATATGGGAGATATCGGAATGGAGGGACTAAAAGGGGGTAAAGGCGAAATGGGTGAA
CCTGGATTTCCAGGATTAAAAGGCCACAAAGGAGAACAAGGCGATGTAGGCTTACAAGGA
GAATTTGGTGATGATGGACTTCCAGGTCCTAAGGGTTATTTAGGAGTAATGGGACCTCCA
GGTTTACCAGGTCTAGATGGCATCAACCCTGAGCCAGGAGAACAAGGCAAATCTGGAATT
GACGGATTACCAGGTTGGCCAGGTCCCATGGGTCAAAAGGGTGCTCCGGGAGAGTTTGGT
ATTAATGGTCCTGAAGGAGCACCTGGTCAACCAGGGCTCATTTTTAGTGGACCAAAAGGG
TATAAGGGAGCAACTGGTCGACCCGGGCTAAGGGGCATTTCTGGTAAGCCTGGTTCAACA
GGATTACAAGGAAATCCGGGACTAAAAGGATTAACTGGTGACATTGGTGAACCTGGCTAT
GCTATAAGCCCTAAGGGTGAAACAGGAAATCCTGGTATATCAGGGTTTTATGGCTTGAAA
GGGATAAAAGGAGAAGCTGGAGATTTGGGACTGGCAGGTTTGAAAGGATATCAAGGCCCA
ATGGGAATGAAAGGAGAAAGAGGTGACGAAGGCTATGAAGGACTTAATGGATATTCAGGT
GCTAAGGGAATGAAAGGTGATAGAGGAGATGAAATACTTCCATCAGATGTTGAGCCCGGG
CCAATTGGTGATATAGGTCCTCCTGGATTTGATGGGCAACCTGGTCGTGCAGGAGCTCCC
GGAAATTTCGGAGAAAATGGCATTCCTGGATTCAAAGGTGAAAGAGGTGATATTGGAGAT
ATTGGTCCTGAAGGTTTGCTAGGCAAACAAGGTGGACAAGGGTTCATGGGTATCAAAGGA
GAAATTGGTTTTGATGGAATCCGTGGTTTGCCTGGTCTTCCTGGATTACCAGCACCTCCT
CCACCAATTCCTAAATCAAGAGGATTCTATTTTACAGTACATTCACAGACTCATCTCATT
CCCGAATGCCCCTCTGGAACTACACCTTTATGGGAAGGATTCTCCTTACTTCATATAGTT
GCAAATTCTAAGGCCCATGGACAAGATTTAGGTGCACCTGGAAGTTGTCTTCGAAGATTT
TCAACAATGCCTTATATGTTCTGTAACATAAACAATGTTTGTGATTTCGCCCAACGCGAA
GACTACAGTTTTTGGCTATCAACACCAGAACCAATGCCAAGCGGAATGACCCCAATTCCA
GCAACTGACGTTGGATCATACATATCCAGGTGTCAAGTGTGCGAGACATCAACACGATCC
ATTGCTATTCATAGCCAAAGCAGCTCCATACCAACTTGTCCAGATGGTTGGGATGAATTA
TGGATAGGTTATAGTTTCCTTATGCATACCGCTGGAGCTGATGCGGCAGGTCAAAGTCTC
ATATCACCGGGATCCTGCCTTCGGGAATTCAGAACGCGACCATTCATAGAATGTAACGGA
CTCGGCCGTTGCAACTTTTTCGCAACCGCGGTTTCATATTGGTTATCAACAATTGATGAC
AACAAAATGTTTGAAACACCTATTCAAGAAACACTGAAACAAAATAAAGTTTCTAGAGTC
AGCAGGTGCGCCGTATGTATGCGACGTCAACCACAGAGGTCGTATAGCGCAGGCACAGTG
GAGGCTGTACCTAACGCAGTAGTACGACGCCCCGTCAACCGACCTCTTAACCGGCTTCGG
CCTCGCTACCCTGCGAGGTACCGGGGGAGACGCCGCCATTGA
Protein sequence:
MSGPARARLLAVHLVPVLLLLSRISAPVSAVVCNQTVCDCAGLKGDRGDIGSPGIPGPQG
DYGEDGPDGPMGPPGEPGDWGEKGISGDKGERGVDGPYGPRGFTGPQGPTGLEGVRGIAG
LDGCSGIDGIMGPPGPQGIPGDRGLPGPYGEKGTQGLAGEGGVNSRGAKGDQGDSGRPGL
QGPRGPIGWRGDSGMPGEVGDQGPMGYRGEPGYRGDLGDDVVGPQGEKGDQGEVGESGRP
AKVIYIDHLQENVTILAKGNRGDKGFAGTQGVKGVKGDTGSMGPPGPNGLNGDQGYKGDQ
GIDGPRGKPGHRGRKGPLGPKGDKGWPGYAGLDGEDGEPGARGEDGRPGMPGVQGPQGEK
GIYDERLNEPLLPGSNGPQGPVGYPGPPGPPGADGTRGLPGIQGPPGLPGPKGIAGRQGP
PGSSEKGEPGNDGFKGLPGPRGPMGYPGPQGVLGPKGFKGSAVRGPEGEEGTPGLDGRPG
IRGDRGDFGFMGLPGYPGRGVHGVGPPGEDGPPGRPGVVGDSGTPGRPGFRGPKGERGDD
CPFCPSGLPGMKGKRGDEGFKGQKGYPGPEGDRGQRGLKGESGSPGLPGSKGPKGITGPP
GMTGRPGLQGEKGRLIQPPLSLIIAERGPRGEPGFIGDPGLRGDPGFPGLRGENGWKGSK
GMAGEDGFPGPDGRDGLKGRDGVPGMPGEHADVPIQFLFGQRGDKGIKGQLGEPGDDGLK
GDAGEALGFGINAKGEKGEPGPMGPEGLQGIKGDSGDIGYEGLPGERGDIGLPGVSKQGE
RGARGFPGDKGDIGPYGEPGGPGLRGPVGFDALKGKKGSRGEVGYAIIYGERGFDGMAGD
YGDVGEPGYAGNPGRAGLMGPKGEPGLPGDVGPPGPVGPPGRKGMSGNIIQGAPGMPGQP
GRLGSIGLIGEPGLQGYNGLQGDVGPKGMKGEAGRMGNRGWTGERGLTGRRGRPGLMGQP
GLSGETGDRGETGLRGYDGLPGKEGPLGIIGQKGIRGDIGLPGADGLGGPPGPKGERGYD
GVVGDKGMQGENASIGMKGMSGDMGFNGMPGRPGQTGLKGLRGDIGNPGLNLRGLNGTKG
FRGDDGIPGRVGEKGLKGFQGDYGFEGIAGEIGDEGFPGLSGLPGRIGFDGAKGPSGHKG
LPGLQGPKGDTGFEGEPGRMGSPGYPGDVGLRGLVGERGPSGAKGMSGDIGPSIYLPATK
GDMGDIGMEGLKGGKGEMGEPGFPGLKGHKGEQGDVGLQGEFGDDGLPGPKGYLGVMGPP
GLPGLDGINPEPGEQGKSGIDGLPGWPGPMGQKGAPGEFGINGPEGAPGQPGLIFSGPKG
YKGATGRPGLRGISGKPGSTGLQGNPGLKGLTGDIGEPGYAISPKGETGNPGISGFYGLK
GIKGEAGDLGLAGLKGYQGPMGMKGERGDEGYEGLNGYSGAKGMKGDRGDEILPSDVEPG
PIGDIGPPGFDGQPGRAGAPGNFGENGIPGFKGERGDIGDIGPEGLLGKQGGQGFMGIKG
EIGFDGIRGLPGLPGLPAPPPPIPKSRGFYFTVHSQTHLIPECPSGTTPLWEGFSLLHIV
ANSKAHGQDLGAPGSCLRRFSTMPYMFCNINNVCDFAQREDYSFWLSTPEPMPSGMTPIP
ATDVGSYISRCQVCETSTRSIAIHSQSSSIPTCPDGWDELWIGYSFLMHTAGADAAGQSL
ISPGSCLREFRTRPFIECNGLGRCNFFATAVSYWLSTIDDNKMFETPIQETLKQNKVSRV
SRCAVCMRRQPQRSYSAGTVEAVPNAVVRRPVNRPLNRLRPRYPARYRGRRRH