DPGLEAN00807 in OGS1.0

New model in OGS2.0  DPOGS210481
Genomic Position  scaffold345:+ 58924-73962
CDS Length  3390
Paired RNAseq reads  2470
Single RNAseq reads  5716
Migratory profiles  Query via corresponding ESTs
Best Bombyx hit  BGIBMGA001836 (8e-13)
Best Drosophila hit  thrombospondin, isoform F (8e-179)
Best Human hit  thrombospondin-3 precursor (8e-162)
Best NR hit (blastp)  AGAP002157-PA [Anopheles gambiae str. PEST] (0.0)
Best NR hit (blastx)  AGAP002157-PA [Anopheles gambiae str. PEST] (0.0)
GeneOntology terms
GO:0031012 extracellular matrix
GO:0008201 heparin binding
GO:0043234 protein complex
GO:0005509 calcium ion binding
GO:0033627 cell adhesion mediated by integrin
GO:0005927 muscle tendon junction
GO:0007517 muscle organ development
GO:0016203 muscle attachment
InterPro families
IPR008985 Concanavalin A-like lectin/glucanase
IPR013320 Concanavalin A-like lectin/glucanase, subgroup
IPR001881 EGF-like calcium-binding
IPR006210 Epidermal growth factor-like
IPR013032 EGF-like region, conserved site
IPR018097 EGF-like calcium-binding, conserved site
IPR008859 Thrombospondin, C-terminal
IPR013091 EGF calcium-binding
IPR003367 Thrombospondin, type 3-like repeat
IPR017897 Thrombospondin, type 3 repeat
Orthology group  MCL10371

Nucleotide sequence:

ATGGCTTCGCATGCCGTGTGGGGTGCATTAGTGCTTCTATTCGTCGCATCCTCTACATTT
TCACTTACACTAGACGAAGAAGCAACAAATGACGTCATAGCCGCCGCTTCAGCCACCGAA
GATGGTGAAGTAGCTATTATAGTCCGAGGTCCGTACGGGGATAACTTAGTCCGTGAGGAA
TTGCTTCACGCGAAGAGCACAGACGATAACTCCGTCTCACTTTATTACAATAGTAAAAGT
AAAAAGGTGTCATTGGAAAGTCTGAACGGGAATCACATCAAGTCAGTTTCCTGGAGTTTG
GGTTCTCATTTTCATGGCACATTGATTCTTATCGTGACCCACTCCAGAATAAAGTTGGCG
GTGGGATGCAAGCCGCTTCATTGGCATCCAATGTCCGGTAGGCATGACGTGCTAACACTT
CTAGCGAACGAAAAGTTAAAATTGTACCACGAAGAGAATGCTCCGGTGGAGGTGTATGAC
AGCGAAAAGACAGCGTTAGACGCCTTGAACTGCAACCACAGGGACCTTAAACCTCCGACC
TTATTGACAGTGGACTCTGATGTGGAGGAAGTCAAAGATTTTATAAAACGCGAAGAGAGA
ATGAAGATGGAAGATGAGATGCAAGGGGACGATCCGCGTAATAACTATATAGATCCTAAC
ATTTACGCCCCACTGCCTCTGCCACCAACGACACCTGGCTCACAAAGAGGAGACATTCCT
GCGACGGATATAGAATCTTGTGATGATGAAGTGATCCGTCAACTGAAACTTCTCCGTCAG
ACGATTGAACTTCTGCGTCGTGAGCTTGCAGACCAAAAAGGAACTATAGACGGACTCAGA
AACCAACTCCGAGCTTGTTGCAACCGAGTCTCGCCACCTCCCATAGATAGATGTTCCGGA
TCTTCGTGCTATCCTGGCGTGCAGTGTCGCAACACGGCGACAGGCATCCAGTGCGGACCC
TGTCCATCAGGGATGGAAGGTGATGGAAGAACATGCAGACCTATAACTTGCAATCGACGC
CCATGCTCTAAAAACGAATATTGCATCGACACGGAACAAGGGTTTAGATGCGAGCGGTGT
CCAGGAAGACAGACCAGCGACGGACAAACATGTCAATCAGCTTGTAGCTCCAATCCTTGC
TTTGGAGGAAGAGTTCAATGTCAAGATTTACCGGATGGTAGGTATCGTTGTGGGTCTTGC
CCCGCCGGTTATACAGGGAATGGGGAGCAGTGTGTTAGACTGTCTTGCCGTTCCAACACT
TGCTTCCAAGGAGTTGAATGCCAGGAGACGGCGTCAGGTCCACGGTGTGGACCGTGTCCC
CGGGGATACGACGGTGATGGTGTTCGTTGTGCACACGTTTGCTCGCGTCGACCCTGCGGG
GAGAGACGCTGCAGCCCCTCGAACAGCAGTCCCTACTACATCTGCGAAGGTTGCCCCAAG
GGCTACGAATGGAACGGTTACACATGCGTTGATATGGACGAGTGTGATTTAATACGTCCG
TGTGACGAACTGGTGTCGTGTCGTAATACGGAGGGAGGGTTCGAGTGTGGCGCATGTCCG
ACAGGGTACAGGGGCAGTTCGGGATGGAGCGGTGCTGGCCAGGAGAGACGGAAGGAGGGA
TGCGTTGATGTAGACGAGTGTGACCAAGACGTCTGTCCTCGGGGACGGCTGTGTGTCAAC
ACACCTGGTTCGTTCACGTGCGTTCCCTGCGGCGGCCACTACTACGTGAACACGTCTCGG
CCGTGCATAGAGGCGGACTCCTTGCGGCGCTGCGACCCAGCCTTCTGCCGCTCTCATAAC
GCCGTGTGTGGCTTCGGACAGGGCTGTGTGTGTGCGACGGGCTGGGCCGGTAATGGTACT
GTTTGCGGTACGGACAGTGATCTAGACGGATATCCGGATCAACAGTTGCCTTGTACTGAA
TTGCAATGCACAGCTGATAACTGTCCCCATGTGTCCAACTCGGGACAGGAGGACGCAGAT
AAGGACGGTATCGGAGATTCTTGCGATCCTGATGCTGATGGTGACGGCATACCGAATGTC
CCGGACAATTGTCCCTTAACACCTAATCCAGATCAGCTAGATAGGGACGAGGATCGCAGT
GACAAACGTGGGGATGCTTGTGACAATTGTCCAAGAAGATTTAACCCTGGACAAGAAGAT
GCAGATAACGATGGACTCGGAAACGTCTGCGATCCCGACATGGATAATGATGGCATTCCC
AACGACCACGACAATTGTCCTCTCGTGTTCAACCCACAACAGGAAGATATGGATGGAGAT
GGTGTGGGTGATCTGTGCGACAACTGTCCAAGAGTACGGAACCCCTCCCAGGATGACTCC
GACAAAGATAACGTTGGTGACGCCTGTGACAGTGACGTGGATAGAGACCAGGACGGCATA
CAGGACGGTTTGGATAATTGTCCGAATTTAGCGAACAGTGATCAGCAAGATGTTGATAAT
GATGGCAAGGGAGACGCTTGTGATGATGATATAGACGGTGATGGGATCCCGAACCTCGAA
GACAACTGTCCTTTGGTGTACAATCCTGATCAGGCTGACGCTAATGGTGACGGTGTCGGG
AACGTTTGCGACAACGACTTCGATGGAGACAACATCACTAACGCACTCGACAATTGCCCG
AATAATTCGAGGATTTTTCGCACCGACTTCAGGAAGTATATGACGGTAAGGTTGGACCCA
GAAGGTACCTCCCAGCAAGACCCACGCTGGCAGCTCGCACACGAGGGCGCTGAGATCACT
CAAACCCTCAACTCAGATCCTGGACTGGCGGTCGGATTCGACAGCTTCGGAGGAGTTGAC
TTTGAAGGCACCTTATTTGTCGACTCGCACATAGACGATGACTACGTCGGCTTCATATTC
GGCTACCAGAACAACAAGCGGTTTTATGTGGTGATGTGGAAGAAGAACAGCCAGACGTAT
TGGCAGACGACGCCGTTCAGGGCGGTCGCGGAGCCGGGGATACAGCTGAAGTTGGTGCAC
TCTAGCACTGGACCTGGGAAGATACTGAGGAACGCGCTCTGGAACACGGAGTCTACTCCT
GATCAGGTGACACTTCTGTGGAAGGATCCTCGAAACGTCGGCTGGCGAGAGAAGACCGCG
TACCGCTGGCGTCTCATACACAGACCCAAGATAGGACTGATTAGACTGAAGATATATGAG
AACAACAGTCTCGTGGCTGACTCCGGGAACGTTTACGACTTCACGCTTAAGGGTGGAAGG
CTGGGAGTTTTCTGCTTTTCCCAGGAAATGATCATTTGGTCCAACCTTGTGTACCGCTGT
AACGATAAAATACCAACGAACATAGTATCAGAACTGCCACCAAGGCTCCTTAAAAAGTTG
GATATAGACCACGACTTCGTTTATTTGTAG

Protein sequence:

MASHAVWGALVLLFVASSTFSLTLDEEATNDVIAAASATEDGEVAIIVRGPYGDNLVREE
LLHAKSTDDNSVSLYYNSKSKKVSLESLNGNHIKSVSWSLGSHFHGTLILIVTHSRIKLA
VGCKPLHWHPMSGRHDVLTLLANEKLKLYHEENAPVEVYDSEKTALDALNCNHRDLKPPT
LLTVDSDVEEVKDFIKREERMKMEDEMQGDDPRNNYIDPNIYAPLPLPPTTPGSQRGDIP
ATDIESCDDEVIRQLKLLRQTIELLRRELADQKGTIDGLRNQLRACCNRVSPPPIDRCSG
SSCYPGVQCRNTATGIQCGPCPSGMEGDGRTCRPITCNRRPCSKNEYCIDTEQGFRCERC
PGRQTSDGQTCQSACSSNPCFGGRVQCQDLPDGRYRCGSCPAGYTGNGEQCVRLSCRSNT
CFQGVECQETASGPRCGPCPRGYDGDGVRCAHVCSRRPCGERRCSPSNSSPYYICEGCPK
GYEWNGYTCVDMDECDLIRPCDELVSCRNTEGGFECGACPTGYRGSSGWSGAGQERRKEG
CVDVDECDQDVCPRGRLCVNTPGSFTCVPCGGHYYVNTSRPCIEADSLRRCDPAFCRSHN
AVCGFGQGCVCATGWAGNGTVCGTDSDLDGYPDQQLPCTELQCTADNCPHVSNSGQEDAD
KDGIGDSCDPDADGDGIPNVPDNCPLTPNPDQLDRDEDRSDKRGDACDNCPRRFNPGQED
ADNDGLGNVCDPDMDNDGIPNDHDNCPLVFNPQQEDMDGDGVGDLCDNCPRVRNPSQDDS
DKDNVGDACDSDVDRDQDGIQDGLDNCPNLANSDQQDVDNDGKGDACDDDIDGDGIPNLE
DNCPLVYNPDQADANGDGVGNVCDNDFDGDNITNALDNCPNNSRIFRTDFRKYMTVRLDP
EGTSQQDPRWQLAHEGAEITQTLNSDPGLAVGFDSFGGVDFEGTLFVDSHIDDDYVGFIF
GYQNNKRFYVVMWKKNSQTYWQTTPFRAVAEPGIQLKLVHSSTGPGKILRNALWNTESTP
DQVTLLWKDPRNVGWREKTAYRWRLIHRPKIGLIRLKIYENNSLVADSGNVYDFTLKGGR
LGVFCFSQEMIIWSNLVYRCNDKIPTNIVSELPPRLLKKLDIDHDFVYL
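
The CDS Length field and the two sequences in this record can be cross-checked against each other. The following is a minimal sketch of such a check, not part of the database itself. The function name `check_cds` is hypothetical, and it assumes the CDS includes its stop codon and uses the standard nuclear stop codons (TAA, TAG, TGA).

```python
from typing import List

# Assumption: standard genetic code stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_cds(cds: str, protein: str) -> List[str]:
    """Return a list of inconsistencies; an empty list means none were found."""
    # Sequences are printed in wrapped lines, so strip all whitespace first.
    cds = "".join(cds.split()).upper()
    protein = "".join(protein.split()).upper()

    problems = []
    if len(cds) % 3 != 0:
        problems.append("CDS length is not a multiple of 3")
    if not cds.startswith("ATG"):
        problems.append("CDS does not start with ATG")
    if cds[-3:] not in STOP_CODONS:
        problems.append("CDS does not end with a stop codon")

    # One codon per residue, plus one stop codon that is not translated.
    expected = len(cds) // 3 - 1
    if len(protein) != expected:
        problems.append(
            f"protein has {len(protein)} residues, expected {expected}"
        )
    return problems
```

Under these assumptions, a CDS Length of 3390 corresponds to 1130 codons, so a protein of 1129 residues would be expected. Pasting the nucleotide and protein sequences above into `check_cds` would confirm or refute that for this record.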