DPGLEAN01131 in OGS1.0

New model in OGS2.0  DPOGS202165
Genomic Position  scaffold1548:- 22-6956
CDS Length  1827
Paired RNAseq reads  1429
Single RNAseq reads  3059
Migratory profiles  Query via corresponding ESTs
Best Bombyx hit  BGIBMGA003435 (1e-57)
Best Drosophila hit  CG1965 (1e-18)
Best Human hit  GC-rich sequence DNA-binding factor 1 isoform 1 (1e-17)
Best NR hit (blastp)  PREDICTED: similar to gc-rich sequence DNA-binding factor [Tribolium castaneum] (1e-70)
Best NR hit (blastx)  GJ23221 [Drosophila virilis] (7e-21)
GeneOntology terms
GO:0003677 DNA binding
GO:0003700 sequence-specific DNA binding transcription factor activity
GO:0045449 regulation of transcription
GO:0071013 catalytic step 2 spliceosome
GO:0000398 nuclear mRNA splicing, via spliceosome
GO:0071011 precatalytic spliceosome
InterPro families
IPR012890 GC-rich sequence DNA-binding factor
IPR022783 GC-rich sequence DNA-binding factor domain
Orthology group  MCL10454

Nucleotide sequence:

ATGTCCTTGTTTCGTAAACCGAAGAAGATCCAGAGACGAGTTTTTTGTGCTGACGATGAA
GAAGACGGTGAGCCGGAGGCACCTGTGCCGCCGCCGCCGCCGATTATTAGTAATTCAAGG
AAGGAAAACAAACAAGTAAAAGTAACAACGCTATTGAGTTTCGCCGATGAAGAGGAAGAG
GGCGAAGTATTCAAGGTGAAGAAGTCATCACAGAGCAAGAGATTGAGTAAACGGAGACAG
AAAGAAAAACAACGCACAGATGGTGATAGTAATAAATATGACAATCACATGGTCGAGGAG
AAACCGTCGGAGGAGATAGAAGAACCGAGGAAGAAGGTTACCCTCGAGGGTCTGATCCTG
TCAGGGCGGGAGGCGTTGTCCGCGGACGGGGCGGGGGACATTTCCGAAGACAGCGAGGAA
GATAACAGGGGGTTCCACACGTACCGAGCCGAGAGCGTGCGGGCGGCGCTCGCCGGCGCG
GGGGGAATCCCCGACGCCGCGCTCATACACGCCGCGCGCAAGACCCGACAGCAGGCTCGT
GAGTTGGGTGACTTTGTTCCCATCAAGAATGATGGCGGCTCCAGGATGATGAGAGATGAT
GACGCTGATGACGATGACGATGATGAGGCAGACGAGGGCCGGATACAGGTCAGGGGGTTG
GAACTGCCAAGCGACAGACCCGAACGTGGTACAACAGCCGCCGCGTCTGATGATGAAGCT
CAAAGTGAAGGAGAGGAGTGGGAGGAGCAGCAGATTAAGAAAGCTGTGCCCTCAATAGCT
GATATTACAGGTGATTGTATCCCACTAAATCCGTTCGCTGTTCCTCCGCCCCCGGACACG
CCGCGTCACCTGCGGTCCCTCGCGCGCCCCGGACAGCCTCCGCCAGCTACCGCGCAACAA
CTCGTAGAGGCGCTACGAGACAGGCTGTCAGAGCTTCACGAGAGTCGTGCGAGAACAGCG
CAGCGTATGTATCACTTACAAGAGCGAGCGTCTAACGCGGCCGCCAAGCGTGAGAGGTGT
AAGGGGTTGTGCTCGGAACTCGACCGCAGATACAAGAGGGCGCAGGCGGCCAGGGGGTAC
ATCACCGACCTCGTGGAGTGTCTGGACGAGAAGATACCTCAACTGCAAGCGTTGGAGGCC
CGGGCCCTGGCGCTTCATCGCAAGAGACGCGACCTGCTGGTGGAGAGGCGGAGGGCCGAC
GTGCGCGACCAGGCGCAGGACGTGCTCGCACTCGCCGCTCGCGCGGGGTCATCGAAGCCG
GTGGACAGCGAGGAGAAGCGTCAGCGTACTGCGGAGCGCGAGGGCCGGCGGCGGGCGAGG
CGGCTCAGGCGACAGGCGGCCGGCAACAACCAGCACAGGGACGGGGACTCCAGCGACGAT
GACCTGCCCCCAACCCTGCACCATCACTGTCAACAGGAGGCGGACGCGATCCGCTCTCTG
TCGAGTCAGTTGTTCGCGGACACGTTGTCGGCCTACCGCAGTGTGCAAGGAGTCTGCGGT
CGCATGGCAAGACTGAGGCGGACGCGCGGGTTGTACACGGACGCGTATGTCGCACAATGT
CTGCCGAAGTTACTGGCGCCGTATGTTAGACATCAGCTGATCCTCTGGAACCCGCTCGCT
GACGAAGACAACGAGGACTACGAGAAGATGGACTGGTACAAATGTCTCATGATGTACGGC
TGCAAGTCCGAGCGCCTGTCCAGCGACTCAGAGCAGTCTTCCTCCGACGAGGTTTCCGTG
ACCGAGCTGGCCGTGAGAGACGACCCCGACCTACTGCTGGTACCCACGATCATAGACAAG
GTCGTACTGCCCAAGATCACCGGTTAG

Protein sequence:

MSLFRKPKKIQRRVFCADDEEDGEPEAPVPPPPPIISNSRKENKQVKVTTLLSFADEEEE
GEVFKVKKSSQSKRLSKRRQKEKQRTDGDSNKYDNHMVEEKPSEEIEEPRKKVTLEGLIL
SGREALSADGAGDISEDSEEDNRGFHTYRAESVRAALAGAGGIPDAALIHAARKTRQQAR
ELGDFVPIKNDGGSRMMRDDDADDDDDDEADEGRIQVRGLELPSDRPERGTTAAASDDEA
QSEGEEWEEQQIKKAVPSIADITGDCIPLNPFAVPPPPDTPRHLRSLARPGQPPPATAQQ
LVEALRDRLSELHESRARTAQRMYHLQERASNAAAKRERCKGLCSELDRRYKRAQAARGY
ITDLVECLDEKIPQLQALEARALALHRKRRDLLVERRRADVRDQAQDVLALAARAGSSKP
VDSEEKRQRTAEREGRRRARRLRRQAAGNNQHRDGDSSDDDLPPTLHHHCQQEADAIRSL
SSQLFADTLSAYRSVQGVCGRMARLRRTRGLYTDAYVAQCLPKLLAPYVRHQLILWNPLA
DEDNEDYEKMDWYKCLMMYGCKSERLSSDSEQSSSDEVSVTELAVRDDPDLLLVPTIIDK
VVLPKITG
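
Consistency check (illustrative sketch):

The record lists a CDS length of 1827 nt, which should correspond to 608 codons plus one stop codon. The short Python sketch below shows one way to check that the nucleotide and protein sequences agree. It assumes the standard genetic code (NCBI translation table 1); the names `translate` and `expected_protein_length` are hypothetical helpers introduced here for illustration and are not part of the database or its tooling.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons
# enumerated in T, C, A, G order for each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon.

    Whitespace and line breaks are ignored, so the sequence can be
    pasted straight from the record. Trailing bases that do not form
    a complete codon are skipped.
    """
    cds = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


def expected_protein_length(cds_length: int) -> int:
    """Protein length implied by a CDS length that includes the stop codon."""
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length // 3 - 1
```

Applied to this record, `expected_protein_length(1827)` gives 608, which matches the length of the protein sequence above, and translating the final nucleotide line (`GTCGTACTGCCCAAGATCACCGGTTAG`) gives `VVLPKITG`, the last line of the protein.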