DPGLEAN13507 in OGS1.0

Genomic Position  scaffold13138:- 230-5861
CDS Length  3216
Paired RNAseq reads  462
Single RNAseq reads  2372
Migratory profiles  Query via corresponding ESTs
Best Bombyx hit  BGIBMGA008012 (1e-108)
Best Drosophila hit  cag (1e-10)
Best Human hit  tigger transposable element-derived protein 4 (3e-75)
Best NR hit (blastp)  PREDICTED: tigger transposable element derived 6-like [Saccoglossus kowalevskii] (5e-99)
Best NR hit (blastx)  PREDICTED: tigger transposable element derived 6-like [Saccoglossus kowalevskii] (2e-95)
GeneOntology terms
GO:0005634 nucleus
GO:0003677 DNA binding
GO:0045449 regulation of transcription
GO:0000775 chromosome, centromeric region
GO:0003676 nucleic acid binding
GO:0005575 cellular_component
GO:0003674 molecular_function
GO:0008150 biological_process
InterPro families
IPR009057 Homeodomain-like
IPR006600 Pogo transposase / Cenp-B / PDC2, DNA-binding HTH domain
IPR012287 Homeodomain-related
IPR004875 DDE superfamily endonuclease, CENP-B-like
IPR004906 Pogo transposase / Cenp-B / PDC2, subgroup, DNA-binding HTH domain
IPR006695 Centromere protein Cenp-B, DNA-binding domain 1
IPR002893 Zinc finger, MYND-type
Orthology group  MCL17228

Nucleotide sequence:

ATGGCTCCTAGACGCACGTTTACTGTTAAAGAAAAGGTGGAAATCATTTCAAAATTGAAA
AATGGTGCAAACAATGCTGATTTGTGCAAAGAATACAAAGTTTCGCATTCAACAATATCT
ACGATGTGGAAAAACCGTGATAAAATATTAGAGTGCTTTGAGTCCAAATCCTTAAAAATT
AAGAAAAACCGCAAGCCGACGCATCAAGATGTTGAAAAAGCTCTTCTAGTGTGGTTCAAA
GCTCAAAGAAGCCAAAACGTACCTGTGAGCGGTCCTCTGCTGCAAGAAAAGGCTAATCAT
TTTGCCAGACTGTTTGGAAAAATAGATTTTAAGTGTTCGGAAAGCTGGATATACCGGTTT
CGTCAACGGCACGATATTGTAGTAGGCAAAGTTTGTGGAGAGGCCGCAAGCGTATCGCAT
AGTGACTGTGACAACTGGCTAAAAACAGTTTTCCCAAAATTGACTGAAGGTTACACTGAT
AGCCAAATATGGAACGCAGACGAGACAGGTTTGTTTTTTAAGCTAACGCCTGAAAAAACA
TTGAAATTTAAAGGAGAAAAGTGCGTAGGTGGAAAATTATCAAAAGATAGAATCACCGTG
CTTGTTGCATCAAGCATGGCAGGGGAAAAAAGAAAATTATTGGTCATAGGCAAGGCGAAA
AAGCCCAGGTGTTTCAAAAATGTTAAATTTTTACCGGTTGATTACGAAGCAAATAGAAAG
GCCTGGATGACTTCAGATATTTTCGAAAAAGTTCTACGAAAATGGGACTCCCAGTTAAGA
AATAACAAAAAGAAGATCATTTTATTTATTGACAATTGCCCAGCTCATCCTAAAATTGAA
AATTTGACCAACAGAAAACTGGCATTTTTACCACCAAACACAACATCAGTGATCCAACCT
ATTGATCAAGGGATAATTAAAACTCTCAAAAGTCATTATCGGAAGATTTTAGTACAGAAA
ATGATGAACGACATTGAAAAAGCGGCAGGCTCATTTTCCGTTAATCTTTTAAATGCTATT
GAGATGACAACTACGGCATGGGCTCGAGTGACTCCGGAAACGATCAAAAAGTGTTTTCTA
CACGCTGGGTTCTGTAAATCTTCGGTTATCACAACAATTGACGACGATTCTGACGATGAG
TTGGACATACCATTGGCACAGCTGACAACATCGTCAAGCGGAACAAATGTTCCGGATTGG
GAAACATATGTCGATATTGACAGTCAGCTTATCACCACTAGCAATTTGACTGATAATGAA
ATGGTTGAAACTGTTGCTTCATCACCTACGACGCAAGATGAGGAAAATGAGGAGGAAGAA
GAAGATGACAAGGAAGGTGAGATTCCTACAACTGAGGATGTGTTATGTGCAGTGACAAGG
CTCAAAAGATATTGTCTATTTGGTGATGGAAAGGATAATATAGACATTGAAGGAGACGAG
CAACTGAACAAACGGAGAGCCAGTCAAGATATAGACACTGAATCTCCTCTGAAGAAATTA
TGCGCGGAAGTCGAGAAGACGTTCCCCCAGCATGACACGGCGACCAAGAACAACGTGGAC
GAAATACAGAGGCACACCGAGCAACTGCTGTCCGAGATACAGACGCTGAGGGAGCTGGCG
CAGAAGAAGGAGCACGAGTGGAACAACATCCTGCACCTCAAGAAAGTCAAGGAGGAAATA
CTGCTCAGGCTGCTCAGGCGGAAGCAGGTGCTGGCCTTCGAGAAGAGCGCCGACGTCAAC
GGCAGCGAGCGGACCGACCCCTTCGACTACCTCAACCAGGCCAAGAACCTGGCCATCGAT
AAGAGCGATGACATATCCGGCCTGGCGATCAAACAGCCGGGCTCCGCCATCGTCAACCCC
ATCATGCAAGCGCCCATCATGCCCGTGACGTCACACTTCAACCCCATGTCAGGGCTGCCA
CCGCCCTACGACAAGGCCGCTCACTTGCAGTCGATGCCCAAACCCAACCAACTATTCCCG
CAGGCTATGATGATGCCGGGCCATCTGCAAGGTTTCCCCAGGGATATGAACGGTCAGCTG
CCATCTAGCTACGGAATGCCGATGGGGCGTCAGGGGCCGACCAAAGACGTGAAGAGCATT
ATAGCCGATTACAGGCAGAGGAATCCTGAGATAACTCCTCGCAGGGGGAGGCGGATGAAG
CCCATCGTCAACCCCAGCATGATGAACCAGCCGCGACCGATAGCCCCCAAGGTGGACGCC
ATGAACAGTCTCAGTAATCTCAACATGCTGTTCAACAATTTGGACATGGTGAGTTTATAC
GAACTGTCCGTGGACCGAGTGATTCCAGGAACACTACACGGAATGACCGTTACCGTTCGC
AGACCGGGTCTGTTGCATGTTGTGTTCAGTATGAAGAATCAGAAGGCGATGATAGAGCGT
CTGCAGCAGATCCAAGCGGGCGGGCTGCCCAACGGTCTGTCGTTCAAGGACGTGCTGGTG
CAGGTCGCCAACATGCAGCAGAACAACGCCGGCCTCATGGCCGCCAGGGCCCACGACGTG
AACAGGCCCGAGCGACGGCGGCAGGAGCGCGGCGAGGACCCGCCGGCCCCGCCCGCCCCA
CCCGCTCCACCCAAACAGGCGGACAGGCTCGCCGCCTCCAGCCCCCGCCTGCCGCCGCCG
CCGCCATACCCGGAGATATCGCTCCTGCCGGTCAGCACCGCCCAGGACGCCTCGCACACG
CAGCAGAACTCGCTGCTCCACGGAATACTGACCAAGCTTGAATGGGTGTTGCCAACCTTC
CAGCAGGCGTCTCCCGCGTCTCAGTGTTACTCCCCGACGTTGGCTAAACTGCTGACGTCG
CCGGAACGGAAACAGAGCGCTCCGCCGTTGCCAACTTTCGGTCAGGCCAAGAACTGCGGC
GAGATCACGATAACGCCGGTGCAGCCCGCGCCGCCGGACGCGGAGAAGACGGAGGTGGTG
CAGCTCGAGGAGGAGGAGAGCGGCGCGTCCGAGGAGTCCGCGGGCTCGGCGTCCGCGGGC
TCGGGCTCCGGCTCGGGCTCCGGCCGGCTGGTCATCGACGAGGGCAACGACGAAGCGCCC
ACCTGCCAGGGCTGCCGCTCGCGACTCGCGCAGTTCGTGTGCGCCGGCTGCGCCAACCAG
TGGTACTGCTCCAGGGACTGCCAGGTCGACGTCACACTTATACATTTATCACTCTCGTTC
GTTGACATATATCTACTTATATTATGTAAGATGTGA

Protein sequence:

MAPRRTFTVKEKVEIISKLKNGANNADLCKEYKVSHSTISTMWKNRDKILECFESKSLKI
KKNRKPTHQDVEKALLVWFKAQRSQNVPVSGPLLQEKANHFARLFGKIDFKCSESWIYRF
RQRHDIVVGKVCGEAASVSHSDCDNWLKTVFPKLTEGYTDSQIWNADETGLFFKLTPEKT
LKFKGEKCVGGKLSKDRITVLVASSMAGEKRKLLVIGKAKKPRCFKNVKFLPVDYEANRK
AWMTSDIFEKVLRKWDSQLRNNKKKIILFIDNCPAHPKIENLTNRKLAFLPPNTTSVIQP
IDQGIIKTLKSHYRKILVQKMMNDIEKAAGSFSVNLLNAIEMTTTAWARVTPETIKKCFL
HAGFCKSSVITTIDDDSDDELDIPLAQLTTSSSGTNVPDWETYVDIDSQLITTSNLTDNE
MVETVASSPTTQDEENEEEEEDDKEGEIPTTEDVLCAVTRLKRYCLFGDGKDNIDIEGDE
QLNKRRASQDIDTESPLKKLCAEVEKTFPQHDTATKNNVDEIQRHTEQLLSEIQTLRELA
QKKEHEWNNILHLKKVKEEILLRLLRRKQVLAFEKSADVNGSERTDPFDYLNQAKNLAID
KSDDISGLAIKQPGSAIVNPIMQAPIMPVTSHFNPMSGLPPPYDKAAHLQSMPKPNQLFP
QAMMMPGHLQGFPRDMNGQLPSSYGMPMGRQGPTKDVKSIIADYRQRNPEITPRRGRRMK
PIVNPSMMNQPRPIAPKVDAMNSLSNLNMLFNNLDMVSLYELSVDRVIPGTLHGMTVTVR
RPGLLHVVFSMKNQKAMIERLQQIQAGGLPNGLSFKDVLVQVANMQQNNAGLMAARAHDV
NRPERRRQERGEDPPAPPAPPAPPKQADRLAASSPRLPPPPPYPEISLLPVSTAQDASHT
QQNSLLHGILTKLEWVLPTFQQASPASQCYSPTLAKLLTSPERKQSAPPLPTFGQAKNCG
EITITPVQPAPPDAEKTEVVQLEEEESGASEESAGSASAGSGSGSGSGRLVIDEGNDEAP
TCQGCRSRLAQFVCAGCANQWYCSRDCQVDVTLIHLSLSFVDIYLLILCKM
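
Consistency check (illustrative sketch):

The record lists a CDS length of 3216 nt together with a nucleotide and a protein sequence. The following is a minimal sketch of how the two could be cross-checked, assuming the standard genetic code applies and that the CDS ends in a single stop codon. The names `CODON_AMINO`, `translate` and `expected_protein_length` are hypothetical helpers introduced here for illustration; they are not part of the database or of any library.

```python
# Minimal sketch: cross-check a CDS against its listed protein.
# Assumption: standard genetic code (NCBI translation table 1).

BASES = "TCAG"
# Amino acids for all 64 codons, ordered by first, second, third base in TCAG order.
CODON_AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)


def translate(cds: str, to_stop: bool = True) -> str:
    """Translate a nucleotide string in frame 0; optionally stop at the first stop codon."""
    cds = cds.upper()
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        a, b, c = (BASES.index(x) for x in cds[i:i + 3])
        amino = CODON_AMINO[16 * a + 4 * b + c]
        if amino == "*" and to_stop:
            break
        protein.append(amino)
    return "".join(protein)


def expected_protein_length(cds_length: int) -> int:
    """Residue count for a CDS whose length includes exactly one terminal stop codon."""
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length // 3 - 1


# First 60 nt of the nucleotide sequence listed in this record.
first_line = "ATGGCTCCTAGACGCACGTTTACTGTTAAAGAAAAGGTGGAAATCATTTCAAAATTGAAA"
print(translate(first_line))          # first 20 residues of the translation
print(expected_protein_length(3216))  # residue count implied by the CDS length
```

Under these assumptions, the translation of the first line can be compared with the start of the protein sequence above, and the residue count implied by the CDS length can be compared with the length of the listed protein.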