DPGLEAN18224 in OGS1.0

Genomic Position  scaffold625:+ 19515-37916
CDS Length  2262
Paired RNAseq reads  584
Single RNAseq reads  1605
Migratory profiles  Query via corresponding ESTs
Best Bombyx hit  BGIBMGA009245 (2e-138)
Best Drosophila hit  CG1402 (6e-125)
Best Human hit  carbonic anhydrase-related protein 10 (4e-60)
Best NR hit (blastp)  AGAP000715-PA [Anopheles gambiae str. PEST] (5e-139)
Best NR hit (blastx)  AGAP000715-PA [Anopheles gambiae str. PEST] (2e-137)
GeneOntology terms
GO:0004089 carbonate dehydratase activity
GO:0008270 zinc ion binding
GO:0006730 one-carbon metabolic process
GO:0005576 extracellular region
InterPro families
IPR000477 Reverse transcriptase
IPR001148 Carbonic anhydrase, alpha-class, catalytic domain
IPR018347 Carbonic anhydrase, CAH2-like, metazoa
IPR023561 Carbonic anhydrase, alpha-class
Orthology group  MCL10884

Nucleotide sequence:

ATGCAGACATGTTGGTTCCTCGCACTCCTCTTGTGCTATTTCGTGAAAGACATATGCGGC
AGTTGGGAAGAATGGTGGACATACGACGGAATATCAGGTCCGGGTTTCTGGGGTCTAATC
AATCCGCAATGGAACATGTGTAACAAGGGCCGGCGGCAGAGTCCCGTCAACATAGAACCC
GATAAATTACTCTTTGACCCCTGGCTGAGAGACATACAGTTTGATAAACATAAGGTCAGC
GGTGTTCTTCAAAACACTGGCCAATCACTGGTCTTCCGAGTCGAAAAGGACAGCAAACAC
CAGGTCAACATTAGCGGTGGGCCACTGTCTTACAGATACCAGTTCGAGGAGATATATTTC
CATTATGGTTTGGAAGACAACCGGGGCTCTGAACACCAGATTGATCATCATACCTTTCCT
GGAGAGATACAGTTATACGGCTTCAACAAGGAATTATATCATAACATGTCAGAAGCGCAA
CACAAGTCCCAGGGGGTGGTAGGAATATCACTAATGGTTCAAATAGGAGAACCTACTAAT
AAGGAACTGCGTCTTATAACCAGCGCCTTCAACAAAGTTACTTACAGAGGCAGTTCCTTC
GCCATAAAACACCTACCGCTTAGTTCGTTGCTACCCAATACGCAGCAATATCTCACGTAT
GAAGGTTCCACCACTCACCCAGGATGCTGGGAGACCGCTGTTTGGATCATCTTCAACAAA
CCGATCTATATATCAAAGCAAGAGATGTACGCAATTCGTCGTCTGATGCAAGGGTCTCAA
CTGACCCCAAAGGCCCCATTGGGAAATAATGCTCGCCCGGTTCAGCCTCTGCACCATCGC
ACTGTCAGGACAAATATCAACTTCAACAAGCAAGGGATGCCGGTATCGAGTAACTGTCCT
GATATGTATAGAAATATGCATTACACAGGTACATCTTCATGGGCAGTCGAGAACCGGAGG
GGGTCCATAGAGTGGTGGACAGTCGACAACGGCGTTCCACAGGGTTCGGTATTGGGACCT
GTCCTGTGGAACGTCGGGTATGACTGGGTCCTGCGGAGCCGCCTCCTCCCCGGGATGGGT
GTCATATGCTACGCTGATGACACCCTCGTCTTATCCCGGGGACGGAGCTACAAGGAGGCG
GCGCGGCTGGCCGAGGTCGGAACTGAGCTCGTGGTCAGCCGCATAGAGAGGTTGGGGCTT
CGGGTCAGAATCGACAAGACCGAAGCCCTCCTCTTCCGCGGGACTGGGCGGAAAGGACCC
CCGCCGGGTGCCACCCTCCTCATAGGAGGAGGGAGGGTCAGGGTGAGCCCTACCATGAAA
TATCTGGGGCTCACCCTTGACGGAGGGTGGACCTTCGTGCCCCATTTCAGGGAGTTGGGG
CCGAAGGTCATGAGGACGGCAGGTGCGCTGGGGAAATTCCTCCCGAACCTCGGAGGACCC
AGCGCAGCTTGCAGGCGGCTATACTCTGGGGTCTGTCGGAGTATAGCCACGTACGGTGCT
CCCGTTTGGGCTGATCGACCGATGAGCCGCGGGGTCAAGGCCCTACTGCGCTCGGCGCAA
AGGGCACCCGCGGTGAGGGTGATTAGGGGGTACCGTACGGTCTCCTGGGCCGCAGCGACG
GCTCTTGCCGGCGATCCGCCTTGGGATCTTGTGGCGTCGGTTCTCGCCGAGGTGTTCTCC
TACGTCTCGGGTAGGAGGGCTCTCGGAGAGAACCCTTCATCAGAGGAGATCCGGGCGGTT
CGCCGGCAGGGGGAGTCACGTCTTATGCGGGAGTGGGGGGAGGACCTGGCGGGCCAGCCG
TACGGTAAACGTACAACGGCAGCGCTCCGTCCGGTCCTAGAGCGTTGGATGAGGCGGAAA
CGCAAACCCCTCACTTTCCGTCTGACGCAGGTCTTCACCGGGCACGGGTGCTTTGGTGAT
TACTTGTGTCGGACGGCCAGGAGAGAGCCGGGGAGTGGCTGTCATGAGTGCGGAGCTGCG
GTGGACTCGGCCCAGCACACCCTCGAGGTGTGCCCGAGATGGGCTGCGCAGCGCCAAGAC
CTTGTGGCGGCTCTCGGCGGAGTGGACTTGTCGCTTTCGAGTATCGCGGAGAAGATGCTT
GAGAGTGACAGGTCCTGGCTGGCGGTGTCCTCCTTCTGTGAGACGGTCATGTCCACGAAG
GAGGCATCTGAGCGGGAACGGGAGGTTGCGGCTGATGCATCCTCCCTCCGCAGACGACGG
ACGGGGGCGCGCCGGGGGCGATATCAACGCCTCCTCCATTAG

Protein sequence:

MQTCWFLALLLCYFVKDICGSWEEWWTYDGISGPGFWGLINPQWNMCNKGRRQSPVNIEP
DKLLFDPWLRDIQFDKHKVSGVLQNTGQSLVFRVEKDSKHQVNISGGPLSYRYQFEEIYF
HYGLEDNRGSEHQIDHHTFPGEIQLYGFNKELYHNMSEAQHKSQGVVGISLMVQIGEPTN
KELRLITSAFNKVTYRGSSFAIKHLPLSSLLPNTQQYLTYEGSTTHPGCWETAVWIIFNK
PIYISKQEMYAIRRLMQGSQLTPKAPLGNNARPVQPLHHRTVRTNINFNKQGMPVSSNCP
DMYRNMHYTGTSSWAVENRRGSIEWWTVDNGVPQGSVLGPVLWNVGYDWVLRSRLLPGMG
VICYADDTLVLSRGRSYKEAARLAEVGTELVVSRIERLGLRVRIDKTEALLFRGTGRKGP
PPGATLLIGGGRVRVSPTMKYLGLTLDGGWTFVPHFRELGPKVMRTAGALGKFLPNLGGP
SAACRRLYSGVCRSIATYGAPVWADRPMSRGVKALLRSAQRAPAVRVIRGYRTVSWAAAT
ALAGDPPWDLVASVLAEVFSYVSGRRALGENPSSEEIRAVRRQGESRLMREWGEDLAGQP
YGKRTTAALRPVLERWMRRKRKPLTFRLTQVFTGHGCFGDYLCRTARREPGSGCHECGAA
VDSAQHTLEVCPRWAAQRQDLVAALGGVDLSLSSIAEKMLESDRSWLAVSSFCETVMSTK
EASEREREVAADASSLRRRRTGARRGRYQRLLH
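
The CDS length and the two sequences above can be cross-checked against each other. The following is a minimal sketch, assuming the standard genetic code (NCBI translation table 1) and a CDS that ends in a stop codon; the names `translate` and `expected_protein_length` are illustrative and not part of the database. It translates a CDS fragment codon by codon and derives the protein length implied by a CDS length.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
# Assumption: this record uses the standard nuclear code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate an in-frame nucleotide string; '*' marks a stop codon.

    Whitespace and line breaks are ignored. Codons with characters other
    than A, C, G, T raise KeyError.
    """
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def expected_protein_length(cds_length):
    """Protein length implied by a CDS length that includes the stop codon."""
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length // 3 - 1


# First and last lines of the nucleotide sequence listed in this record.
first_line = "ATGCAGACATGTTGGTTCCTCGCACTCCTCTTGTGCTATTTCGTGAAAGACATATGCGGC"
last_line = "ACGGGGGCGCGCCGGGGGCGATATCAACGCCTCCTCCATTAG"

print(translate(first_line))          # MQTCWFLALLLCYFVKDICG
print(translate(last_line))           # TGARRGRYQRLLH*
print(expected_protein_length(2262))  # 753
```

The translated fragments match the start and end of the protein sequence listed above, and a CDS length of 2262 corresponds to 753 residues plus a stop codon.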