Monarch geneset OGS2.0

DPOGS215999
Transcript: DPOGS215999-TA, 1008 bp
Protein: DPOGS215999-PA, 335 aa
Genomic position: DPSCF300078 + 326572-337072
RNAseq coverage: 22x (Rank: top 79%)
Annotation
  Heliconius      HMEL021815        9e-83   71.30%
  Bombyx          BGIBMGA000934-TA  1e-128  84.62%
  Drosophila      CG32698-PA        4e-138  75.25%
  EBI UniRef50    UniRef50_Q9W316   6e-136  75.25%  CG32698 n=12 Tax=Arthropoda RepID=Q9W316_DROME
  NCBI RefSeq     XP_972474.2       1e-142  84.12%  PREDICTED: similar to CG32698 CG32698-PA [Tribolium castaneum]
  NCBI nr blastp  gi|189236678      2e-141  84.12%  PREDICTED: similar to CG32698 CG32698-PA [Tribolium castaneum]
  NCBI nr blastx  gi|189236678      7e-142  84.12%  PREDICTED: similar to CG32698 CG32698-PA [Tribolium castaneum]
Group
Gene Ontology
  GO:0008270  4.9e-168  zinc ion binding
  GO:0006730  4.9e-168  one-carbon metabolic process
  GO:0005576  4.9e-168  extracellular region
  GO:0004089  4.9e-168  carbonate dehydratase activity
KEGG pathway
  ptr:740899  3e-40
  K01672 (E4.2.1.1) maps-> Nitrogen metabolism
InterPro domain
  [22-316]  IPR018347  4.9e-168  Carbonic anhydrase, CAH2-like, metazoa
  [22-316]  IPR023561  4.9e-168  Carbonic anhydrase, alpha-class
  [31-295]  IPR001148  1.6e-78   Carbonic anhydrase, alpha-class, catalytic domain
Orthology group
  MCL12344  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS215999-TA
ATGACGACTATAGAAGATAAAAGAAATATTCACAATGAGCTCAACTTAATTTCGTCTGGTTTGATTGCAGCCGTGAGTGGCGTCAGCTGGGAGGAATGGTGGACCTACGACGGCATTTCAGGTCCGGCGTTTTGGGGTCTGATCAATCCGGAATGGTCGCTATGCAACAAGGGACGGAGGCAATCCCCCGTCAACCTCGAGCCCGAGAAGTTACTCTTCGATCCGAACCTGAGATTTTTACATATAGATAAGCATAGAATAAACGGACTGATCAGCAACACTGGCCACTCGGTAATATTCACCGTGGAAAATGAGACTCGCCATCACATAAACATAACGGGTGGACCCCTCTCTTATAAATATCAATTTCATGAAATCCATATTCATTATGGATTACACGATCAATTTGGATCGGAGCACGCTGTCAATGGCTATTCCTTTCCCGCTGAGATACAAATATTTGGTTTCAATTCACAGCTTTATTCAAACTTCTCAGAGGCTTTACATAAAGCTCAAGGAATTGTTTCCATTTCTTTACTCCTGCAGCTAGGGGATTTATCGAATCCCGAGTTAAGAATATTGACAGAGGAGTTAGAAAATATAAAGTACGGAGGCGCCGAGATGCCTGTCAACCGGCTGTCAGTGAGGGGTCTTCTGCCCGATACGGACTATTACATGACGTACGACGGATCAACCACAGCCCCCGCCTGCTACGAGACTGTTACCTGGATAATAATTAACAAACCCATTTACATAACGAAACAACAGCTGCACGCCCTGAGGCGATTGATGCAGGGCGACGCGAGGCACCCGAAGGCCCCGCTCGGGAATAATTTCAGGCCCCCTCAACCACTACACCACCGGGCAGTTAGAACTAACATTGACTTTGACTTGAGCAAGTACCCAGGCAAGACATGCCCCAGCATGCACCGAGACATGCATTACAAGGGTGACGAAAAGTTGGACGCAGTATTGAATATTGGACGCAGTATTGTTTGGTTTCGGTAG

Protein sequence:

>DPOGS215999-PA
MTTIEDKRNIHNELNLISSGLIAAVSGVSWEEWWTYDGISGPAFWGLINPEWSLCNKGRRQSPVNLEPEKLLFDPNLRFLHIDKHRINGLISNTGHSVIFTVENETRHHINITGGPLSYKYQFHEIHIHYGLHDQFGSEHAVNGYSFPAEIQIFGFNSQLYSNFSEALHKAQGIVSISLLLQLGDLSNPELRILTEELENIKYGGAEMPVNRLSVRGLLPDTDYYMTYDGSTTAPACYETVTWIIINKPIYITKQQLHALRRLMQGDARHPKAPLGNNFRPPQPLHHRAVRTNIDFDLSKYPGKTCPSMHRDMHYKGDEKLDAVLNIGRSIVWFR-
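
The lengths reported for this record (1008 bp transcript, 335 aa protein) can be cross-checked with simple codon arithmetic. The sketch below is illustrative only. It assumes the transcript is coding sequence with no untranslated regions and that the trailing "-" in the protein sequence marks a stop codon. The function name is hypothetical and not part of any MonarchBase tool.

```python
def coding_lengths_consistent(transcript_bp: int, protein_aa: int,
                              has_stop_codon: bool = True) -> bool:
    """Return True if transcript_bp bases encode exactly protein_aa residues.

    Assumes the transcript contains only coding sequence (no UTRs), so its
    length is three bases per residue plus, optionally, one stop codon.
    """
    codons = protein_aa + (1 if has_stop_codon else 0)
    return transcript_bp == 3 * codons


# Lengths as reported in this record for DPOGS215999-TA and DPOGS215999-PA.
print(coding_lengths_consistent(1008, 335))  # prints True: 3 * (335 + 1) == 1008
```

Under these assumptions the two reported lengths agree. A mismatch would suggest untranslated regions in the transcript or a truncated gene model.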