Monarch geneset OGS2.0

DPOGS213760
Transcript        DPOGS213760-TA  867 bp
Protein           DPOGS213760-PA  288 aa
Genomic position  DPSCF300212 - 274276-282590
RNAseq coverage   370x (Rank: top 32%)
Annotation
Heliconius      HMEL012807        1e-107  74.52%
Bombyx          BGIBMGA009245-TA  5e-146  96.80%
Drosophila      CG1402-PB         2e-117  70.76%
EBI UniRef50    UniRef50_Q9W3P7   3e-115  70.76%  CG1402 n=42 Tax=Pancrustacea RepID=Q9W3P7_DROME
NCBI RefSeq     XP_001870284.1    8e-123  72.20%  carbonic anhydrase [Culex quinquefasciatus]
NCBI nr blastp  gi|347964361      8e-122  72.56%  AGAP000715-PA [Anopheles gambiae str. PEST]
NCBI nr blastx  gi|347964361      3e-121  72.56%  AGAP000715-PA [Anopheles gambiae str. PEST]
Group
Gene Ontology    GO:0008270  3.5e-144  zinc ion binding
                 GO:0006730  3.5e-144  one-carbon metabolic process
                 GO:0005576  3.5e-144  extracellular region
                 GO:0004089  3.5e-144  carbonate dehydratase activity
KEGG pathway     isc:IscW_ISCW015100  2e-33
                 K01672 (E4.2.1.1)  maps-> Nitrogen metabolism
InterPro domain  [2-287]  IPR018347  3.5e-144  Carbonic anhydrase, CAH2-like, metazoa
                 [2-287]  IPR023561  3.5e-144  Carbonic anhydrase, alpha-class
                 [2-242]  IPR001148  3.2e-68   Carbonic anhydrase, alpha-class, catalytic domain
Orthology group  MCL16838  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS213760-TA
ATGTGTAACAAGGGCCGGCGGCAGAGTCCCGTCAACATAGAACCCGATAAATTACTCTTTGACCCCTGGCTGAGAGACATACAGTTTGATAAACATAAGGTCAGCGGTGTTCTTCAAAACACTGGCCAATCACTGGTCTTCCGAGTCGAAAAGGACAGCAAACACCAGGTCAACATTAGCGGTGGGCCACTGTCTTACAGATACCAGTTCGAGGAGATATATTTCCATTATGGTTTGGAAGACAACCGGGGCTCTGAACACCAGATTGATCATCATACCTTTCCTGGAGAGATACAGTTATACGGCTTCAACAAGGAATTATATCATAACATGTCAGAAGCGCAACACAAGTCCCAGGGGGTGGTAGGAATATCACTAATGGTTCAAATAGGAGAACCTACTAATAAGGAACTGCGTCTTATAACCAGCGCCTTCAACAAAGTTACTTACAGAGGCAGTTCCTTCGCCATAAAACACCTACCGCTTAGTTCGTTGCTACCCAATACGCAGCAATATCTCACGTATGAAGGTTCCACCACTCACCCAGGATGCTGGGAGACCGCTGTTTGGATCATCTTCAACAAACCGATCTATATATCAAAGCAAGAGATGTACGCAATTCGTCGTCTGATGCAAGGGTCTCAACTGACCCCAAAGGCCCCATTGGGAAATAATGCTCGCCCGGTTCAGCCTCTGCACCATCGCACTGTCAGGACAAATATCAACTTCAACAAGCAAGGGATGCCGGTATCGAGTAACTGTCCTGATATGTATAGAAATATGCATTACACAGCTACTCAGTGGCCAAGAGAGCACAGCATGAGATACAGGAGCACTGAGGACCTGGCGATGCTGTCATTAAATTAG

Protein sequence:

>DPOGS213760-PA
MCNKGRRQSPVNIEPDKLLFDPWLRDIQFDKHKVSGVLQNTGQSLVFRVEKDSKHQVNISGGPLSYRYQFEEIYFHYGLEDNRGSEHQIDHHTFPGEIQLYGFNKELYHNMSEAQHKSQGVVGISLMVQIGEPTNKELRLITSAFNKVTYRGSSFAIKHLPLSSLLPNTQQYLTYEGSTTHPGCWETAVWIIFNKPIYISKQEMYAIRRLMQGSQLTPKAPLGNNARPVQPLHHRTVRTNINFNKQGMPVSSNCPDMYRNMHYTATQWPREHSMRYRSTEDLAMLSLN-
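
Length check:

The transcript and protein lengths listed for this gene are consistent with each other: 867 bp is 289 codons, which gives 288 amino acids plus one stop codon (the trailing "-" in the protein sequence). A minimal Python sketch of that arithmetic follows. The function name is hypothetical, and it assumes the transcript is a complete coding sequence ending in a single stop codon.

```python
def expected_protein_length(cds_bp: int) -> int:
    """Return the number of residues encoded by a complete CDS of cds_bp bases.

    Assumption: the CDS runs from the start codon through exactly one
    stop codon, so the stop codon is subtracted from the codon count.
    """
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_bp // 3 - 1


# 867 bp transcript -> 289 codons -> 288 residues
print(expected_protein_length(867))
```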