Monarch geneset OGS2.0

DPOGS211602
Transcript: DPOGS211602-TA (1434 bp)
Protein: DPOGS211602-PA (477 aa)
Genomic position: DPSCF300232 - 280336-287491
RNAseq coverage: 2072x (Rank: top 6%)
Annotation
Heliconius       HMEL016243        0.0  90.38%
Bombyx           BGIBMGA008214-TA  0.0  79.49%
Drosophila       l(1)G0255-PA      0.0  74.78%
EBI UniRef50     UniRef50_Q9W3X5   0.0  74.51%  CG4095 n=121 Tax=cellular organisms RepID=Q9W3X5_DROME
NCBI RefSeq      XP_967085.1       0.0  76.24%  PREDICTED: similar to AGAP001884-PA [Tribolium castaneum]
NCBI nr blastp   gi|91084043       0.0  76.24%  PREDICTED: similar to AGAP001884-PA [Tribolium castaneum]
NCBI nr blastx   gi|332374940      0.0  76.62%  unknown [Dendroctonus ponderosae]
Group
Gene Ontology:
  GO:0045239  6.1e-222  tricarboxylic acid cycle enzyme complex
  GO:0006106  6.1e-222  fumarate metabolic process
  GO:0004333  6.1e-222  fumarate hydratase activity
  GO:0003824  2.7e-159  catalytic activity
  GO:0006099  9.9e-26   tricarboxylic acid cycle
  GO:0016829  9.9e-26   lyase activity
KEGG pathway:
  tca:655448  0.0
  K01679 (E4.2.1.2B, fumC) maps to:
    Pathways in cancer
    Citrate cycle (TCA cycle)
    Reductive carboxylate cycle (CO2 fixation)
    Renal cell carcinoma
InterPro domain:
  [16-475]   IPR005677  6.1e-222  Fumarate hydratase, class II
  [16-476]   IPR008948  2.7e-159  L-Aspartase-like
  [24-357]   IPR022761  3.5e-112  Lyase 1, N-terminal
  [15-153]   IPR024083  2.9e-58   L-Aspartase-like, N-terminal
  [149-167]  IPR000362  6.4e-35   Fumarate lyase
  [422-476]  IPR018951  9.9e-26   Fumarase C, C-terminal
Orthology group: MCL10518, Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS211602-TA
ATGGCGTGTTCCACTTGTTTGGAACAGTCCGGTAAAACTGTAAAAATGCGTAAAGAGCGTGATACTTTCGGCGAGCTGGATGTGCCTGATGACAGGCTGTATGGCGCCCAAACGGTCAGGTCCGTCATGAATTTCCCCATCGGTGGCATCGAGGAAAGAATGCCGTACCCTGTCATCGTGGCCTTTGGTATCCTGAAGAAGGCTGCTGCTAAAGTGAACACGGAATATGGCTTGGACAAGAAAATTGCGGATGCGATCATCGAGGCGTGTGATGACGTCATATCCGGGAAGCTGTACCGAGAAGGCCACTTCCCCCTCGTCATCTGGCAGACTGGATCTGGCACACAGTCCAACATGAACACCAATGAGGTGATATCCAACCGAGCCATTCAGATCCTGGGTGGTAAGCTTGGCTCTAAGACGCCCGTTCATCCCAACGACCACGTGAACAAATCTCAGAGCTCCAACGACACCTACCCAGCGGCGATGCACATCGCTGTGGCCATGGAGCTGAGGGATAGGCTGATGCCCGGTCTCGTAGCGCTGAGGGACACCCTCGAGAAGAAGTCCAAGGAGTTTGAGAAGATCATCAAGATCGGAAGAACCCACTTGATGGATGCCGTTCCTCTCACCCTCGGCCAGGAGTTCAGCGGGTACGCGACCCAGCTCACGTACGGGATAGAGCGAGTGTGCACCACGCTGCCTCGACTGCACTACCTGGCCCTGGGCGGAACGGCGGTCGGCACGGGACTCAACACCAGGAAGGGATTCGCTGAGAAATGCGCTAAGGAAATCGCCACACTTACTGGCATCCCCTTCGAAACCGCCCCCAACAAGTTCGAGGCCCTGGCGGCCCACGACTCCATGGTGGAGGTCCACGGGGCCCTCAACACCATAGCGGTGTCCCTGATGAAGATCGCCAACGACATCCGCATGCTGGCTTCCGGTCCGCGCTGCGGCCTCGCCGAGTTAATGCTGCCTGAGAACGAGCCCGGATCCTCTATCATGCCGGGCAAAGTGAACCCGACTCAGTGCGAAGCGCTTACAATGTTGGCCGCTCAGGTGATGGGTAACCACGTGGCCTGCACCATCGGTGGCAGCAACGGACACTTTGAACTGAACGTCTTCAAGACAATGATGGTGGCCAACATGTACACTCAGATATTTTTAGGCGACGGCTGCCAGGCGTTCAACAAGAACTGCGCTGTCGGCATTCAGGCTAACATTGCACAGATCAATAAGATCATGAAGGAGTCACTAATGTTGGTGACGGCACTCAATCCACACATTGGATATGATAAGGCTGCGTTGATCGCTAAGACAGCACACAAGGAAGGTGGAACCCTGAAGGACACGGCTATCAAACTAGGAATCCTCACCGCTGAACAGTTCGACCAATGGGTCAGGCCGGAGGACATGCTCGGCCCCAAGTAG

Protein sequence:

>DPOGS211602-PA
MACSTCLEQSGKTVKMRKERDTFGELDVPDDRLYGAQTVRSVMNFPIGGIEERMPYPVIVAFGILKKAAAKVNTEYGLDKKIADAIIEACDDVISGKLYREGHFPLVIWQTGSGTQSNMNTNEVISNRAIQILGGKLGSKTPVHPNDHVNKSQSSNDTYPAAMHIAVAMELRDRLMPGLVALRDTLEKKSKEFEKIIKIGRTHLMDAVPLTLGQEFSGYATQLTYGIERVCTTLPRLHYLALGGTAVGTGLNTRKGFAEKCAKEIATLTGIPFETAPNKFEALAAHDSMVEVHGALNTIAVSLMKIANDIRMLASGPRCGLAELMLPENEPGSSIMPGKVNPTQCEALTMLAAQVMGNHVACTIGGSNGHFELNVFKTMMVANMYTQIFLGDGCQAFNKNCAVGIQANIAQINKIMKESLMLVTALNPHIGYDKAALIAKTAHKEGGTLKDTAIKLGILTAEQFDQWVRPEDMLGPK-
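
The transcript and protein lengths reported in this record can be cross-checked against each other: a 477 aa protein plus one stop codon should correspond to (477 + 1) x 3 = 1434 bp of coding sequence. The following is a minimal Python sketch of that check, not part of the original record. The function and variable names are hypothetical, and it assumes the transcript record holds only the coding sequence (no UTRs), which the matching lengths suggest but the record does not state.

```python
TRANSCRIPT_BP = 1434  # from the Transcript row of the record
PROTEIN_AA = 477      # from the Protein row of the record

# First and last 12 nt of DPOGS211602-TA, copied from the nucleotide record.
CDS_START = "ATGGCGTGTTCC"
CDS_END = "GGCCCCAAGTAG"

# Stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def expected_cds_length(protein_aa: int) -> int:
    """Length in bp of a CDS encoding protein_aa residues plus one stop codon."""
    return (protein_aa + 1) * 3


def codons(seq: str) -> list[str]:
    """Split a sequence into consecutive 3-nt codons."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


# Assumption: the transcript is CDS-only, so both fragments are in frame.
length_ok = expected_cds_length(PROTEIN_AA) == TRANSCRIPT_BP
starts_with_atg = codons(CDS_START)[0] == "ATG"
ends_with_stop = codons(CDS_END)[-1] in STOP_CODONS

print(length_ok, starts_with_atg, ends_with_stop)
```

The trailing "-" at the end of the protein record is consistent with the terminal TAG stop codon of the nucleotide record.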