Monarch geneset OGS2.0

DPOGS215394
Transcript: DPOGS215394-TA (945 bp)
Protein: DPOGS215394-PA (314 aa)
Genomic position: DPSCF300088 + 25663-28274
RNAseq coverage: 478x (Rank: top 26%)
Annotation
Heliconius      HMEL010470        6e-144  77.10%
Bombyx          BGIBMGA002355-TA  5e-137  75.92%
Drosophila      Nmdmc-PB          2e-94   57.63%
EBI UniRef50    UniRef50_D4A1Y5   1e-96   58.14%  RCG56426 n=23 Tax=Euteleostomi RepID=D4A1Y5_RAT
NCBI RefSeq     XP_001687783.1    6e-105  59.42%  AGAP004677-PA [Anopheles gambiae str. PEST]
NCBI nr blastp  gi|158293264      1e-103  59.42%  AGAP004677-PA [Anopheles gambiae str. PEST]
NCBI nr blastx  gi|157107649      2e-101  61.46%  methylenetetrahydrofolate dehydrogenase [Aedes aegypti]
Group
Gene Ontology
GO:0004488  9e-81    methylenetetrahydrofolate dehydrogenase (NADP+) activity
GO:0003824  9e-81    catalytic activity
GO:0055114  9e-81    oxidation-reduction process
GO:0009396  9e-81    folic acid-containing compound biosynthetic process
GO:0005488  4.5e-42  binding
KEGG pathway
aga:AgaP_AGAP004677  2e-104
K13403 (MTHFD2) maps to:
    One carbon pool by folate
    Glyoxylate and dicarboxylate metabolism
InterPro domain
[48-70]    IPR000672  9e-81    Tetrahydrofolate dehydrogenase/cyclohydrolase
[138-311]  IPR020631  3e-62    Tetrahydrofolate dehydrogenase/cyclohydrolase, NAD(P)-binding domain
[154-293]  IPR016040  4.5e-42  NAD(P)-binding domain
[18-135]   IPR020630  1.2e-39  Tetrahydrofolate dehydrogenase/cyclohydrolase, catalytic domain
Orthology group: MCL11029 Multiple-copy universal gene

Nucleotide sequence:

>DPOGS215394-TA
ATGAGAAGTCACACATACAATGAATCTTTGGGCACAGCCAGGATCATGGCAAGAATCCTGGATGGAAAAGCACTGGCCAAAGAAATAAAGGATGAAGTAAAGATAGAAATCAACAATTGGATCAGGAATGGCAACCCTGCACCAACTCTACGCTGCATTATAGTGGGAGAAGACCCTGCCAGTCATACTTATGTTAGAAACAAAATACAGGCAGCACGTGAAGTTGGCATTGAAGCAGAAACAATAAGATATGATGCTAATTTAACGGAAGAGGCATTGATGTCTATAATAGATTCATTGAATAAGGATAAAAATATTAACGGGATATTAGTGCAGCTGCCTTTACCAAACTCTATTGATGAGAGGAGAGTGTGTAATGCTTTGGCTCCGGAGAAGGATGTTGATGGTTTTCATATAACAAATGTTGGTCAGTTGTGTCTGGACATGCCCACCATTGTACCAGCCACGGCCCTCGCTGTGGTTGAAATGCTGAGAAGATTCAAAATTGATACATTCGGTCGCAACACGGTGGTTATAGGACGATCTAAGAACGTTGGCATGCCAATAGCCATGATGCTACATTCTGACGGCCGCCATGACAGTGGTCTCGGTATGGACTCAACTGTCACTATCTGTCACAGACACACGCCTGCTGATCAATTGGAGTTTTATTGTCGACACGCTGATATTATTGTTAGCGCAACCGGTCTACCAAAGCTTATTAAGGCTGATATGGTGAAGCCAGGAGCCACGATCATAGACGTCGGCATCACCAGAGTAACTGATGAAAGCGGAAAGACAAAACTTGTTGGCGATGTTGATTATGACGAGGTAGTTAAAGTGGCCGGTGCTATAACCCCCGTCCCTGGTGGAGTGGGTCCCATGACAGTCGCCATGTTAATGAAGAACACTCTTCAAGCTGCGCAGCATAACTCATTTAAATGA

Protein sequence:

>DPOGS215394-PA
MRSHTYNESLGTARIMARILDGKALAKEIKDEVKIEINNWIRNGNPAPTLRCIIVGEDPASHTYVRNKIQAAREVGIEAETIRYDANLTEEALMSIIDSLNKDKNINGILVQLPLPNSIDERRVCNALAPEKDVDGFHITNVGQLCLDMPTIVPATALAVVEMLRRFKIDTFGRNTVVIGRSKNVGMPIAMMLHSDGRHDSGLGMDSTVTICHRHTPADQLEFYCRHADIIVSATGLPKLIKADMVKPGATIIDVGITRVTDESGKTKLVGDVDYDEVVKVAGAITPVPGGVGPMTVAMLMKNTLQAAQHNSFK-
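
The transcript and protein lengths listed above are consistent with each other: 945 bp is 315 codons, i.e. 314 amino acids plus one stop codon. The sketch below shows one way to check a coding sequence against its protein record. It assumes the standard nuclear genetic code and assumes that the trailing "-" in the protein record marks the stop codon; `CODON_TABLE` and `translate` are illustrative names, not part of the database.

```python
from itertools import product

# Standard genetic code (assumed), listed in the usual T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence, writing stop codons as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


# First 18 and last 12 nucleotides of DPOGS215394-TA, copied from the record.
print(translate("ATGAGAAGTCACACATAC"))  # MRSHTY
print(translate("TCATTTAAATGA"))        # SFK-

# Length check: 314 residues plus one stop codon account for all 945 bp.
print(3 * (314 + 1) == 945)             # True
```

Passing the full nucleotide sequence to `translate` and comparing the result with the protein sequence extends the same check to the whole record.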