Monarch geneset OGS2.0

DPOGS211321
Transcript        DPOGS211321-TA   2802 bp
Protein           DPOGS211321-PA   933 aa
Genomic position  DPSCF300125 + 101162-107444
RNAseq coverage   1662x (Rank: top 8%)
Annotation
Heliconius       HMEL021318         0.0   86.30%
Bombyx           BGIBMGA004950-TA   0.0   81.53%
Drosophila       pug-PD             0.0   68.21%
EBI UniRef50     UniRef50_O96553    0.0   68.21%   C-1-tetrahydrofolate synthase, cytoplasmic n=537 Tax=root RepID=C1TC_DROME
NCBI RefSeq      XP_002097109.1     0.0   68.64%   GE26043 [Drosophila yakuba]
NCBI nr blastp   gi|41016826        0.0   82.93%   methylenetetrahydrofolate dehydrogenase [Spodoptera frugiperda]
NCBI nr blastx   gi|41016826        0.0   82.93%   methylenetetrahydrofolate dehydrogenase [Spodoptera frugiperda]
Group
Gene Ontology    GO:0005524   9e-248    ATP binding
                 GO:0004329   9e-248    formate-tetrahydrofolate ligase activity
                 GO:0009396   9e-248    folic acid-containing compound biosynthetic process
                 GO:0004488   2.8e-72   methylenetetrahydrofolate dehydrogenase (NADP+) activity
                 GO:0003824   2.8e-72   catalytic activity
                 GO:0055114   2.8e-72   oxidation-reduction process
                 GO:0005488   3.3e-52   binding
KEGG pathway     dya:Dyak_GE26043   0.0
                 K00288 (MTHFD) maps-> One carbon pool by folate
                                       Glyoxylate and dicarboxylate metabolism
InterPro domain  [315-933]   IPR000559   9e-248    Formate-tetrahydrofolate ligase, FTHFS
                 [35-57]     IPR000672   2.8e-72   Tetrahydrofolate dehydrogenase/cyclohydrolase
                 [128-292]   IPR020631   1.1e-66   Tetrahydrofolate dehydrogenase/cyclohydrolase, NAD(P)-binding domain
                 [145-274]   IPR016040   3.3e-52   NAD(P)-binding domain
                 [4-124]     IPR020630   1.2e-32   Tetrahydrofolate dehydrogenase/cyclohydrolase, catalytic domain
Orthology group  MCL10547   Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS211321-TA
ATGTCTGCTCAAGTGATTTCTGGTACACAAACAGCTAGATCTATAGAAAATGACCTCCGTCAGCAAGTGGCCGTCATGGGTCAACAGCACCCTGGCTTCCAACCAAAGCTGGCTATCGTGCAGGTCGGAGGACGAGAAGACTCCAACGTTTACATCCGAGCCAAGCTGAAAGCAGCAGAAAACATCGGCATAGCTGCTGAACACATCAAACTTCCCAGAGAAATCTCGCAGGCTGAGCTACTTACAAAGTTGACAGCTCTAAATGATTCGCCGTTAGTACATGGCATAATAGTTCAGATGCCACTGGATTCGGTCGAGAAAATCGACTCGCATCTCATCACCGACGCTGTCTCCTCGCAGAAAGATGTTGACGGATTGAATACTGAAAACGAAGGACGTGTGGCCCTCGGTGATATGTCAGGCTTCGTTTCTTGCACCCCAGCTGGTTGTATAGAACTCATCAAACGTACTGGAATCTCCATCGAAGGCAAACAGGCGGTGGTTATCGGACGCAGTAGGATCGTTGGAACACCAGTAGCTGAACTTCTCAAGTGGGAAAACGCCACTGTTACCGTTTGCCACTCGAAGACTAAGAACTTAAGTGAAATTACCAAAACTGCTGATATTTTAGTGGTAGCGATTGGTAAAGCAGAAATGGTTCGTGGCTCTTGGATTAAACCGGGGGCGGTGGTAATAGACTGCGGTATTAATCCCATCCCAGATACATCAAAACCCAGCGGCCGGAGGTTGGTAGGTGACGTGGCATATTCCGAGGCGGTACAGGTCGCGTCGCATGTAACCCCTGTACCCGGCGGTGTGGGTCCCATGACTGTGGCTATGTTGATGAAAAACACCGTGTTGGCTGCTAGCAGACAACTCCAACGGATCTCTACACCCGTGTGGCCGCTGCAGCCGCTTAGACTTAGCACGGTTTCGCCACCTCCAAGCGACATTGTTATAGCGCGTTCTCAAAAACCTAAATATATTAGTAAGTTGGCGGAGGAGATAGGATTGTTCCCCAGTGAGGTGTCACAATATGGTAATACCAAGGCGAAAATATCTTTGTCTGTGCTGGATCGTCTCCGAGATCAGCGTGGCGGAAAATACATCGTCGTGGCTGGCATAACCCCCACTCCTCTCGGTGAGGGTAAGAGTACGACGTTGATCGGTCTGGTGCAGGCTCTGGGTGCTCATCGCGGAAGGAACGCCTTCGCCGTCATGCGTCAGCCCAGTCAGGGACCAACCTTCGGAGTCAAGGGCGGAGCCGCTGGCGGAGGATACTCACAGGTCATTCCTATGGAAGATTTCAACCTTCATCTCACTGGTGACATTCACGCCGTTTCTGCAGCCAACAATCTCCTCGCAGCTCACATGGATGCCAGGATCTTCCATGAGCTAACACAAAAAGACGGTCCTCTGTATGATCGTTTGGTGCCAGAAATTAAAGGAGTCAGAAAATTCTCCCCCATTCAGTTGAGAAGATTAAAGAGATTGGGAATCGAAAAGACCGATCCGAACGCCTTAACACCAGAAGAAAGAGTTAAATTTGCACGACTTAACATTGACCCCAAAAAAGTTATGTGGAATAGAGTCGTGGATTTGAACGATAGATATTTACGTAAAATTACTATCGGACAATCACCCACTGAGAAAGGTTTTACCCGCGAGACTAGTTTTGACATCGCCGTAGCGTCTGAAATTATGGCTGTGTTGGCTCTGGGCAAGGATGTGAATGATATTAAGGAGAGACTCGCGAATATGGTGGTAGCTCTGGACACAAACGGCAAACCAGTAATAGCTGATGATCTTGGCATTACAGGGGCTTTAATGGTGTTGCTTAAGGACGCATTTGAGCCCACATTGATGCAGACTTTGGAAGGTACTCCTGTATTGGTCCACACGGGACCGTTCGCCAACATAGCTCATGGATGCTCCTCTATACTTGCCGATAAGATAGCCATGAAACTGGCCCGAGAAAATGGCTATGTGGCAACTGAAGC
CGGCTTTGGATCTGACATCGGTATGGAAAAGTTCTTTGATATAAAGTGTCGTTCGAGCGGCGACACCCCTCACTGCGCTGTCATCGTGAGTACAGTCCGCGCGCTCAAGATGCACGGCGGAGGACCTACCGTCAGCCCTGGACAACCGCTCCACTCAGTATATGTCCAAGAAAACTTGGAACTGCTTAGCAAAGGACTGTGCAATTTAGGAAAACACATCAGCAACGGCAATAAGTTTGGCGTTCCTGTCGTTATTGCTGTTAACAAACACGGAAACGACACAGAAGCAGAACTGAACATGGTTAGAGAATATGCCTTGAAAAATGGAGCATTCCGTGCTGTTATTTGCGATCACTGGGCTAAGGGAGGCGCTGGCGCCTTGGAACTAGCGGACGCGGTCGTAGAAGCCTGCGACCGTCCCTCGAACTTCCAATATCTCTATCCATTGGAAATGACGATCCAAGATAAAATTAAGAAGATCGCTGTAGAGATGTACGGAGCTGGGACAGTGGAATACACAGATGTGGTTTTGGAGAAAATTAAAGTTTTGAATGATAGGGGCTACGATAAGCTGGCGATATGTATGGCCAAGACTTCTAATTCGCTGACCGGCGACCCCAGTATCAAGGGTGCTCCTACCGGATTCACTCTTCGTATCAATGATATTTTCGTGTCTGCGGGCGCTGGTTTTATTGTTCCTATGGTTGGCGAGATATCCAAAATGCCTGGCCTTCCTACAAGACCCAGCATCTACGATATAGATCTGAACACCGAGACCGGTGAAATCGATGGCCTTTTTTAA

Protein sequence:

>DPOGS211321-PA
MSAQVISGTQTARSIENDLRQQVAVMGQQHPGFQPKLAIVQVGGREDSNVYIRAKLKAAENIGIAAEHIKLPREISQAELLTKLTALNDSPLVHGIIVQMPLDSVEKIDSHLITDAVSSQKDVDGLNTENEGRVALGDMSGFVSCTPAGCIELIKRTGISIEGKQAVVIGRSRIVGTPVAELLKWENATVTVCHSKTKNLSEITKTADILVVAIGKAEMVRGSWIKPGAVVIDCGINPIPDTSKPSGRRLVGDVAYSEAVQVASHVTPVPGGVGPMTVAMLMKNTVLAASRQLQRISTPVWPLQPLRLSTVSPPPSDIVIARSQKPKYISKLAEEIGLFPSEVSQYGNTKAKISLSVLDRLRDQRGGKYIVVAGITPTPLGEGKSTTLIGLVQALGAHRGRNAFAVMRQPSQGPTFGVKGGAAGGGYSQVIPMEDFNLHLTGDIHAVSAANNLLAAHMDARIFHELTQKDGPLYDRLVPEIKGVRKFSPIQLRRLKRLGIEKTDPNALTPEERVKFARLNIDPKKVMWNRVVDLNDRYLRKITIGQSPTEKGFTRETSFDIAVASEIMAVLALGKDVNDIKERLANMVVALDTNGKPVIADDLGITGALMVLLKDAFEPTLMQTLEGTPVLVHTGPFANIAHGCSSILADKIAMKLARENGYVATEAGFGSDIGMEKFFDIKCRSSGDTPHCAVIVSTVRALKMHGGGPTVSPGQPLHSVYVQENLELLSKGLCNLGKHISNGNKFGVPVVIAVNKHGNDTEAELNMVREYALKNGAFRAVICDHWAKGGAGALELADAVVEACDRPSNFQYLYPLEMTIQDKIKKIAVEMYGAGTVEYTDVVLEKIKVLNDRGYDKLAICMAKTSNSLTGDPSIKGAPTGFTLRINDIFVSAGAGFIVPMVGEISKMPGLPTRPSIYDIDLNTETGEIDGLF-