Monarch geneset OGS2.0

DPOGS208845
Transcript         DPOGS208845-TA   2577 bp
Protein            DPOGS208845-PA   858 aa
Genomic position   DPSCF300036 + 888832-902739
RNAseq coverage    265x (Rank: top 40%)
Annotation
Heliconius         HMEL015416         2e-153   70.45%
Bombyx             BGIBMGA007951-TA   3e-144   58.67%
Drosophila         CG2543-PA          2e-80    38.38%
EBI UniRef50       UniRef50_C3ZPE3    2e-93    42.14%   Folylpolyglutamate synthase (Fragment) n=2 Tax=Bilateria RepID=C3ZPE3_BRAFL
NCBI RefSeq        XP_001180391.1     3e-91    46.67%   PREDICTED: similar to Folylpolyglutamate synthase [Strongylocentrotus purpuratus]
NCBI nr blastp     gi|47087409        3e-97    42.43%   folylpolyglutamate synthase, mitochondrial [Danio rerio]
NCBI nr blastx     gi|47087409        7e-93    42.52%   folylpolyglutamate synthase, mitochondrial [Danio rerio]
Group
Gene Ontology      GO:0004326   2.1e-142   tetrahydrofolylpolyglutamate synthase activity
                   GO:0005524   2.1e-142   ATP binding
                   GO:0009396   2.1e-142   folic acid-containing compound biosynthetic process
                   GO:0009058   3.2e-73    biosynthetic process
                   GO:0035434   1.8e-26    copper ion transmembrane transport
                   GO:0016021   1.8e-26    integral to membrane
                   GO:0005375   1.8e-26    copper ion transmembrane transporter activity
                   GO:0016874   2.4e-13    ligase activity
KEGG pathway       dre:406746   5e-98
                   K01930 (E6.3.2.17) maps-> Folate biosynthesis
InterPro domain    [25-479]    IPR001645   2.1e-142   Folylpolyglutamate synthetase
                   [59-341]    IPR013221   3.2e-73    Mur ligase, central
                   [484-627]   IPR007274   1.8e-26    Ctr copper transporter
                   [342-480]   IPR004101   2.4e-13    Mur ligase, C-terminal
Orthology group    MCL13777   Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS208845-TA
ATGCTAAGCATTGGGATTGCTCGTGTTATAGTAGTAAATAGAGTATTCAGTTTGAGGATGACTTCCACCAAGATGTCCTATGAGGATGCTGTTCAGAAGTTAAACTTGCTTCAGTCTAATAAATCCACTATTGACCAAATTCGAAAAGGCCTTAGAAATGGTGAAAAATGTAATAACCTAAAAGATATGGAAAACTATTTATCACGAACAGGAGTCACCATGTCTATGTTAGACACTCTGTCTGTCATACATGTGGCAGGCACCAAGGGCAAGGGTTCAACTAGTGCCATGTGTGACTCTATACTACGTTCATATGGATTCCGTACTGGTTTTTACTCATCTCCACATCTTGTAGCTGTCAGGGAGAGAATACAACTTGAGGGACAAATACTTTCCAAAGAAAAATTTGCTTATTACTTTCACCAGGTGTATGGATCTCTCTCGGCTACACAGGCTTTTGAAGGTGATATGCCCAAATATTTTAGTTTCTTAACAGTTTTGGCCTTCAATGTATTTCTGAAAGAGAAAGTAGATGTAGCTATTATTGAAGTGGGTATTGGTGGTGTAGTTGATTATACTAATGTTTTAAGAAAGGTTCCAGTTGTGGGTATAACTCCATTGGGCTTGGATCATACATCAATACTAGGTAATACATTGCCCGAGATTGCAGCTGCAAAGGCTGGAATAATGAAACCTGGCTGTGAAGCATATACCGTCCACCAAGCCCCAGAGGCGATGGATGTCTTAGAAAAAGTTGCAAAAGACGTTAAGTGTTCTCTAAATATAGTTCCTGAACTTAGTTCTTATAAATATGAAAACGGGCTCAAGCTCTCCATATATTTAGAAGCCTATGCAATGAATGCCTCTCTTGCCATACAACTTTCACATGCTTGGATACGAAGAACAAGGGGAAGCATAAAACCAATGGTGCCAAGGAATGGTACTTGCAAAATAGATGATCATATCTTGGTTGACGTCTTAACAAAAGAAACTGTAAAAGGTTTAAAAGAATTTAGATGGCCTGGCCGGTACCAGATTGTTAAGACAGATTATGCCGAGTTTTATTTGGATGGAGCACACACAAAGGAGTCAATGGATATCTGTGCAAAATGGTTCACAGATAGCAACAGATCTTCTGCTCAAGCACTGATCTTCAGCGCAACCGGTGATCGTGATTCAAAAGTTTTATTAGAATCACTGAGAGATATAGATTTTCATAAAGTATATTTTGTTATACCCAGTTCCTACAAAAAGTTGAGCAAAAATAATGACAATTTTTATATGATGGAACACAAAGATCTTCTAACGAGATGTAAGAGTCAAGCCTCTATATGGAAGAACATTAATGGAAACTCTATAGTGAATGTATATGAATGTGTCGCTGATGCATTAGAAAGTATAAAGGAAATTAAAGCCGACAGTTCCGTTCTTGTTACTGCTGAAGTCATGATGCACATGTGGATGTGGTTTGGCTATGACCTGGGCGACTTCCTGTTTTCTGGTCTGGTCGTCAACACGAGGTGGTCGTTCGCTCTCACCTGGATAGTGCTGTTTTTCGTGGCTCTGCTCTTCGAGGGATCGAAGGTATATCTCGCTCGTGTTCAGCGCGAGGCACTTCGGAAATTGCGTCCTCACGGCTCTGACGAAAGACGGAACTTGTTATGTGAGCGCGATCGAGAGCAAGCGAACGCCATGGAAGCGACGACCAGTCGCAACACAAGCTCGGGGCAAGTCAGCAAGACCCTGGTGAACGGTCATCAGACCCTGGTGTTCGTCGTCCATAACGTGGTGGGGTACCTGCTGATGTTGGCCGTCATGATCTACAACGTTCACCTCATGCTGGCCGTGGTCTTCGGCATGATGTTGGGTTACTTCCTGTTCGGGACCAAGTTAACTCGCCTCCAGATGCAATGCTTCAGCACCAAGCGCGTCGTCATCTGTACGCCGGAATGCGACGACACTGACCGCTGCAACACGGAGAAGTGCAGCTGGGAGGTGTG
CGAGGGCAAGCTCTGCAAGGAAATGGACGCGCTAGAACTCCAACAGGCAGACACCGCGTACACACGAAAGTACACGCCCACTACCATGTTATATGTAGGAACTGAAAACTCAACGCCTCCGCTACTAAACACTTCCATGGATTCCCAGTCGTCTGACATATTTGTTTGTCAAACTCGTACCTGTATACAGCCCTCTCACTACTTTCCCGCCAATACCCAAGATGGCGGCGAACACAGCAACGCCAGCTGTTACTATGGCGCCAAACGATGTCCGTCTAAGGTTGCCAAGTTTAAAAAGATGCAGTCCGGTGTCCATTGTCACCACGACAGCGAAAAGGAAGACAGTCCTAGCGTCGAGGATGCGCAACTTCTTCATAGTGAGCGGCGCGGCTGTTGCAAGAAATTAGAACCTCCTCCAGAAGAACAACGCTGCAAATCTAGTCAAAACACGACAGTCGTCATCCACGAAGAGAGCCATTCGGAGGACGCCGACACCCAAAGCAGAGAGAACAGTCCTCAGATCTCGTGCTGTCACTCCAAGTCGACGCGGGACAGCCAAGAGCAGATAGTCACGTGA

Protein sequence:

>DPOGS208845-PA
MLSIGIARVIVVNRVFSLRMTSTKMSYEDAVQKLNLLQSNKSTIDQIRKGLRNGEKCNNLKDMENYLSRTGVTMSMLDTLSVIHVAGTKGKGSTSAMCDSILRSYGFRTGFYSSPHLVAVRERIQLEGQILSKEKFAYYFHQVYGSLSATQAFEGDMPKYFSFLTVLAFNVFLKEKVDVAIIEVGIGGVVDYTNVLRKVPVVGITPLGLDHTSILGNTLPEIAAAKAGIMKPGCEAYTVHQAPEAMDVLEKVAKDVKCSLNIVPELSSYKYENGLKLSIYLEAYAMNASLAIQLSHAWIRRTRGSIKPMVPRNGTCKIDDHILVDVLTKETVKGLKEFRWPGRYQIVKTDYAEFYLDGAHTKESMDICAKWFTDSNRSSAQALIFSATGDRDSKVLLESLRDIDFHKVYFVIPSSYKKLSKNNDNFYMMEHKDLLTRCKSQASIWKNINGNSIVNVYECVADALESIKEIKADSSVLVTAEVMMHMWMWFGYDLGDFLFSGLVVNTRWSFALTWIVLFFVALLFEGSKVYLARVQREALRKLRPHGSDERRNLLCERDREQANAMEATTSRNTSSGQVSKTLVNGHQTLVFVVHNVVGYLLMLAVMIYNVHLMLAVVFGMMLGYFLFGTKLTRLQMQCFSTKRVVICTPECDDTDRCNTEKCSWEVCEGKLCKEMDALELQQADTAYTRKYTPTTMLYVGTENSTPPLLNTSMDSQSSDIFVCQTRTCIQPSHYFPANTQDGGEHSNASCYYGAKRCPSKVAKFKKMQSGVHCHHDSEKEDSPSVEDAQLLHSERRGCCKKLEPPPEEQRCKSSQNTTVVIHEESHSEDADTQSRENSPQISCCHSKSTRDSQEQIVT-
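
Consistency check (illustrative):

The lengths reported in this record can be cross-checked with simple arithmetic. The sketch below is a minimal example, not part of the original record. It assumes that the 2577 bp transcript is the coding sequence only, that the trailing "-" in the protein sequence marks the stop codon, and that the genomic coordinates are 1-based and inclusive. The function names are hypothetical.

```python
def coding_length_consistent(transcript_bp: int, protein_aa: int) -> bool:
    """Return True if the transcript is exactly the protein's codons plus one stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)


def span_length(start: int, end: int) -> int:
    """Length of a span, assuming 1-based inclusive coordinates (an assumption)."""
    return end - start + 1


# Values copied from the record above.
transcript_bp = 2577
protein_aa = 858
genomic_start, genomic_end = 888832, 902739

cds_ok = coding_length_consistent(transcript_bp, protein_aa)
genomic_span = span_length(genomic_start, genomic_end)

print("CDS length matches protein length plus stop codon:", cds_ok)
print("Genomic span (bp):", genomic_span)
print("Span not covered by the transcript (bp):", genomic_span - transcript_bp)
```

Under these assumptions, 858 codons plus one stop codon account for all 2577 bp. The genomic span is longer than the transcript, which would be consistent with the gene containing introns.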