Monarch geneset OGS2.0

DPOGS207953
Transcript: DPOGS207953-TA (2811 bp)
Protein: DPOGS207953-PA (936 aa)
Genomic position: DPSCF300090 - 90689-96135
RNAseq coverage: 257x (Rank: top 41%)
Annotation
Source          Hit               E-value  Identity  Description
Heliconius      HMEL014327        0.0      92.95%
Bombyx          BGIBMGA000391-TA  0.0      91.03%
Drosophila      CG3999-PA         0.0      71.22%
EBI UniRef50    UniRef50_P23378   0.0      64.67%    Glycine dehydrogenase [decarboxylating], mitochondrial n=133 Tax=root RepID=GCSP_HUMAN
NCBI RefSeq     XP_970082.1       0.0      74.41%    PREDICTED: similar to CG3999 CG3999-PA [Tribolium castaneum]
NCBI nr blastp  gi|91092464       0.0      74.41%    PREDICTED: similar to CG3999 CG3999-PA [Tribolium castaneum]
NCBI nr blastx  gi|91092464       0.0      74.41%    PREDICTED: similar to CG3999 CG3999-PA [Tribolium castaneum]
Group
Gene Ontology    GO:0006544  0         glycine metabolic process
                 GO:0055114  3.5e-173  oxidation-reduction process
                 GO:0004375  3.5e-173  glycine dehydrogenase (decarboxylating) activity
                 GO:0003824  2.2e-26   catalytic activity
                 GO:0030170  2.2e-26   pyridoxal phosphate binding
KEGG pathway     tca:658613  0.0
                 K00281 (GLDC, gcvP) maps-> Glycine, serine and threonine metabolism
InterPro domain  [1-932]    IPR020581  0         Glycine cleavage system P protein
                 [1-929]    IPR003437  0         Glycine cleavage system P protein, homodimeric
                 [1-414]    IPR020580  3.5e-173  Glycine cleavage system P-protein, N-terminal
                 [454-931]  IPR015424  8e-96     Pyridoxal phosphate-dependent transferase, major domain
                 [476-708]  IPR015421  2.2e-26   Pyridoxal phosphate-dependent transferase, major region, subdomain 1
Orthology group  MCL11146    Multiple-copy universal gene

Nucleotide sequence:

>DPOGS207953-TA
ATGCTGGATCTCTTAGGTTATAAGAGTTTAGACCAACTAACTAATGATGCTGTGCCTAAGAAAATCCAGTTAGAAGGTCTTATGAACATCACAGAACCAATGAGCGAATATGATCTTATTAAAAGAATTCGCAAAATAGCAGAAACAAATCAAATATGGCGTTCCTATATTGGTATGGGATATCATAATTGCTGTGTGCCGCATTCCATAATGAGAAACATGTTTGAGAACCCTGGATGGACTACACAGTACACTCCTTATCAACCCGAAGTGGCTCAAGGACGGCTAGAAGGTTTATTAAACTATCAAACAATGGTCAGTGATCTAACAGGGTTAGACGTTGCTAACGCGTCTCTCCTAGATGAAGGAACTGCTGCAGCTGAGGCTTTGTCTTTGTGTCATAGACACAATAGGAGGACAAAATTTGTAGTATCAGAACGATTGCATCCGCAAACTTTAGCTGTTGTTCAAACTCGGTTAGATGCGTTGGGCCTAGAAGTTATGGTAGTACCTGATGTTAGACAAGCAGACTTCGCACAACGAGACATATCTGCTGTATTATTACAATGTCCAGACACAAGAGGATTGGTTTATGATTACTCAGGCCTTGCTGCAGCTGCACAAGAACATGGGACTTTGGTGGTCGTTGCGACTGATCTTCTAGCTATGGCACTTTTACGCCCCCCCGCAGAGTGTGGTGCTGCTTTAGCAGTTGGTACTTCACAGAGGTTAGGTGTTCCTCTTGGATATGGTGGACCTCATGCCGGATTTTTCGCAGCTGAACATGCGTTAGTTCGTTTGATGCCAGGTCGCATGGTTGGCGTGACTCGGGACGCGGCTGGAAGAGATGCTTACAGACTAGCTCTCCAGACAAGGGAACAGCATATCCGGAGAGACAAAGCCACGTCAAATATATGTACAGCTCAGGCTCTTTTAGCAAATATGTCAGCCATGTTCGCTGTTTATCACGGGCCACAAGGTCTGAGGGACATTGCGGTGCGAGTTCATAACGCTACTCTGGTTCTTGATGACGGAATTCAAAAACGTGGTCATAGGCAGTTGAACGACGTATATTTTGACACACTCTACATCATTCCAAGTGCGGATCATGATGCTACTGCTATAAAGGCAAGAGCTCAAGAAAAGAAAATTAATTTGCGATATTTTGATGATGGAGCTGTCGGAGTAGCATTAGACGAAACCACTACAATGGAAGACGTTGATGATTTGCTCTGGGTATTTGACTGCGAGAGGGTTGCTGAGGTGATGAAGAGTGGTGATGTCAAGTCAAGAAGTATCTTAAAGGGTCCGTTCAGAAGAACTTCTCCATACTTAACACATCCTGTGTTTAATATGCATCACTCTGAAACAAGAATAGTAAGGTATATGAAGAGACTGGAAAATAAGGATATATCATTGGTTCACTCTATGATTCCTCTCGGTTCCTGTACAATGAAGTTAAATTCTACAACCGAAATGATGCCATGTTCATTTAAACATTTTACTGACATCCATCCATTTGCACCTCTTGAGCAATGCCAGGGCTACCATACACTTTTTGAAGAGCTTGCTAAGGATTTGTGTGCTATCACAGGTTACGATCGTGTATCTTTCCAACCGAACAGTGGAGCTCAAGGCGAATACGCTGGTCTTAGAACAATCAAACGCTACCATGAATTCCGAGGTGACACAGGGCGTAACATATGTTTAATACCAGTTAGTGCTCATGGTACAAATCCAGCCTCGGCACACATGGCCGGCATGAGGGTCTGTGCAATCCGCGTCACACCCACTGGAGATATTGATATGGCACACCTTAAAGATATGGTGGAAGAACATAGTGAAAAATTATCATGTCTGATGTTAACTTATCCGAGTACATTCGGTGTGTTTGAGGAACGCACAGCCGACGTGTGCTCGCTCGTTCACCAACATGGGGGACAGGTCTATTTGGATGGTGCTAACATGAATGCACAGGTTGGACTTTGTAGGCCAGGAGA
CTATGGCAGTGATGTATCCCATTTGAATTTACATAAAACTTTCTGTATACCACACGGCGGAGGCGGCCCAGGAATGGGTCCAATAGGAGTAAAAGCTCATCTTGCTCCATTTTTACCGTCACATCCCGTGGTGAATCCGTTAGCTGACTTGGGTGAAGATGCCCATAGTTTTGGCTCCGTCAGTGCAGCGCCATTTGGTTCATCTGCAATATTACCAATATCATGGGCTTACATTAAGATGATGGGCCCTAAAGGCTTAAAGAGGGCGACTCAGGTGGCTATTCTTAATGCTAATTATATGTCGCGAAGATTAGATGGTCATTATAAAACTTTGTACAAAGGTGAAAGAGGACTTGTCGCACATGAATTTATTATAGATGTCCGAGATATGAAAAAAACTGCTAATATTGAACCCGGAGATATTGCAAAACGTCTTATGGACTTCGGTTTTCACGCACCTACAATATCTTGGCCGGTGGCTGGCACCCTTATGATTGAACCTACCGAATCTGAAGACTTACAAGAGTTGGATCGCTTCTGTGAAGCCCTTATTGCTATTAGAAAAGAAATTAAAGATATTGAAGATGGTCTTATTGATAAAAGATTGAATCCTGTAAAGATGGCGCCACACACACAAGAAGAAGTGATTACGGAAGATTGGAGTCGCCCTTACACAAGAGAACAAGCCGCTTTTCCTGCGCCATTTGTAAAGGGAGAAACAAAGATTTGGCCTACGGTTAGTCGCATCGACGATATGTACGGCGACAAACATCTTGTTTGCACGTGTCCTCCGGTAATCGATGACTTCTAA

Protein sequence:

>DPOGS207953-PA
MLDLLGYKSLDQLTNDAVPKKIQLEGLMNITEPMSEYDLIKRIRKIAETNQIWRSYIGMGYHNCCVPHSIMRNMFENPGWTTQYTPYQPEVAQGRLEGLLNYQTMVSDLTGLDVANASLLDEGTAAAEALSLCHRHNRRTKFVVSERLHPQTLAVVQTRLDALGLEVMVVPDVRQADFAQRDISAVLLQCPDTRGLVYDYSGLAAAAQEHGTLVVVATDLLAMALLRPPAECGAALAVGTSQRLGVPLGYGGPHAGFFAAEHALVRLMPGRMVGVTRDAAGRDAYRLALQTREQHIRRDKATSNICTAQALLANMSAMFAVYHGPQGLRDIAVRVHNATLVLDDGIQKRGHRQLNDVYFDTLYIIPSADHDATAIKARAQEKKINLRYFDDGAVGVALDETTTMEDVDDLLWVFDCERVAEVMKSGDVKSRSILKGPFRRTSPYLTHPVFNMHHSETRIVRYMKRLENKDISLVHSMIPLGSCTMKLNSTTEMMPCSFKHFTDIHPFAPLEQCQGYHTLFEELAKDLCAITGYDRVSFQPNSGAQGEYAGLRTIKRYHEFRGDTGRNICLIPVSAHGTNPASAHMAGMRVCAIRVTPTGDIDMAHLKDMVEEHSEKLSCLMLTYPSTFGVFEERTADVCSLVHQHGGQVYLDGANMNAQVGLCRPGDYGSDVSHLNLHKTFCIPHGGGGPGMGPIGVKAHLAPFLPSHPVVNPLADLGEDAHSFGSVSAAPFGSSAILPISWAYIKMMGPKGLKRATQVAILNANYMSRRLDGHYKTLYKGERGLVAHEFIIDVRDMKKTANIEPGDIAKRLMDFGFHAPTISWPVAGTLMIEPTESEDLQELDRFCEALIAIRKEIKDIEDGLIDKRLNPVKMAPHTQEEVITEDWSRPYTREQAAFPAPFVKGETKIWPTVSRIDDMYGDKHLVCTCPPVIDDF-