Monarch geneset OGS2.0

DPOGS205264
Transcript        DPOGS205264-TA  2343 bp
Protein           DPOGS205264-PA  780 aa
Genomic position  DPSCF300021 - 1342192-1352988
RNAseq coverage   1248x (Rank: top 10%)
Annotation
Heliconius      HMEL016204        0.0  78.90%
Bombyx          BGIBMGA011064-TA  0.0  85.81%
Drosophila      CG7470-PA         0.0  77.54%
EBI UniRef50    UniRef50_B4H8Q1   0.0  70.40%  GL25419 n=5 Tax=Coelomata RepID=B4H8Q1_DROPE
NCBI RefSeq     XP_001652254.1    0.0  77.58%  glutamate semialdehyde dehydrogenase [Aedes aegypti]
NCBI nr blastp  gi|157114403      0.0  77.58%  glutamate semialdehyde dehydrogenase [Aedes aegypti]
NCBI nr blastx  gi|157114403      0.0  77.58%  glutamate semialdehyde dehydrogenase [Aedes aegypti]
Group
Gene Ontology  GO:0006561  2e-289    proline biosynthetic process
               GO:0003824  2e-289    catalytic activity
               GO:0055114  2e-289    oxidation-reduction process
               GO:0004350  6.3e-130  glutamate-5-semialdehyde dehydrogenase activity
               GO:0008152  1.1e-104  metabolic process
               GO:0016491  1.1e-104  oxidoreductase activity
               GO:0008652  8.9e-66   cellular amino acid biosynthetic process
               GO:0005737  1.8e-59   cytoplasm
               GO:0004349  1.8e-59   glutamate 5-kinase activity
               GO:0016620  7.1e-33   oxidoreductase activity, acting on the aldehyde or oxo group of donors, NAD or NADP as acceptor
KEGG pathway   aag:AaeL_AAEL006834  0.0
               K12657 (ALDH18A1, P5CS)  maps-> Arginine and proline metabolism
InterPro domain  [1-780]    IPR005766  0         Delta l-pyrroline-5-carboxylate synthetase
                 [360-752]  IPR000965  6.3e-130  Gamma-glutamyl phosphate reductase GPR
                 [720-762]  IPR016162  1.1e-104  Aldehyde dehydrogenase, N-terminal
                 [349-765]  IPR016161  1.3e-94   Aldehyde/histidinol dehydrogenase
                 [63-343]   IPR001048  8.9e-66   Aspartate/glutamate/uridylate kinase
                 [63-363]   IPR005715  1.8e-59   Glutamate 5-kinase/delta-1-pyrroline-5-carboxylate synthase
                 [101-115]  IPR001057  8.3e-53   Glutamate/acetylglutamate kinase
                 [573-719]  IPR016163  7.1e-33   Aldehyde dehydrogenase, C-terminal
                 [341-649]  IPR015590  7.4e-16   Aldehyde dehydrogenase domain
Orthology group  MCL14006  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS205264-TA
ATGTTGAGGTGTAACATTTTCTCTAAACGTCTGGGTGCCAGGTTTTATACGGTTGTGGGCTCTGATTCTCTTCCTGGTGGGAAAGTCAATTGGGGTCATGAGGGCCATAATGTAGATAAGCGGAATATAGGTAGCCTGGATAGAAAGAAACAAACATTTTCTGATAGATCGCAACTGAAGTACGCCAGAAGATTAGTGGTCAAACTAGGAAGTGCTGTGATCACGAGGGAGGATGGCAATGGACTCGCTCTAGGAAGATTAGCATCTATTGTTGAACAGGTAGCAGAATGTCATCATGAAGGAAGAGAATGTATTATGGTAACGAGTGGTGCCGTGGCATTTGGTAGGCAGAAGTTGACTCAAGAGTTGCTCATGTCTCTATCCATGAGGGAAACTCTTTCACCCAGCGATCACACCAGAGAGGACGCGGGCAGTATTTTGGACCCTCGGGCTGCAGCCGCAGTTGGTCAGTCGGAGCTGATGTCCATGTATGATGTTATGTTTTCACAATATAACGTGAAAATTGCTCAGGTGCTTGTTACAAAACCCGATTTCTACAACGAGGAAACAAGAAAAAACTTGTTTACCACTCTCTCAGAGTTGATATCTCTAAATATAGTACCTATTGTGAATACCAATGACGCCGTTAGTCCGCCGATGTATATTCATGATGATACTGTTGTCCCTGGGACTGGAAAAAAGGGAATTGGAATAAAAGACAACGATAGTCTTTCTGCTTTATTGGCCGCTGAAGTCCAATCGGACCTTCTTATTATGATGTCAGATGTGGATGGTATTTATAACAAACCGCCCTGGGAAGATGGTGCTCGAATGATGCATACTTATACATCAGCTGAAAAAGTACAGTTTGGACAAAAGTCGAAAGTGGGGACAGGAGGTATGGATTCAAAGGTGAATGCTGCTACTTGGGCCATGGCAAGAGGGGTCAGTGTCGTTATTTGTAACGGCATGCAGGAAAAGGCTATTAAAACAATCATAAGTGGACGTAAGGTTGGAACGTTCTTCACTGATACCCCAAGCGTGTCAACCGCGTCAGTTGATGTGATGGCTGAAAATGCTCGCACAGGCAGCAGGGTGTTGCAAAAATTATCTCCAGCTGAAAGAGCGGCTGCTATTCATTCTTTAGCAGATTTGTTACTAGATAAAAAAGACAAAATATTGGAAGCGAATGCGAAAGACTTGGAGGAAGCTACAAAAACTGGACTAGAGAAACCACTCTTAAATAGGTTATCACTTAGTCCAGGAAAACTGAAGACATTATCAATTGGTTTAAAACAAATTGCAGATTCAAGCTATGATAATGTCGGGAGGGTATTGAGAAAGACCAAACTTGCTGAAAATTTATTATTAAAACAAGTAACAGTTCCTATTGGCGTCTTACTTGTTATATTTGAATCAAGACCTGATTCACTGCCTCAAGTTGCTGCATTAGCAATGGCGTCTGGTAACGGACTCTTACTAAAAGGTGGGAAAGAGGCAGCTTACTCAAATAGAGCATTGATGGAAATCGTTAAAGAGTCTCTTCAGCCATTTGGCGCGTCAGATGCCGTTTCATTGGTCTCAACTAGAGATGAAATAAGTGATTTACTAGCAATGGAAAAGCATATTGATTTGATTATTCCTCGTGGTTCTTCTGAACTAGTTAGAAATATTCAAAAGCAGTCACAACATATTCCTGTATTGGGTCACGCTGAAGGAATTTGTCATGTATATTTAGACAAAGATGCCGATCCAAGCAAGGCGTTGAAGATAGTCCGCGACGCTAAATGTGATTATCCTGCAGCATGTAATGCTATGGAGACTTTGCTTATCCATGAGGATCATTTGTCAGGGACTTTATTCCAGGATATATGTAATATGTTGAAACAGGAAGGTGTAAAAATACACGCTGGTCCAAAGCTTGCTAGTCAATTAACCTTCGGACCACCTCCTGCTAGAACAATGAAATATGAGTATGGTGATTTAGAATGCTCTATAGA
AGTCGTCAAGGATTTGGATGAAGCAATTGATCACATTCATAAGTTTGGAAGCTCTCACACTGACGTCATCGTTACGGAGAATGATCATACAGCTAGGGACTTCCTTAATACTGTTGACAGTGCTTGTGTTTTCCACAATGTTTCATCACGATTTGCTGACGGATTTAGATTTGGTCTTGGTGCTGAAGTTGGCATTTCTACAGCAAGAATACATGCAAGAGGTCCAGTAGGTGTTGAGGGTTTGTTAACCACTAAATGGATCCTCGAAGGCACTGACCATACAGCCGCTGAATTCAATGAAGGAAAGCGAAATTGGCTTCATGAAAAATTACCCATCAATTGA

Protein sequence:

>DPOGS205264-PA
MLRCNIFSKRLGARFYTVVGSDSLPGGKVNWGHEGHNVDKRNIGSLDRKKQTFSDRSQLKYARRLVVKLGSAVITREDGNGLALGRLASIVEQVAECHHEGRECIMVTSGAVAFGRQKLTQELLMSLSMRETLSPSDHTREDAGSILDPRAAAAVGQSELMSMYDVMFSQYNVKIAQVLVTKPDFYNEETRKNLFTTLSELISLNIVPIVNTNDAVSPPMYIHDDTVVPGTGKKGIGIKDNDSLSALLAAEVQSDLLIMMSDVDGIYNKPPWEDGARMMHTYTSAEKVQFGQKSKVGTGGMDSKVNAATWAMARGVSVVICNGMQEKAIKTIISGRKVGTFFTDTPSVSTASVDVMAENARTGSRVLQKLSPAERAAAIHSLADLLLDKKDKILEANAKDLEEATKTGLEKPLLNRLSLSPGKLKTLSIGLKQIADSSYDNVGRVLRKTKLAENLLLKQVTVPIGVLLVIFESRPDSLPQVAALAMASGNGLLLKGGKEAAYSNRALMEIVKESLQPFGASDAVSLVSTRDEISDLLAMEKHIDLIIPRGSSELVRNIQKQSQHIPVLGHAEGICHVYLDKDADPSKALKIVRDAKCDYPAACNAMETLLIHEDHLSGTLFQDICNMLKQEGVKIHAGPKLASQLTFGPPPARTMKYEYGDLECSIEVVKDLDEAIDHIHKFGSSHTDVIVTENDHTARDFLNTVDSACVFHNVSSRFADGFRFGLGAEVGISTARIHARGPVGVEGLLTTKWILEGTDHTAAEFNEGKRNWLHEKLPIN-
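
Consistency check:

The lengths listed for this record can be cross-checked with simple arithmetic. The sketch below is illustrative and not part of the database record. It assumes the listed transcript is coding sequence only and includes a terminal stop codon (the sequence above starts with ATG and ends with TGA, which is consistent with that), and it assumes the genomic coordinates are 1-based and inclusive. The function names are hypothetical helpers introduced here.

```python
# Figures copied from the record above.
TRANSCRIPT_BP = 2343
PROTEIN_AA = 780
GENOMIC_START, GENOMIC_END = 1342192, 1352988


def coding_length_bp(protein_aa: int, with_stop: bool = True) -> int:
    """Nucleotides needed to encode protein_aa residues, optionally plus one stop codon."""
    return 3 * (protein_aa + (1 if with_stop else 0))


def genomic_span_bp(start: int, end: int) -> int:
    """Length of a genomic interval, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1


# 780 codons plus one stop codon should account for the whole transcript.
cds_matches = coding_length_bp(PROTEIN_AA) == TRANSCRIPT_BP

# Span of the gene on the scaffold, and the part not covered by the transcript.
span = genomic_span_bp(GENOMIC_START, GENOMIC_END)
non_transcript = span - TRANSCRIPT_BP

print(cds_matches)
print(span, non_transcript)
```

Under these assumptions the transcript length equals 3 x (780 + 1), and the genomic span is several times longer than the transcript, which suggests (but does not by itself prove) that the gene model contains introns.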