Monarch geneset OGS2.0

DPOGS206359
Transcript: DPOGS206359-TA (4095 bp)
Protein: DPOGS206359-PA (1364 aa)
Genomic position: DPSCF300082 + 1225531-1238842
RNAseq coverage: 1102x (Rank: top 11%)
Annotation
Heliconius      HMEL017120        0.0  73.20%
Bombyx          BGIBMGA014119-TA  0.0  77.52%
Drosophila      ade3-PA           0.0  50.66%
EBI UniRef50    UniRef50_Q26255   0.0  55.34%  Trifunctional purine biosynthetic protein adenosine-3 n=605 Tax=cellular organisms RepID=PUR2_CHITE
NCBI RefSeq     XP_318881.4       0.0  58.38%  AGAP009786-PA [Anopheles gambiae str. PEST]
NCBI nr blastp  gi|158298702      0.0  58.38%  AGAP009786-PA [Anopheles gambiae str. PEST]
NCBI nr blastx  gi|158298702      0.0  58.42%  AGAP009786-PA [Anopheles gambiae str. PEST]
Group
Gene Ontology
GO:0009113  1.9e-157  purine base biosynthetic process
GO:0004637  1.9e-157  phosphoribosylamine-glycine ligase activity
GO:0006189  6.7e-126  'de novo' IMP biosynthetic process
GO:0005737  6.7e-126  cytoplasm
GO:0004641  6.7e-126  phosphoribosylformylglycinamidine cyclo-ligase activity
GO:0004644  7.6e-75   phosphoribosylglycinamide formyltransferase activity
GO:0009058  5.7e-70   biosynthetic process
GO:0016742  5.7e-70   hydroxymethyl-, formyl- and related transferase activity
GO:0005524  2.5e-58   ATP binding
GO:0016874  2.5e-58   ligase activity
GO:0003824  1.2e-48   catalytic activity
KEGG pathway
aga:AgaP_AGAP009786  0.0
K11787 (GART)  maps-> Purine metabolism
                      One carbon pool by folate
InterPro domain
[4-428]      IPR000115  1.9e-157  Phosphoribosylglycinamide synthetase
[801-1134]   IPR004733  6.7e-126  Phosphoribosylformylglycinamidine cyclo-ligase
[106-299]    IPR020561  2.8e-86   Phosphoribosylglycinamide synthetase, ATP-grasp (A) domain
[1163-1355]  IPR004607  7.6e-75   Phosphoribosylglycinamide formyltransferase
[1163-1364]  IPR002376  5.7e-70   Formyl transferase, N-terminal
[195-331]    IPR013816  2.5e-58   ATP-grasp fold, subdomain 2
[969-1144]   IPR010918  2.3e-50   AIR synthase-related protein, C-terminal
[799-967]    IPR016188  1.2e-48   PurM, N-terminal-like
[4-105]      IPR020562  4.1e-36   Phosphoribosylglycinamide synthetase, N-terminal
[1-106]      IPR016185  1.8e-33   PreATP-grasp-like fold
[4-98]       IPR013817  1.1e-32   Pre-ATP-grasp fold
[335-428]    IPR020560  2.6e-26   Phosphoribosylglycinamide synthetase, C-domain
[332-430]    IPR011054  2.5e-25   Rudiment single hybrid motif
[124-194]    IPR013815  3.2e-20   ATP-grasp fold, subdomain 1
[481-572]    IPR000728  1.6e-17   AIR synthase-related protein
Orthology group: MCL14235 (Single-copy universal gene)

Nucleotide sequence:

>DPOGS206359-TA
ATGTCGGCAAACGTGCTCGTTATCGGTGGTGGAGGCAGGGAACATGCAATTTGTTGGAAATTAGCAGATTCTCCATTAATTTGTAAAATATTCTGTGCGCCCGGTAGTGTCGGTATATCGGCGACGAAGGAAAATGTTGAATGTGTGGATTTGAATATTAAGGATTTTCCGGGTCTGGCGAAGTGGTGCAAAGACAAGTCAATTGATCTCGTCATCATTGGACCCGAAGACCCTCTTGCTAATGGCATTGTGGATGCCCTCGAACCAGCTGGCATCAAATGCTTCGGACCCACAAAAGCCGGTGCACAGATCGAGGCCAATAAAGACTGGTCCAAGAAATTTATGAACAAATATCAAATACCAACGGCAAGATACCAATCATTCACAGATGCTGAAGCTGCTAAGCAATTCATTAAGAGTGCACCGTTTAGAGCGTTAGTCGTGAAAGCATCCGGTTTAGCGGCTGGGAAAGGAGTTGTGGTAGCGAGCAATGTAGATGAGGCCTGTGCAGCGGTGGATGAGATCTTAACCGAAGCCAAGTACGGAACTGCCGGGCAGGTGGTCGTGGTTGAAGAACTGTTGGAAGGGGAAGAGGTTTCGGTGCTGGCATTCACGGATGGGAACACGGTGTCCATGATGCCCCCCGCCCAGGATCACAAGCGCATCGGCGAGGGGGACACGGGACCCAACACGGGGGGTATGGGGGCATATTGTCCCTGTCCGCTTATCACCCCAGAACAGCTGGCCGATGTCAAGGATCAAGTGTTACAGAGAGCTGTGGACGGGCTCAGGGCTGAGGGCATCAAATATGTCGGGGTCCTCTACGCTGGTCTTATGGTGACCAAGTCTGGTCCAATGACCCTCGAGTTTAACTGCCGCTTCGGAGACCCAGAGACACAGGTCCTCATGATGTTGCTGGAATCGGATCTCTACTCCGTCATTAAGGCGTGTGTGAGCGGTACATTAAAGGAGACTCCGGTCAAATGGAACACCTCAATGTCGGCCGTGGGAGTGGTGATCGCTTCCAAAGGTTACCCTGAGAGCTCCACCAAGGGCTGTGTCATAAGCGGTCTATCCCAGGTGTCCAGGGAAGACGTGGTCGTGTTCATGAGCGGAGTGTCCCGTGGAGCAAACGACTCGCTGGTCACTGCCGGCGGCAGAGTGCTGCTGCTGGCGGCGAGGAGAGCCGACCTCCGGACTGCCGCCGCCGCTGCGACACGAGCCGCCGCCGCCGTGGACTTCCCCGGGAAACAGTACAGGAAGGACATCGCCAGGCGGGCCTTCTGCAAAATGAACGGCCTGTCATACCTGGAGAGCGGTGTGGACATTGAAGCGGCGGCGGCCTTGGTCCGTCTGATGGAGCCCCTGGCCACGGGGACCCACAGACCCGGGGTACTTGGAAGGCTGGGCTGCTACAGCGGACTGTTCCAACTGGCGGCCGTGGACCCCGGCCTCACAGACCCCGTGCTGGTCCAGGGAACGGACGGCGTTGGAACCAAGGTCAAGATAGCAGAGATGATGCAGAAGTACGACACCATCGGCCAGGACCTGGTGGCCATGTGCGTCAACGACATCCTGTGTGCGGGCGCCGAGCCCTTCGCCTTCCTGGACTACCTGGCGTGCGGCCGGCTGCAGCTGCACACCTCCACCACCATCGTCAAGGGAATCGCCGACGCCTGCGTCATGGCCGGCTGTGCTTTGTTGGGGGGAGAGACGGCGGAGATGCCGAGTATGTATGACGTGGGGAAGTACGATTTGGCGGGGTTCGCGGTGGGCGTGGTGGACAACCTCAAGCAGCTGCCCCGCTACAAGGAGATACGACCCGGGGACGTGGTGCTCGCGCTGCCCTCCACCGGCGTGCACAGCAACGGGTACAGCCTCGTGCAGAGGATCATGGCTGAAAGTGGACATAGTTTCTACGAAAAGGCTCCGTTCAGTAAATCCAACAAGAACTTCGGCGAGGAGTTCCTGGAGCCGACCGGTATCTACGTGAAGGCCCT
CCTGCCGGCCATCAAGAAAGGCCTCGTGAAGGGCCTGGCACACATCACCGGGGGAGGCCTCCTGGAGAATATCCCCAGGATACTACCGCCCGGCGTCAGGGTCAGGCTCGACGCCACTAAGTTCCAGATAAACCCTATCTTCGGCTGGCTGCAAGCTAAAGGGATGGTGTCGGACTTCGAGATGCTGCGCACGTTCAACTGCGGTGTCGGCATGGTGGTGGTGGCGGACCCCGTGCTGGTGAGCGAGCTGGTCGCCGCCGTGGACGGCACCATCAGTGTGGTCGGGCAGCTGGAGGACATGAGAACGGAAGGAGGTCAGCAGGTCATAGTGGAGAACTTCCAGCAGGCCATGTCCCCTCTGACGTCACCCTACTCGTCCGCCAGTCCGTGTCAGAAGTCACTCTCCTACAAGGACAGCGGGGTCGACATCGAGGCCGGGGACTCGCTGGTGTCACTCATAAAGCCTTTGGCTAGATCCACGTCTCGATCCGGGGTCCTGGGAGGTCTTGGAGGCTTCGGCGGGTGTTTCCAGCTGAAGGCTGTGGAGCAGGAGTATAAGGACCCGGTGCTGGTGGTGGCGGCGGACGGTGTGGGCACCAAGCTGCGCGTGGCTCAGAAGATGAACCGACACGCCACCATCGGCGTTGACCTGGTGGCCATGTGCGTCAACGACATCCTGTGCAACGGCGCCGCGCCGCTCACCTTCCTGGACTACTTCGCCTGCGGAGCCCTGGACGTGACCGTGGCCAGGGACGTCGTGGCCGGGGTCGCGGACGGCTGCAAGCAGTCCTCAGCGGCTCTCATCGGCGGAGAGACGGCGGAGATGCCCGGCATGTACGAGGCCGGCGTGTACGACATCGCAGGGTTCGCGCTGGGAGTGGTGGAGAGGGACAACATACTGCCGAAGATCAACGACATCAATGTTGGCGACACGATAATAGGTCTGCCATCGAACGGCGTCCACAGCAACGGGTTCAGTCTCATCCACAGCCTCATGAAGAAGGCCGGTCTGAGTCTCAACGACAAGGCGCCCTTCAGCGAGGAAGGACTCACTCTCGGCCAGGAGCTGATCAAGCCGACCCGCATCTACGTCCGCAGCGTGCTGCCGGCGCTGCAGCGCGGCGTGGTGAAGGCGGTGGCGCACATCACGGGCGGCGGCCTCATGGAGAACATCCCGCGCATCATGCCGGACTCCGTGCGGGCCCGCCTCAACGCGCACTGGTGGAAAGTTCACCCTGTGTTCGCGTGGATCGCGGAGACCGGCGAGGTCAAGAACGACGAGATGCTGAGGACATTCAACTGCGGCATCGGCTTGGTGCTGATAGTGTCTCCGGAACACCAGGCAGAGGTGATGAACATCACTCGCTCGCACGGCGCGATGGTGATCGGCTCCATACAAGCCCGGCCCCCGGGCGGCGCTCGCGTGCTCGTCGACAACTTCACCTCCGCGCTGGACTTCACGAGGCGGATGCCGCACCTCACTAAGAAGAGGGTGGCGGTGCTGGTGTCGGGTAGCGGCAGCAACCTGCAGGCGCTCATGGACAGTGCGTCGGACCCCGCCCAGTGCATGTGTGCGGAGGTGGCGCTCGTCGTCAGCAACAAACCCGACGCCTTCGCCCTCAAACGGGCCCGGGACGCCGGCGTCAACACGCTGGTGCTGAGTCACAAGGACTACTCCAGCCGCGAGGAGTACGACCGCGCCCTCAGCGCCGCCCTGGACGCGCACCGGATCGACCTCGTGTGTCTGGCCGGCTTCATGAGGATACTCACGCCGGGCTTCGTTAAGAAGTGGAAGGGTCGCCTCATCAACATCCACCCGTCCCTGCTGCCGGCCCACCCTGGACTCCACGCTCAGAGACAGTGTCTACAGGCGGGAGACAAGGAGTCGGGCTGCACCGTACACTTCGTCGACGAGGGCATGGACACGGGTCCGATCATTCTCCAGGAGCGCGTGCCGGTGATGCCGGGAGACACGGAGCAGGTTCTCAGTGACAGGATCC
TGTCCGCGGAACACCGCGCCTACCCTCAGGCGCTCAGACTGCTCGCTACGGGCCGGGTCCGGCTACATGAGGACACTATCATATGGCATTCATGA

Protein sequence:

>DPOGS206359-PA
MSANVLVIGGGGREHAICWKLADSPLICKIFCAPGSVGISATKENVECVDLNIKDFPGLAKWCKDKSIDLVIIGPEDPLANGIVDALEPAGIKCFGPTKAGAQIEANKDWSKKFMNKYQIPTARYQSFTDAEAAKQFIKSAPFRALVVKASGLAAGKGVVVASNVDEACAAVDEILTEAKYGTAGQVVVVEELLEGEEVSVLAFTDGNTVSMMPPAQDHKRIGEGDTGPNTGGMGAYCPCPLITPEQLADVKDQVLQRAVDGLRAEGIKYVGVLYAGLMVTKSGPMTLEFNCRFGDPETQVLMMLLESDLYSVIKACVSGTLKETPVKWNTSMSAVGVVIASKGYPESSTKGCVISGLSQVSREDVVVFMSGVSRGANDSLVTAGGRVLLLAARRADLRTAAAAATRAAAAVDFPGKQYRKDIARRAFCKMNGLSYLESGVDIEAAAALVRLMEPLATGTHRPGVLGRLGCYSGLFQLAAVDPGLTDPVLVQGTDGVGTKVKIAEMMQKYDTIGQDLVAMCVNDILCAGAEPFAFLDYLACGRLQLHTSTTIVKGIADACVMAGCALLGGETAEMPSMYDVGKYDLAGFAVGVVDNLKQLPRYKEIRPGDVVLALPSTGVHSNGYSLVQRIMAESGHSFYEKAPFSKSNKNFGEEFLEPTGIYVKALLPAIKKGLVKGLAHITGGGLLENIPRILPPGVRVRLDATKFQINPIFGWLQAKGMVSDFEMLRTFNCGVGMVVVADPVLVSELVAAVDGTISVVGQLEDMRTEGGQQVIVENFQQAMSPLTSPYSSASPCQKSLSYKDSGVDIEAGDSLVSLIKPLARSTSRSGVLGGLGGFGGCFQLKAVEQEYKDPVLVVAADGVGTKLRVAQKMNRHATIGVDLVAMCVNDILCNGAAPLTFLDYFACGALDVTVARDVVAGVADGCKQSSAALIGGETAEMPGMYEAGVYDIAGFALGVVERDNILPKINDINVGDTIIGLPSNGVHSNGFSLIHSLMKKAGLSLNDKAPFSEEGLTLGQELIKPTRIYVRSVLPALQRGVVKAVAHITGGGLMENIPRIMPDSVRARLNAHWWKVHPVFAWIAETGEVKNDEMLRTFNCGIGLVLIVSPEHQAEVMNITRSHGAMVIGSIQARPPGGARVLVDNFTSALDFTRRMPHLTKKRVAVLVSGSGSNLQALMDSASDPAQCMCAEVALVVSNKPDAFALKRARDAGVNTLVLSHKDYSSREEYDRALSAALDAHRIDLVCLAGFMRILTPGFVKKWKGRLINIHPSLLPAHPGLHAQRQCLQAGDKESGCTVHFVDEGMDTGPIILQERVPVMPGDTEQVLSDRILSAEHRAYPQALRLLATGRVRLHEDTIIWHS-