Monarch geneset OGS2.0

DPOGS201015
Transcript        DPOGS201015-TA   1611 bp
Protein           DPOGS201015-PA   536 aa
Genomic position  DPSCF300147 + 242420-244544
RNAseq coverage   3315x (Rank: top 4%)
Annotation
Heliconius      HMEL004648        0.0  89.37%
Bombyx          BGIBMGA009103-TA  0.0  84.70%
Drosophila      ScpX-PA           0.0  64.14%
EBI UniRef50    UniRef50_D8VD26   0.0  86.38%  Sterol carrier protein 2/3-oxoacyl-CoA thiolase n=13 Tax=cellular organisms RepID=D8VD26_MANSE
NCBI RefSeq     NP_001037378.1    0.0  85.07%  sterol carrier protein x [Bombyx mori]
NCBI nr blastp  gi|299149711      0.0  86.38%  sterol carrier protein 2/3-oxoacyl-CoA thiolase [Manduca sexta]
NCBI nr blastx  gi|299149711      0.0  86.38%  sterol carrier protein 2/3-oxoacyl-CoA thiolase [Manduca sexta]
Group
Gene Ontology    GO:0008152  1.3e-50  metabolic process
                 GO:0003824  1.3e-50  catalytic activity
                 GO:0032934  4.3e-30  sterol binding
                 GO:0016747  1.8e-11  transferase activity, transferring acyl groups other than amino-acyl groups
KEGG pathway     dgr:Dgri_GH10548  0.0
                 K08764 (SCP2, SCPX) maps-> Peroxisome
                                            Primary bile acid biosynthesis
                                            PPAR signaling pathway
InterPro domain  [4-236]    IPR016039  1.3e-50  Thiolase-like
                 [408-530]  IPR003033  4.3e-30  SCP2 sterol-binding domain
                 [201-229]  IPR016038  2.7e-24  Thiolase-like, subgroup
                 [20-119]   IPR020616  1.8e-11  Thiolase, N-terminal
                 [272-370]  IPR020617  8.3e-11  Thiolase, C-terminal
Orthology group  MCL11175  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201015-TA
ATGCCTAGGAAAGTTTTCGTAGTTGGTGTTGGGATGACGAAGTTTGTTAAGCCCAATAGCGGCATGGATTATCCAGATTTAGGGAAGGAAGCTGTCGAGGCAGCTTTGGCCGACGCTCGTATAAAGTACCAAGACGTACAACAGGCTATATGCGGGTATGTTTTCGGTGATTCTACAAGCGGTCAGCGAGTATTATATCAGATTGGCATGACTGGTATTCCTATATATAACGTTAATAACAATTGCTCCACTGGTTCTAACGCACTATTCCTTGGAAAACAACTCGTAGAAGGTGGTGTTTCGGATGTTATTCTGGCTCTTGGATTCGAGAAAATGACACCAGGTGCTCTAGGCAATGGTACATTTTCAGACAGAACGAATCCTTTAGATAGACATACTCTCAAGATGGCGGATATGGCAGAATTAACTGCAGCTCCGATGACCGCTCAATACTTCGGTAACGCGGGTGCAGAGCACATGAAGAAATATGGTACGACTGAAGTCCATATAGCCAAAATAGCAGCTAAGAACCATCGTCATGGTGCTAAGAACCCAAGAGCGCAAGGTGGACGAGAGTACACAGTTGAGGAAGTGTTGAACTCTCGTAGAATTTATGGCCCACTGACGAAATTGGAATGTTGCCCTACTAGTGATGGGGCTGCAGCTGCCATCCTTATGTCTGAAGAGGCTGTCATTCGTTACGGACTTCAGAATAAAGCGGTAGAAATAATTGGTATGGAGATGGCCACTGACACTGAGGCTGTTTTCAACGAAAACAGTCTAATGAAGGTTGCCGGTTATGATATGACAGGATTAGCGGCTAAAAGATTATATGAAAAGACGGGTATCTCACCCATGCAAGTTGATGTTGTCGAGTTGCATGATTGTTTCGCAACAAATGAGTTGATCACATATGAGGGGCTTCAGTTGTGTGGTGAAGGTGAGGCTGGTAAATTCATTGATGCCGGTGACAACACATATGGTGGCCGTTGTGTGGTAAATCCGAGTGGTGGTTTGATTGCTAAGGGACATCCTCTGGGTGCGACTGGCCTGGCCCAGTGTGCCGAGCTAGTCTGGCAGCTGCGCGGAGAAGCGGGAGACAGACAGGTGCCCCGAGCTCGCATCGCCTTACAACACAACCTAGGACTCGGAGGAGCTGTCGTCATCACCATGTACCGTAAAGGATTCAGCAGTGCATCACCCAACCAGGTGGCTGCCATCGCAGCCAACCCAGAGAACTTCAAAGTATACAAATACATGAAGATCCTCGAGGAAGCCATGAAGTCTGATGAAGATAAACTCATAGAGAAAGTTAGAGGAGTTTATGGTTTTAAGGTTAGGAACGGACCGAACGGCGAAGAAGGTTACTGGGTCATCAACGCCAAGGAAGGGAAGGGCAGTGTGAACTATGACGGTAAAGACAAATGTGATGTGACCTTCACAATCAACGATGAGGATGTCGCTGATCTGATATCCGGAAAACTGAATCCCCAGAAGGCATTTTTCCAAGGAAAAATCAAAATCCAGGGTAACATGGGACTCGCTATGAAGCTGACCGATCTCCAGAGATCCGCTGCTGGCAGGATTGAAGCAATCCGCTCCAAACTATAA

Protein sequence:

>DPOGS201015-PA
MPRKVFVVGVGMTKFVKPNSGMDYPDLGKEAVEAALADARIKYQDVQQAICGYVFGDSTSGQRVLYQIGMTGIPIYNVNNNCSTGSNALFLGKQLVEGGVSDVILALGFEKMTPGALGNGTFSDRTNPLDRHTLKMADMAELTAAPMTAQYFGNAGAEHMKKYGTTEVHIAKIAAKNHRHGAKNPRAQGGREYTVEEVLNSRRIYGPLTKLECCPTSDGAAAAILMSEEAVIRYGLQNKAVEIIGMEMATDTEAVFNENSLMKVAGYDMTGLAAKRLYEKTGISPMQVDVVELHDCFATNELITYEGLQLCGEGEAGKFIDAGDNTYGGRCVVNPSGGLIAKGHPLGATGLAQCAELVWQLRGEAGDRQVPRARIALQHNLGLGGAVVITMYRKGFSSASPNQVAAIAANPENFKVYKYMKILEEAMKSDEDKLIEKVRGVYGFKVRNGPNGEEGYWVINAKEGKGSVNYDGKDKCDVTFTINDEDVADLISGKLNPQKAFFQGKIKIQGNMGLAMKLTDLQRSAAGRIEAIRSKL-
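
The record's lengths and sequences can be cross-checked with a short script. The sketch below is illustrative only and is not part of the database. It assumes the standard genetic code, that the trailing "-" in the protein record marks the stop codon, and that the genomic coordinates are 1-based and inclusive. The helper names (`translate`, `expected_cds_length`) are hypothetical.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 0; stop codons become stop_symbol."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    residues = []
    for pos in range(0, usable, 3):
        aa = CODON_TABLE[cds[pos:pos + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def expected_cds_length(protein_aa, has_stop=True):
    """Nucleotides needed to encode protein_aa residues (plus a stop codon)."""
    return 3 * (protein_aa + (1 if has_stop else 0))


# First 30 and last 12 nucleotides of DPOGS201015-TA, copied from the record.
first_codons = "ATGCCTAGGAAAGTTTTCGTAGTTGGTGTT"
last_codons = "TCCAAACTATAA"

print(translate(first_codons))     # MPRKVFVVGV
print(translate(last_codons))      # SKL-
print(expected_cds_length(536))    # 1611
print(244544 - 242420 + 1)         # 2125 (genomic span, assuming inclusive coordinates)
```

Under these assumptions, the translated ends match the start and end of DPOGS201015-PA, and 536 residues plus a stop codon account for the full 1611 bp transcript. The genomic span (2125 bp) is longer than the transcript, which would be consistent with intronic sequence, although the record itself does not list the exon structure.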