Monarch geneset OGS2.0

DPOGS210103
Transcript: DPOGS210103-TA, 1269 bp
Protein: DPOGS210103-PA, 422 aa
Genomic position: DPSCF300017 + 853929-862509
RNAseq coverage: 502x (Rank: top 25%)
Annotation
Heliconius: HMEL013357, 0.0, 80.19%
Bombyx: BGIBMGA012681-TA, 6e-99, 84.26%
Drosophila: CG10585-PA, 1e-133, 61.19%
EBI UniRef50: UniRef50_Q9VP87, 2e-131, 61.19%, CG10585 n=28 Tax=Endopterygota RepID=Q9VP87_DROME
NCBI RefSeq: XP_970126.1, 1e-146, 71.20%, PREDICTED: similar to candidate tumor suppressor protein [Tribolium castaneum]
NCBI nr blastp: gi|91084147, 2e-145, 71.20%, PREDICTED: similar to candidate tumor suppressor protein [Tribolium castaneum]
NCBI nr blastx: gi|91084147, 8e-141, 71.20%, PREDICTED: similar to candidate tumor suppressor protein [Tribolium castaneum]
Group
Gene Ontology: GO:0008299, 1.2e-05, isoprenoid biosynthetic process
KEGG pathway: tca:658669, 3e-146
    K12505 (PDSS2) maps-> Terpenoid backbone biosynthesis
InterPro domain: [38-420] IPR017446, 2.8e-150, Polyprenyl synthetase-related
    [272-417] IPR008949, 5e-32, Terpenoid synthase
Orthology group: MCL11473, Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210103-TA
ATGAGTTTAAGTCGTTTGGGAGGTGCTCTGAAGTTTGATTCGGTGAAAGTTAGTCAGTGTAGTGCATTAGTTGCAAAACTTTGGGGACCTTGTATTCGGGAAGTGAACACGAGCAGTTCAAATCGGGCTGTTGCACAACACAACAGTAAACCAGATTGGAATAGGGCCGTTAGCGAGGCTGAGAAGATCGTTGGCTATCCGACCTCCTTCCTTAGCCTGCGATGGGTGCTCAGTGATGAAATTGCGAATGTCGCTTTGCACTTACGAAAACTTGTTGGAAGTAATCATCCGTTGCTCAAAACCGCAAAGAATCTCATATACAATGGTAAGAATAACATGCAGGCGTGGGGGCTGATAGTGCTGCTAGTTTCCAAGGCTGCGGGACACAGTCCGGAAATACCAGATATGGAACAGGACAAAGCAGCGGGAGTATTGCACAGCCAGCGCGCCCTGGCGGAGGTGACTGAGATGATCCGCACGTCTCACTTGGTACACAAAGGTCTGGTCAATATGAACACAAGGCTCGGCCCTGGAGAGCCCGACGACATGATGTTTGGAAACAAAATCGCATTACTCAGCGGGGACTACCTCCTGGCCAACTCCTGTACTGAACTGGCTAATTTGAGGAACCAGGAATTGGTAGAGCTTATGTCGTCAGCAGTGCGAGATTTGGCTGAGGCCGAATTCCTCGGGGAGAGAGACGAACAGAACAACCCGCTACCATCACGACCACTGCCTCACCACCAGAGAGAAGAGGCTTCAGAATGGGACTGCGTACTCTCCCCGCTGCCAATGGCGGGTGTGTCAGGGTGTATGGGCAGGGAGTGGAGCGCCCGCCACGTGTTGGCGGCCGGCGCACTCCTCGGGAAGAGCTGCTCGGCCGCTCTCAAGCTGGCCGGTCACGGTCAGGGGCTACAGACACAGGGTTATCTTTTCGGTTGCCACTTGGCGCTAGCGTGGCAGGCCTTCCTAGACCTGGAGGCGTTCTCTGGTCCGGAGCCCGCTTGCTTCTCGCTAGTGGGAGCTCCCCTCGCCTTCACCCTCGAAGAACGTCCCGAGCTCTACCGGTACATAGAGGCTGGTAGGCGGAGTGTTCACGACGTGGACTACCACGCGCTGTACCAGGCCGTGCTGGAGGGGACCGGTATCGAGCAGACGAAACATCTCCAGAATGAACACGTGACTCGCGCCAGGGAGGTGCTGGACTCCTTCCCCAACTGTGACGCACGGACGGCCCTCACTAACATCATAGTGGCCATGTTACCATAA

Protein sequence:

>DPOGS210103-PA
MSLSRLGGALKFDSVKVSQCSALVAKLWGPCIREVNTSSSNRAVAQHNSKPDWNRAVSEAEKIVGYPTSFLSLRWVLSDEIANVALHLRKLVGSNHPLLKTAKNLIYNGKNNMQAWGLIVLLVSKAAGHSPEIPDMEQDKAAGVLHSQRALAEVTEMIRTSHLVHKGLVNMNTRLGPGEPDDMMFGNKIALLSGDYLLANSCTELANLRNQELVELMSSAVRDLAEAEFLGERDEQNNPLPSRPLPHHQREEASEWDCVLSPLPMAGVSGCMGREWSARHVLAAGALLGKSCSAALKLAGHGQGLQTQGYLFGCHLALAWQAFLDLEAFSGPEPACFSLVGAPLAFTLEERPELYRYIEAGRRSVHDVDYHALYQAVLEGTGIEQTKHLQNEHVTRAREVLDSFPNCDARTALTNIIVAMLP-
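
Consistency check (illustrative sketch, not part of the original record):

The reported lengths agree with each other: 1269 bp corresponds to 422 codons plus one stop codon (3 x 423 = 1269), and the protein record writes the stop as a trailing "-". The Python sketch below shows one way to check a transcript against its protein. It assumes the standard genetic code applies, and the helper name translate is hypothetical. Only the first and last few codons of DPOGS210103-TA are used here; the same call could be run on the full sequence.

```python
from itertools import product

# Standard genetic code (assumed to apply here), with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol, matching the trailing "-"
    used in the protein record above.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First six codons and last three codons of DPOGS210103-TA.
print(translate("ATGAGTTTAAGTCGTTTG"))
print(translate("TTACCATAA"))
```

The two fragments should translate to the start (MSLSRL) and end (LP-) of DPOGS210103-PA.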