Monarch geneset OGS2.0

DPOGS202375
Transcript        DPOGS202375-TA    810 bp
Protein           DPOGS202375-PA    269 aa
Genomic position  DPSCF300104 + 195911-198397
RNAseq coverage   1566x (Rank: top 8%)
Annotation
Heliconius      HMEL002895        1e-139  96.71%
Bombyx          BGIBMGA013898-TA  1e-138  95.88%
Drosophila      Prosalpha5-PA     2e-107  75.72%
EBI UniRef50    UniRef50_B0WTL4   1e-109  78.33%  Proteasome subunit alpha type n=35 Tax=Eukaryota RepID=B0WTL4_CULQU
NCBI RefSeq     NP_001040146.1    1e-135  95.47%  proteasome zeta subunit [Bombyx mori]
NCBI nr blastp  gi|114050993      2e-134  95.47%  proteasome zeta subunit [Bombyx mori]
NCBI nr blastx  gi|114050993      8e-126  95.47%  proteasome zeta subunit [Bombyx mori]
Group
Gene Ontology    GO:0051603  8.3e-57  proteolysis involved in cellular protein catabolic process
                 GO:0004298  8.3e-57  threonine-type endopeptidase activity
                 GO:0005839  8.3e-57  proteasome core complex
                 GO:0004175  3.4e-14  endopeptidase activity
                 GO:0019773  3.4e-14  proteasome core complex, alpha-subunit complex
                 GO:0006511  3.4e-14  ubiquitin-dependent protein catabolic process
KEGG pathway     tca:661893  3e-124
                 K02729 (PSMA5)  maps-> Proteasome
InterPro domain  [31-220]  IPR001353  8.3e-57  Proteasome, subunit alpha/beta
                 [8-30]    IPR000426  3.4e-14  Proteasome, alpha-subunit, conserved site
Orthology group  MCL14217  Single-copy universal gene

Nucleotide sequence:

>DPOGS202375-TA
ATGTTCCTAACCCGTTCTGAATATGACCGTGGAGTCAATACCTTCAGCCCCGAGGGAAGGTTGTTCCAAGTAGAATACGCGATAGAAGCCATTAAGCTTGGATCGACCGCGATCGGCATCGGTACTTCTGAAGGAGTCGTATTAGCAGTAGAGAAGAGAATTACATCTCCTTTGATGGAACCAACAACTATTGAGAAAATAGTGGAAGTAGATCGTCATGTTGCGTGTGCTGTATCAGGGTTGATGGCGGATTCTAGAACATTGATAGAGAGGGCTCGTGTTGAATGTCAGAATCATTGGTTTGTGTACAACGAGCGTATGAGTGTGGAGTCATGTGCGCAGGCCGTGTCCAACCTTGCTATCCAGTTTGGTGACTCTGACGACGACAGTCGCACAGCCATGTCTAGGCCTTTTGGTGTGGCCGTCATGTTTGCAGGTATTGATGAGAAGGGGCCTCAACTATTCCACATGGACCCGAGTGGTACATTTGTGCAATATGATGCTAAGGCTATCGGCTCCGGCAGTGAAGGCGCTCAGCAGAGCTTGAAGGAAGTGTACCACAAGTCGATGACATTGAAAGAAGCTATCAAGTCGGCTCTAACCATTCTGAAGCAAGTTATGGAGGAGAAATTGTCCGAGAATAATGTTGAAGTTGTTACAATGACACCCGATTCATTATTCCATATGTTCACTAGAGAGCAGCTAGCCGAGTTGATATCGGCCATCCCGGAGCTTCAGTGGAGTGAGACGAGCTCCAGGAACACACTACAGGACCTCTCCAAACTCACCCTCGGCAAACAGACGGTGTGA

Protein sequence:

>DPOGS202375-PA
MFLTRSEYDRGVNTFSPEGRLFQVEYAIEAIKLGSTAIGIGTSEGVVLAVEKRITSPLMEPTTIEKIVEVDRHVACAVSGLMADSRTLIERARVECQNHWFVYNERMSVESCAQAVSNLAIQFGDSDDDSRTAMSRPFGVAVMFAGIDEKGPQLFHMDPSGTFVQYDAKAIGSGSEGAQQSLKEVYHKSMTLKEAIKSALTILKQVMEEKLSENNVEVVTMTPDSLFHMFTREQLAELISAIPELQWSETSSRNTLQDLSKLTLGKQTV-
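
The lengths quoted in this record can be cross-checked with a little arithmetic. The sketch below is illustrative only: the helper names `cds_matches_protein` and `span_length` are hypothetical, and it assumes that the trailing `-` in the protein sequence marks the stop codon and that the genomic coordinates are 1-based and inclusive, neither of which the record states explicitly.

```python
def cds_matches_protein(cds: str, protein: str) -> bool:
    """Return True if the CDS length equals 3 * (residues + 1 stop codon).

    Assumption: a trailing '-' or '*' in the protein is a stop marker,
    not a residue.
    """
    residues = protein.rstrip("-*")
    return len(cds) == 3 * (len(residues) + 1)


def span_length(start: int, end: int) -> int:
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Figures quoted in this record.
transcript_bp = 810
protein_aa = 269

# 269 residues plus one stop codon account for all 810 bases.
print(transcript_bp == 3 * (protein_aa + 1))

# Genomic span on DPSCF300104; it exceeds the transcript length,
# which would be consistent with introns in the gene model.
print(span_length(195911, 198397))
```

Under those assumptions, passing the full `DPOGS202375-TA` and `DPOGS202375-PA` sequences to `cds_matches_protein` gives the same check directly on the sequence data.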