Monarch geneset OGS2.0

DPOGS200286
Transcript | DPOGS200286-TA | 1089 bp
Protein | DPOGS200286-PA | 362 aa
Genomic position | DPSCF300026 - 619131-620219
RNAseq coverage | 1050x (Rank: top 12%)
Annotation
Heliconius | HMEL002036 | 0.0 | 95.30%
Bombyx | BGIBMGA005560-TA | 2e-173 | 92.65%
Drosophila | Pros54-PA | 1e-118 | 61.82%
EBI UniRef50 | UniRef50_P55035 | 2e-116 | 61.82% | 26S proteasome non-ATPase regulatory subunit 4 n=98 Tax=Coelomata RepID=PSMD4_DROME
NCBI RefSeq | NP_001091810.1 | 6e-172 | 92.65% | proteasome 26S non-ATPase subunit 4 [Bombyx mori]
NCBI nr blastp | gi|261335991 | 0.0 | 95.30% | putative proteasome 26S non ATPase subunit 4 [Heliconius melpomene]
NCBI nr blastx | gi|261335991 | 0.0 | 95.30% | putative proteasome 26S non ATPase subunit 4 [Heliconius melpomene]
Group
Gene Ontology | GO:0005515 | 4.5e-09 | protein binding
GO:0006281 | 1.4e-07 | DNA repair
GO:0006355 | 1.4e-07 | regulation of transcription, DNA-dependent
GO:0008270 | 1.4e-07 | zinc ion binding
KEGG pathway | ame:409609 | 2e-133
K03029 (PSMD4, RPN10) maps-> Proteasome
InterPro domain | [2-176] IPR002035 | 4.5e-09 | von Willebrand factor, type A
[10-142] IPR007198 | 1.4e-07 | Ssl1-like
Orthology group | MCL10447 | Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200286-TA
ATGGTCTTGGAAAGTACTATGATTTGTGTAGACAACAGTGATTACATGAGAAATGGAGATTTTCTTCCAACAAGACTGCAAGCTCAGCAAGATGCTGTTAATTTAGTGTGTCATTCCAAAACACGGTCTAATCCGGAAAATAATGTTGGTTTACTGACTTTGGCCAATGTGGAAGTACTGGCCACTTTAACTAGCGACGTTGGCAGAATACTCTCAAAGCTTCACCGTGTTCAACCCAATGGGGACATCAATATACTTACTGGTATAAGGATTGCACATCTGGCTTTAAAACACCGACAGGGAAAAAATCATAAAATGCGTATTGTTGTCTTTGTTGGCTCTCCTATTAATACCGATGAGAAGGAACTGGTGAAATTGGCTAAAAGACTCAAAAAAGAGAAGGTTACTTGTGATGTTGTATCATTTGGTGAGGATTCTGAGAATAATCCCCTATTGACTACATTTATAAACACATTGAATGGTAAGGATAATACATCTGGAGGCAGTCATCTTGTCTCAGTTCCAGCTGGTGGATGTGTTGTACTTTCTGAAGCATTGATCTCTAGTCCAATAATTGGTGGAGATGGTGCTGGCCCTTCTGGCTCAGGCTTATCACCATTCGAGTTTGGTGTAGATCCTAATGAAGATCCTGAGCTCGCCTTAGCTTTAAGAGTGTCCATGGAAGAGCAACGACAGAGGCAAGAAGAAGAGTCCCGTCGTCAACAAACAAATGCTGAAGGCGAAGCTGGAAAAACTGGAGAGCCTCAAAACACTGGCATGGAAAGGGCTCTAGCAATGTCATTGGGGAGAGAAGCCATGGAGTTATCGGAGGAGGAACAGATTGCTCTAGCTATGCAGATGAGCATGCAGCAAGATGCACCACAAGCTGAAGAGAGTATGGATGTGTCAGAAGAATATGCTGAGGTTATGAACGATCCTGCATTCCTGCAAAGCGTGCTTGAGAACCTACCTGGAGTAGATCCACAAAGTGAAGCAATTCGTAATGCCATGTCAACTATTAAGAAAGATAAAGATGAAAAAGATGATAAGGATCAGAAGGACAAAGATGGTAATGGGCCAAGCTCTTAA

Protein sequence:

>DPOGS200286-PA
MVLESTMICVDNSDYMRNGDFLPTRLQAQQDAVNLVCHSKTRSNPENNVGLLTLANVEVLATLTSDVGRILSKLHRVQPNGDINILTGIRIAHLALKHRQGKNHKMRIVVFVGSPINTDEKELVKLAKRLKKEKVTCDVVSFGEDSENNPLLTTFINTLNGKDNTSGGSHLVSVPAGGCVVLSEALISSPIIGGDGAGPSGSGLSPFEFGVDPNEDPELALALRVSMEEQRQRQEEESRRQQTNAEGEAGKTGEPQNTGMERALAMSLGREAMELSEEEQIALAMQMSMQQDAPQAEESMDVSEEYAEVMNDPAFLQSVLENLPGVDPQSEAIRNAMSTIKKDKDEKDDKDQKDKDGNGPSS-
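
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, not part of the original record, assuming the standard genetic code (NCBI translation table 1), 1-based inclusive genomic coordinates, and that the trailing "-" in the protein record marks the stop codon. The function name `translate` and the variable names are hypothetical. Only the first 30 and last 15 nucleotides of DPOGS200286-TA are used, to keep the example short.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
# (third position varies fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence; stops are written as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# Length bookkeeping from the record (coordinates assumed 1-based, inclusive).
start, end = 619131, 620219
span = end - start + 1          # genomic span in bp
codons = span // 3              # includes the stop codon
print(span, codons - 1)         # 1089 362

# First 30 nt and last 15 nt of DPOGS200286-TA, copied from the record.
head = "ATGGTCTTGGAAAGTACTATGATTTGTGTA"
tail = "GGGCCAAGCTCTTAA"
print(translate(head))          # MVLESTMICV
print(translate(tail))          # GPSS-
```

The genomic span equals the listed transcript length (1089 bp), and 363 codons minus the stop give the listed 362 aa. The translated fragments agree with the start (MVLESTMICV) and end (GPSS-) of DPOGS200286-PA.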