Monarch geneset OGS2.0

DPOGS213206
Transcript         DPOGS213206-TA    1002 bp
Protein            DPOGS213206-PA    333 aa
Genomic position   DPSCF300114 + 212053-214698
RNAseq coverage    402x (Rank: top 30%)
Annotation
Heliconius       HMEL010299         1e-141   70.77%
Bombyx           BGIBMGA007399-TA   3e-139   68.95%
Drosophila       CG3436-PA          6e-95    49.10%
EBI UniRef50     UniRef50_Q96DI7    3e-96    50.56%   U5 small nuclear ribonucleoprotein 40 kDa protein n=31 Tax=Amniota RepID=SNR40_HUMAN
NCBI RefSeq      XP_625093.1        1e-110   58.00%   PREDICTED: similar to CG3436-PA, isoform A [Apis mellifera]
NCBI nr blastp   gi|383864429       3e-110   58.29%   PREDICTED: U5 small nuclear ribonucleoprotein 40 kDa protein-like [Megachile rotundata]
NCBI nr blastx   gi|383864429       2e-107   58.29%   PREDICTED: U5 small nuclear ribonucleoprotein 40 kDa protein-like [Megachile rotundata]
Group
Gene Ontology      GO:0005515   3.6e-54   protein binding
KEGG pathway       ame:552715   3e-110
                   K12857 (SNRNP40, PRP8BP) maps-> Spliceosome
InterPro domain    [53-330]   IPR011046   3.6e-54   WD40 repeat-like-containing domain
                   [49-327]   IPR015943   1.3e-53   WD40/YVTN repeat-like-containing domain
                   [93-128]   IPR019781   1e-07     WD40 repeat, subgroup
                   [89-128]   IPR001680   1.3e-07   WD40 repeat
                   [72-86]    IPR020472   4.3e-06   G-protein beta WD-40 repeat
Orthology group    MCL13719   Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS213206-TA
ATGCCGGAACTGGAAATAAAAAGAAAAGCTGAAGAATATGCTGTGGTACCGGCTAAGAAAACTCGACATGAAATATCTGTTGTTGGCACGAGAGAAAAGGCTGTAGTAACCTCCTCGGTACCCAGAACATCAAATCTTTATGCACCTATCATGCTGTTAGAAGGACACCAAGGCGAAATTTTCACAGCAAAGTTTCATCCTGAAGGCAAACATTTGGCTTCGGCAGGCTTTGACAGACAGATATTACTTTGGAATGTGTATGGGCAGTGTGACAATGTAATGGTAATGAAAGGTCACACGGGTGCCATAATGGAGTTATGTTTTTCACCTGACGGTGCTCACATGTACACATGTGCCACAGACAACTCAGTGGCGGTATGGGATGTCCCCACTGGAGTTAGGATCAAGAAATTAAAAGGACATGCAAATTTTGTTAATTCTGTATCAGGTTATTTTTTCAAGAAAGAGCAACAAACATCCACACAAACTCACACTAATATTAGTATAATATATATATACATAAATAATGATAGTTCTACAATTTTAATTACCCATTCCACCAATCAGCATGATAAATACAATTCTAGATACAAATTCCCCTACATTCAAAGATTTTTGCAAGAACAGGACACACACACCAAGTGTTTATCCTTATCATATGATGGATCCTACTTGCTCTCCAACTCGAGTGACTCCACTCTTCGGATATGGGATGTTCGCCCGTTCGCTCCATCAGAACGTTGTGTGAAACTCCTGTCAGGACACTCGCACAACTTCGAGAAGAACCTGCTCAGATGCTGCTGGTCTCCGGATGGGTCTAAGGTGGCGGCGGGATCTTCAGATCGTTTCCTGTACGTGTGGGACACCACATCAAGACGAGTCCTGTACAAGCTTCCCGGCCACAATGGTTCCGTCAACGACGTACATTTCCACAGTCGTGAACCGATAGTACTTTCAGCCTCCAGCGATAAACAGATATACCTGGGTGAAATTGATAACTGA

Protein sequence:

>DPOGS213206-PA
MPELEIKRKAEEYAVVPAKKTRHEISVVGTREKAVVTSSVPRTSNLYAPIMLLEGHQGEIFTAKFHPEGKHLASAGFDRQILLWNVYGQCDNVMVMKGHTGAIMELCFSPDGAHMYTCATDNSVAVWDVPTGVRIKKLKGHANFVNSVSGYFFKKEQQTSTQTHTNISIIYIYINNDSSTILITHSTNQHDKYNSRYKFPYIQRFLQEQDTHTKCLSLSYDGSYLLSNSSDSTLRIWDVRPFAPSERCVKLLSGHSHNFEKNLLRCCWSPDGSKVAAGSSDRFLYVWDTTSRRVLYKLPGHNGSVNDVHFHSREPIVLSASSDKQIYLGEIDN-
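
Consistency check (illustrative, not part of the original record): by the lengths stated above, the 1002 bp transcript corresponds to 334 codons, that is 333 amino acids plus one stop codon. The trailing "-" in the protein sequence appears to mark that stop (TGA). The sketch below assumes the standard genetic code and that the transcript is coding sequence only, in frame from its first base. The function name translate and the stop_symbol parameter are hypothetical helpers introduced here, not part of any database tool.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence; stops are written as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First and last six codons of DPOGS213206-TA, copied from the record above.
print(translate("ATGCCGGAACTGGAAATA"))  # MPELEI
print(translate("GGTGAAATTGATAACTGA"))  # GEIDN-
```

Passing the full DPOGS213206-TA sequence to the same function should, under these assumptions, reproduce the DPOGS213206-PA sequence including its final "-".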