Monarch geneset OGS2.0

DPOGS215010
Transcript: DPOGS215010-TA, 1599 bp
Protein: DPOGS215010-PA, 532 aa
Genomic position: DPSCF300256 + 172354-177053
RNAseq coverage: 444x (Rank: top 28%)
Annotation
Heliconius: HMEL010175  0.0  78.44%
Bombyx: BGIBMGA012191-TA  0.0  70.25%
Drosophila: U4-U6-60K-PA  1e-154  49.57%
EBI UniRef50: UniRef50_O43172  3e-152  50.28%  U4/U6 small nuclear ribonucleoprotein Prp4 n=80 Tax=Coelomata RepID=PRP4_HUMAN
NCBI RefSeq: XP_974218.2  2e-180  57.76%  PREDICTED: similar to wd-repeat protein [Tribolium castaneum]
NCBI nr blastp: gi|307168954  1e-180  57.61%  U4/U6 small nuclear ribonucleoprotein Prp4 [Camponotus floridanus]
NCBI nr blastx: gi|383856641  7e-178  58.24%  PREDICTED: U4/U6 small nuclear ribonucleoprotein Prp4-like [Megachile rotundata]
Group
Gene Ontology: GO:0005515  7.6e-74  protein binding
    GO:0008380  1.3e-15  RNA splicing
KEGG pathway: tca:663064  7e-180
    K12662 (PRPF4, PRP4)  maps-> Spliceosome
InterPro domain: [211-528]  IPR015943  7.6e-74  WD40/YVTN repeat-like-containing domain
    [210-528]  IPR011046  1.8e-70  WD40 repeat-like-containing domain
    [85-137]  IPR003648  1.3e-15  Splicing factor motif
    [90-119]  IPR014906  8.2e-13  Pre-mRNA processing factor 4 (PRP4)-like
    [490-527]  IPR019781  6.7e-11  WD40 repeat, subgroup
    [404-443]  IPR001680  3.8e-09  WD40 repeat
Orthology group: MCL14108  Single-copy universal gene
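
The InterPro rows above each carry a residue range, an accession, a score and a description. As a minimal sketch of how such a row might be split into fields, the Python below relies on InterPro accessions having the form IPR plus six digits. The function name parse_domain is hypothetical, and the residue ranges are assumed to be 1-based and inclusive, which the record does not state.

```python
import re

# Pattern for one InterPro row. Whitespace between fields is optional so the
# same pattern handles rows whose fields are run together.
DOMAIN_RE = re.compile(
    r"\[(\d+)-(\d+)\]\s*"        # residue range, e.g. [211-528]
    r"(IPR\d{6})\s*"             # InterPro accession: IPR + six digits
    r"(\d+(?:\.\d+)?e-\d+)\s*"   # score in scientific notation, e.g. 7.6e-74
    r"(.+)$"                     # free-text description
)


def parse_domain(line):
    """Return (start, end, accession, score, description) for one row.

    Hypothetical helper for illustration; raises ValueError if the row
    does not match the expected layout.
    """
    match = DOMAIN_RE.search(line.strip())
    if match is None:
        raise ValueError(f"unrecognised InterPro row: {line!r}")
    start, end, accession, score, description = match.groups()
    return int(start), int(end), accession, float(score), description


rows = [
    "[211-528] IPR015943 7.6e-74 WD40/YVTN repeat-like-containing domain",
    "[85-137] IPR003648 1.3e-15 Splicing factor motif",
    "[404-443] IPR001680 3.8e-09 WD40 repeat",
]

for row in rows:
    start, end, accession, score, description = parse_domain(row)
    # Length assumes inclusive coordinates (an assumption, see above).
    print(accession, end - start + 1, description)
```

Sorting the parsed tuples by start position gives a quick view of how the domains are laid out along the 532 aa protein.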
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS215010-TA
ATGTCTGACGATGAAGTAGTGGCGGTGAAAAAGCCAAAACTGTATTATGGGTCTCTGGAGGAGCAGGAGAAGGCTCGTCTGGCAGCTCTGGCGGCTGCTGCCAGGGAGGGAGTCAAAGAAAGTGCCAAAGAAACTGGTGACATACAAATTTCCAATGAATACATGGAGCTAGAAGATGAGATAACAAAAGACAAAAAGGCATTGCTAGAGGAGTTTGAACGGAGGAGAAAAGCTCGTCAGTTGAATGTATCAACGGACGATGACGAGGTTAGACGGAGTCTCCGGCAGCTCGGTGAGCCTGTGTGTCTGTTCGGGGAAGGCCCAGCCGAGAGGAGGGTCCGGTTAAGGGACTTGCTCAGCTATCTAGGTGAGGATGCCATCCACAAGGCCCTGGAAGAGGAGGAGGCCCGCCTGGAGAGGGACCGGGGCCGGGAGGGGACCTGGTACCACGAAGGCCCCGCGGCGCTGAGGAGGGCGAGGATCGATATAGCCAGGTTCTCACTGCCGAGGGCCAAGCAAAGACTGGCCCAAGCTCGCTCAGAGTTGGAACTGGCCGGCAGCGTGCGAGCGGCCGCCAAGCAGGACGCTCAGAGGAAGGCCGCGGCTAACTCCATATATTGCAGTCAGATCGGTGACACGAGGCCTATAAGCTTCTGCAGGTTCAGTTCGGACAGTAAAATGCTCATAACATCGAGCTGGTCGGGCGTGTGCCGCGTGTGGTCGGTCCCTGGGTGTGTGGAGGTCCAGACGTTGTTGGGACACACGGGGAACGTCAGCTCTGCGACCTTCCACCCGAAGGCGATGATGCCGCATCATCTGCAGCTCAAGGCGGAAAAGGGGGAGAAGTCTGAGGATAAATCCGAGGATATGTCCGTGGATGTGTCGGACGCGTCGCATAACGTCGCGATGGCTTCCAGCGGATATGACGGCAGCGTGTTCCTGTGGAACTTTGTCAGCGAGTCTCCGCTGGCGTCCTTGCCCGGCCACGGCCCGGCCCGCGTGTCCAGGGTGGAGTTCCATCCGTCAGGTCGCTACCTGGCCGCCACGGTCTTCGATCACTCGTGGAGGCTGTGGGATCTGGAAACACAGACCGAGGTCCTTCACCAGGAAGGTCACGCCAAGCCGGTGTACAGCGTAGCCTTCCAGTGCGACGGGTCCCTGGCGGTGACCGGTGGAATGGACTCTTTCGGGCGCGTTTGGGACCTTAGGACGGGTCGCTGTGTGATGTTCCTCGAGGGTCACCTCGGCCCCGTGCTGGGGGTGGACTGGGCCCCCGCGGGTCACCAGCTCGCCACGGCCGCCGCCGATCACCAGGCGAAGATCTGGGACCTGAGGCGCCGGTCGTCCATATACACCATCCCTGCGCACACGCACCTCATCAGCGACATTCGTTATCAACGCACCCAGGGTCACTTCCTGTTGACCTCGTCCTATGACCACTCCGCCAAGCTGTGGTCCAACCCCGCCTGGCACCCGCTGAGGACACTCTCCGGACACGACAACAAGGTGATGAGCTGTGATATTTCACCCGACAATAAGTACATAGCGACCAGCTCCTACGACAGAACATTCAAGCTCTGGGCTCCGGACATGGCTTAA

Protein sequence:

>DPOGS215010-PA
MSDDEVVAVKKPKLYYGSLEEQEKARLAALAAAAREGVKESAKETGDIQISNEYMELEDEITKDKKALLEEFERRRKARQLNVSTDDDEVRRSLRQLGEPVCLFGEGPAERRVRLRDLLSYLGEDAIHKALEEEEARLERDRGREGTWYHEGPAALRRARIDIARFSLPRAKQRLAQARSELELAGSVRAAAKQDAQRKAAANSIYCSQIGDTRPISFCRFSSDSKMLITSSWSGVCRVWSVPGCVEVQTLLGHTGNVSSATFHPKAMMPHHLQLKAEKGEKSEDKSEDMSVDVSDASHNVAMASSGYDGSVFLWNFVSESPLASLPGHGPARVSRVEFHPSGRYLAATVFDHSWRLWDLETQTEVLHQEGHAKPVYSVAFQCDGSLAVTGGMDSFGRVWDLRTGRCVMFLEGHLGPVLGVDWAPAGHQLATAAADHQAKIWDLRRRSSIYTIPAHTHLISDIRYQRTQGHFLLTSSYDHSAKLWSNPAWHPLRTLSGHDNKVMSCDISPDNKYIATSSYDRTFKLWAPDMA-
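
The two sequences above are related by translation: the 1599 bp transcript holds 533 codons, that is, 532 amino acids plus a stop codon, which the protein record shows as a trailing "-". The Python sketch below illustrates that relationship on the first and last codons of DPOGS215010-TA. It assumes the standard nuclear genetic code, which the record does not state explicitly, and the function name translate is hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
# Assumption: the record was translated with this table.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol (the protein record above ends
    in "-"); codons containing unexpected characters become "X".
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First eight and last four codons of DPOGS215010-TA, copied from the record.
start_fragment = translate("ATGTCTGACGATGAAGTAGTGGCG")
end_fragment = translate("GACATGGCTTAA")
print(start_fragment, end_fragment)

# Length check: 1599 bp -> 533 codons -> 532 residues plus one stop codon.
print(1599 // 3 - 1)
```

Passing the full nucleotide sequence to the same function should reproduce the DPOGS215010-PA sequence if the standard-code assumption holds.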