Monarch geneset OGS2.0

DPOGS202104
Transcript: DPOGS202104-TA (1488 bp)
Protein: DPOGS202104-PA (495 aa)
Genomic position: DPSCF300150 - 378731-391148
RNAseq coverage: 300x (Rank: top 37%)
Annotation
Heliconius: HMEL014594, E-value 2e-172, identity 95.56%
Bombyx: BGIBMGA006896-TA, E-value 3e-151, identity 88.40%
Drosophila: Prp31-PA, E-value 0.0, identity 70.20%
EBI UniRef50: UniRef50_E0W0J5, E-value 0.0, identity 68.29%, U4/U6 small nuclear ribonucleoprotein Prp31, putative n=1 Tax=Pediculus humanus corporis RepID=E0W0J5_PEDHC
NCBI RefSeq: XP_969081.1, E-value 0.0, identity 71.77%, PREDICTED: similar to AGAP012142-PA [Tribolium castaneum]
NCBI nr blastp: gi|332017446, E-value 0.0, identity 72.91%, U4/U6 small nuclear ribonucleoprotein Prp31 [Acromyrmex echinatior]
NCBI nr blastx: gi|332017446, E-value 0.0, identity 75.00%, U4/U6 small nuclear ribonucleoprotein Prp31 [Acromyrmex echinatior]
Group:
KEGG pathway: tca:657532, E-value 0.0
 K12844 (PRPF31) maps-> Spliceosome
InterPro domain:
 [339-468] IPR019175, E-value 1.4e-47, Prp31 C-terminal
 [190-336] IPR002687, E-value 1.2e-41, Pre-mRNA processing ribonucleoprotein, snoRNA-binding domain
 [95-147] IPR012976, E-value 1.8e-20, NOSIC
Orthology group: MCL11660 Multiple-copy universal gene

Nucleotide sequence:

>DPOGS202104-TA
ATGTCTCTTGCTGATGAGCTATTGGCTGACTTAGAAGAAAATGATGACGGAGAGCTTGAAGCTATAATTGAGAATAAAACTGCAGATTCTCACGAGTTTGCTGTACCCTTTCCTGTGATACCTAAAGAAGAAGAAATAAAAAATGTATCAATTCGAGAATTGGCTAAATTAAGAGATTCAGATCGGCTTAAACGGGTTGTAGCAGAAGTAGAGCAAAATGCGGGTAATAAAAGAAAGAAAATTGAGGTTGTTGGTTTAATGGAATCTGATCCTGAATATCAATTAATAGTTGAAGCTAATAATATAGCAGTTGAAATTGATGGTGAAATTGCTACTATTCACAGGTTTGTTCGGGATAAATATCAGAAAAGGTTTCCAGAGCTGGAGTCATTGATTGTAACACCATTAGAATATATCCGTACTGTAAAGGAGTTAGGAAATGACCTTGACAAAGCTAAGAATAATGAGATTCTTCAAAGTTTTCTCACTCAGGCAACTATTATGATAGTGTCCGTCACTGCTTCCACAACACAAGGAAAATTATTGTCAGATCATGAACTGAGTGAAATCTTTGAAGCATGTGATATGGCTGCAGAGTTGAATAATTTTAAATCAAATATCTACGAGTACGTTGAGAGCAGGATGACTTTCATAGCTCCAAACATAACAGCTATTGTTGGTGCATCAACAGCAGCGAAAATTCTTGGAGTGGCAGGTGGTCTATCCAAGCTGTCCAAAATGCCAGCATGCAATGTTCTGCCACTTGGACAGCAAAAGAAGACGCTGTCTGGCTTCTCCCAAGCCGCTTCACTACCTCATACTGGCTTTATATACTTTTCTCAAATAGTACAAGATACAACTCCTGAATTGAGATACAAAGCAGCTAAGCTTGTATCAACAAAATTAACTCTGGCGGCTAGAGTTGATGCTTGCCATGAAAGTACAGATGGTGCCATTGGTCGGTCATTGAGGGAAGGAATAGAGAAGAAATTAGACAAATTACAGGAACCGCCTCCAGTGAAGTTCGTGAAGCCGCTTCCAAAGCCGATTGAACAGAGTAGGAAGAAACGTGGCGGGAAACGTGTGAGGAAGATGAAGGAGCGATACGCCATGACGGAGTTCAGGAAGAACGCCAACAGACTCAACTTCGCTGACATCGAAGACGACGCTTATCAAGAAGACCTGGGGTACACTCGTGGTACGATCGGGAAATCTAGAACGGGTCGCGTCCGCCTGCCTCAAATAGACGAGAAGACCAAAGTTCGCATCAGCAAAACCTTGCAAAAGAACCTGCAAAAACAAAACCAGCAGTACGGCGGGGCTACGAGTATAAGAAGACAAGTGTCAGGAACGGCCTCCTCGGTGGCCTTCACGCCTTTGCAGGGTCTCGAGATAGTGAATCCTCAGGCCGCTGAGACGAGAGTGAATGAAGCGAACGCGAAATACTTTTCAAATACCTCTGGATTCCTATCGGTTGGAAAGACTTAA

Protein sequence:

>DPOGS202104-PA
MSLADELLADLEENDDGELEAIIENKTADSHEFAVPFPVIPKEEEIKNVSIRELAKLRDSDRLKRVVAEVEQNAGNKRKKIEVVGLMESDPEYQLIVEANNIAVEIDGEIATIHRFVRDKYQKRFPELESLIVTPLEYIRTVKELGNDLDKAKNNEILQSFLTQATIMIVSVTASTTQGKLLSDHELSEIFEACDMAAELNNFKSNIYEYVESRMTFIAPNITAIVGASTAAKILGVAGGLSKLSKMPACNVLPLGQQKKTLSGFSQAASLPHTGFIYFSQIVQDTTPELRYKAAKLVSTKLTLAARVDACHESTDGAIGRSLREGIEKKLDKLQEPPPVKFVKPLPKPIEQSRKKRGGKRVRKMKERYAMTEFRKNANRLNFADIEDDAYQEDLGYTRGTIGKSRTGRVRLPQIDEKTKVRISKTLQKNLQKQNQQYGGATSIRRQVSGTASSVAFTPLQGLEIVNPQAAETRVNEANAKYFSNTSGFLSVGKT-
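
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, not part of the database record: it assumes the standard genetic code and that the trailing "-" in the protein sequence marks the stop codon. The names translate, CODON_TABLE and stop_symbol are hypothetical helpers introduced only for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons enumerated
# in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol so the output can be compared
    directly with a protein record that ends in "-".
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First six codons and last six codons of DPOGS202104-TA, copied from the record.
print(translate("ATGTCTCTTGCTGATGAG"))
print(translate("TCGGTTGGAAAGACTTAA"))
```

Passing the full DPOGS202104-TA sequence to translate and comparing the result with DPOGS202104-PA extends the same check to the whole record. The reported lengths are also consistent with each other: 495 codons plus one stop codon give 496 x 3 = 1488 bp.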