Monarch geneset OGS2.0

DPOGS209458
Transcript        DPOGS209458-TA  2220 bp
Protein           DPOGS209458-PA  739 aa
Genomic position  DPSCF300275 + 127173-133508
RNAseq coverage   628x (Rank: top 20%)
Annotation
Heliconius       HMEL002597        5e-166  65.89%
Bombyx           BGIBMGA005838-TA  0.0     57.58%
Drosophila       CG6686-PA         2e-134  44.46%
EBI UniRef50     UniRef50_B0WPY3   3e-139  44.19%  U4/U6.U5 tri-snRNP-associated protein 1 n=3 Tax=Culicinae RepID=B0WPY3_CULQU
NCBI RefSeq      XP_392863.2       3e-147  46.18%  PREDICTED: similar to CG6686-PB, isoform B isoform 1 [Apis mellifera]
NCBI nr blastp   gi|322786965      4e-149  44.64%  hypothetical protein SINV_02571 [Solenopsis invicta]
NCBI nr blastx   gi|383861326      1e-165  46.81%  PREDICTED: U4/U6.U5 tri-snRNP-associated protein 1-like [Megachile rotundata]
Group
KEGG pathway     ame:409347  9e-147
                 K11984 (SART1, HAF, SNU66) maps-> Spliceosome
InterPro domain  [1-733] IPR005011  7.7e-172  SART-1 protein
Orthology group  MCL11158  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209458-TA
ATGGGTTCGAAGAAACATAAAAAAGAATCGAAAAAGAGGAAGCACAGAAGTAGATCCAGGTCTCCTTTGGATGGTGAAGAGCGTGAACGAAAGAGGCACAGGAAACACAAGGATCGCAAGAAAGATCGCTCTCCGGATGTGGAGGAGGTGCCGGTGGACTCGCACCTGGAGCGCGAGGCGGAACCCTCGCCCGCGCCCAGCAGCAGCTCCAGCGCTAGGAGACATGAAGGAAGAGACAAAAGCCAGGAGTCGGAACGTGACTCGTCACCCACGCCCGCCTCGGGGTCGGCTCAGGAGAGTCTCTCCATAGAAGAGACCAACAAACTGCGAGCCAAGCTGGGGCTGAAACCACTTGAAGTGCCAGAGAAACCTGCCGACGACGGTAAGTTCAAGGATGACCTGGGAGAGTTCTACCACGTGCCCGCCTCCAACATCGCCGCGCAGAAGAAAGCGGACAAGTTGAGAGAGAAGCTCGCCGAGAGGAGGGACAAGAGGGACATCGACAACAAGTTACAGACCTCCCTGCTGGCCGAGGGTTCCGATGATGAAGACGCCTCCGCCTGGGTCAGGAAGACGCGGGACATGGAGAAACAGAAGCAGGAAGCCGCCAAGAGAGCGGCTCTGCTGGACGAGATGGACGAGGTGTTCGGTGTGGGCGCGCTCGTGGCGGACGACCAGCACCGAGACCGACAGGAGGCCTACCACCAGGAACATCTCAAGGGGCTGCGGGTGGCGCACACTCTGGACGCGTTGCCCGAGGAGCGCGAGACGGTGTTGACGCTGGCGGACAAGGAGGTGCTAGCGGACGACGACGAGGATGTACTCGTCAATGTGAACATCGTGGACGACGAGAAATATAAAAAGAACATCGAGGAGCGCAAGAAGGCCCGCACGGGCTACCAGGCCTACGACGAGGAGGCCGACATACAGGCCGCCCTGGGGTACAGCAGACCCGTGCTGGCCAAGTACGACGACGAGATAGAACCCAGCAAGGGAGACAAGACCAGGGGCTTCTTCATAGGAGACGAGGACGCGCTCATGGAGCAGAAACTGAAGGACATGATGCGTGCGGAACTGATAGCGGGCGGTCCGGATAAAGTGCTGGAATCGCTCCAGAGCACCGGCCTCAGACCCGCCAGCGACTACCTGCAGCCCGACGAGCTCCAGGCGAGGTTCAAGAAAGTTAAGAAGAAGGGCAAAATACGTAAGAAGGCCAAGCAGGAGCCCATAGACGTGGAGGAGCACGAGGCGGGCTCCGTGCCTCTGGACACCGATGACACGGAGATCAGCCAGGAGGTGACTGCGCCTGTTCTGGACGAAGACGAAGTGGAGGTGGACACGGAGTTACAGGCGGCGCTCCATCGCGCCAGGAGACTGAGGCAGGGAGGGGACCAACACAGGACTCCTAAGCTGGAGGAGATCCTCCAACAAATTAAAGAAGAGAAAACGGAAGAGAGTCAAGAAGCCGGGGGCAGCATGGTGCTGGATGCCACCGCCGAGTTCTGTCGCACGCTCGGGGACATACCGACATACGGCCTGGCGGGGAACAGGGAACATACCGCCGAGATCATGGACTTCGATCGCGAGGAAGCCGAGCCGGAGCCGGAGAGCGGAGCCAGCGGCGGCGCCTGGAGCAGGGTCGACGTGCGCACAGACCGGCCTGCCGACCTCGAGCATGAAGGAGCGTCGGGCGCCGGGCTGGAGGCAGAGCCCGCACTGGGGGCCGGCGTGGCGGGCGCGCTGCGACTGGCGCTCAGCAAGGGCTATCTGGAGAGAGACGGCGCACTGCCCGCTCCTAGACCGACACGCTCCTCACTGGCCGCGCTCGCCGCGCTGCACTACTCCATCGAGGACAAGACTTACGGCGAGGACGACAAGTACGGTCGGCGCGAGAGAGGAGGTCACTCGGGACCTCTGAGCGAGTTCAGGGAGAAGAGTGACTTCAGACCCGACATCAAGCTGGAGTACGTCGACGACGACGGGCGACCGCTCTGTCCCAA
GGAGGCCTTCCGCTACCTCTCCCACAAGTTCCACGGCAAGGGACCCGGCAAGAACAAGCAGGAGAAGAGAATCAAAAAGGCCGTGCAGGAGGGACTGATGAAGAAGATGAGTTCCACGGACACGCCTCTCAACACGTTACAGATGCTGCAGCAGAAACAACGAGAGACACAGTCGCCCTTCGTGGTGCTCAGCGGCGCCAAGAGAGACGCGCCCAACTGA

Protein sequence:

>DPOGS209458-PA
MGSKKHKKESKKRKHRSRSRSPLDGEERERKRHRKHKDRKKDRSPDVEEVPVDSHLEREAEPSPAPSSSSSARRHEGRDKSQESERDSSPTPASGSAQESLSIEETNKLRAKLGLKPLEVPEKPADDGKFKDDLGEFYHVPASNIAAQKKADKLREKLAERRDKRDIDNKLQTSLLAEGSDDEDASAWVRKTRDMEKQKQEAAKRAALLDEMDEVFGVGALVADDQHRDRQEAYHQEHLKGLRVAHTLDALPEERETVLTLADKEVLADDDEDVLVNVNIVDDEKYKKNIEERKKARTGYQAYDEEADIQAALGYSRPVLAKYDDEIEPSKGDKTRGFFIGDEDALMEQKLKDMMRAELIAGGPDKVLESLQSTGLRPASDYLQPDELQARFKKVKKKGKIRKKAKQEPIDVEEHEAGSVPLDTDDTEISQEVTAPVLDEDEVEVDTELQAALHRARRLRQGGDQHRTPKLEEILQQIKEEKTEESQEAGGSMVLDATAEFCRTLGDIPTYGLAGNREHTAEIMDFDREEAEPEPESGASGGAWSRVDVRTDRPADLEHEGASGAGLEAEPALGAGVAGALRLALSKGYLERDGALPAPRPTRSSLAALAALHYSIEDKTYGEDDKYGRRERGGHSGPLSEFREKSDFRPDIKLEYVDDDGRPLCPKEAFRYLSHKFHGKGPGKNKQEKRIKKAVQEGLMKKMSSTDTPLNTLQMLQQKQRETQSPFVVLSGAKRDAPN-
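
The lengths and coordinates listed in this record can be cross-checked with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical and not part of any database tooling, and it assumes the transcript is coding sequence only, ending in a single stop codon (consistent with the trailing TGA in the nucleotide sequence and the terminal "-" in the protein sequence).

```python
def cds_matches_protein(transcript_bp: int, protein_aa: int, stop_codon: bool = True) -> bool:
    """Return True if the transcript length equals 3 nt per residue (plus one stop codon)."""
    codons = protein_aa + (1 if stop_codon else 0)
    return transcript_bp == 3 * codons


def genomic_span(start: int, end: int) -> int:
    """Length of an inclusive, 1-based coordinate range."""
    return end - start + 1


# Values copied from the record above.
print(cds_matches_protein(2220, 739))   # True: (739 + 1) * 3 == 2220
print(genomic_span(127173, 133508))     # 6336
```

Under the same assumption, the gap between the 6336 bp genomic span and the 2220 bp transcript would correspond to intronic sequence.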