Monarch geneset OGS2.0

DPOGS215731
Transcript        DPOGS215731-TA  2664 bp
Protein           DPOGS215731-PA  887 aa
Genomic position  DPSCF300041 + 462108-470478
RNAseq coverage   411x (Rank: top 30%)
Annotation
Heliconius      HMEL009659        0.0  73.21%
Bombyx          BGIBMGA003602-TA  0.0  77.09%
Drosophila      CG9346-PA         0.0  49.33%
EBI UniRef50    UniRef50_Q7Q7V4   0.0  46.78%  AGAP005006-PA n=7 Tax=Endopterygota RepID=Q7Q7V4_ANOGA
NCBI RefSeq     XP_315111.4       0.0  46.78%  AGAP005006-PA [Anopheles gambiae str. PEST]
NCBI nr blastp  gi|328793609      0.0  54.73%  PREDICTED: u2 snRNP-associated SURP motif-containing protein-like [Apis mellifera]
NCBI nr blastx  gi|380026763      0.0  51.41%  PREDICTED: LOW QUALITY PROTEIN: U2 snRNP-associated SURP motif-containing protein-like [Apis florea]
Group
Gene Ontology  GO:0006396  3.4e-20  RNA processing
               GO:0003723  3.4e-20  RNA binding
               GO:0000166  3e-18    nucleotide binding
               GO:0003676  1.5e-17  nucleic acid binding
KEGG pathway  aga:AgaP_AGAP005006  0.0
              K12842 (SR140)  maps-> Spliceosome
InterPro domain  [322-376]  IPR000061  3.4e-20  SWAP/Surp
                 [153-241]  IPR012677  3e-18    Nucleotide-binding, alpha-beta plait
                 [163-239]  IPR000504  1.5e-17  RNA recognition motif domain
                 [430-569]  IPR006569  2.2e-16  RNA polymerase II, large subunit, CTD
                 [424-577]  IPR008942  2.1e-06  ENTH/VHS
Orthology group  MCL13424  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS215731-TA
ATGAGTAGTCGTGGTTTATCTAAAAAAGAGATAGAAGAATTAAGGAAGAAAGAAGAAGAGGAAGCTGCGGCGCACGTATTTAAGGAATTTGTTGAAACATTCCAAGAAGTGCCGAGCACCACATCCAAGGTCTGGGTCAAGGCTGGAACTTATGATGCCGGGGCGAGAAAGGAAGATACATCAGAGCGTGGTAAGCTCTACAAGCCCACATCACGCCTGGAGGAGAAGCGTAGTGCCAGCGAGGCGGATGTGGTCCGTAGCCTGGCTCGGTCCGACCCGCCCGGACGACCCAAGAAGAAGAGCGGAGACAAGAAGAAAAGCAACCTAGAACTGTTCAAGGAGGAGCTCAGGCAGATCCAGGAGGAGCGCTCAGAACGTCACAAGTACAAGAACGTGCTGAGGGACCGAGGAGTGGGGGTACCCGAGCCAGTGATTGACGTCATCCCCGACGTGGGCTCCTACGACACTGGCGACCCCAACACCACCAACCTGTACCTGGGGAACTTGAACCCAAAGATAACAGAGCAGCAGTTGATGGAGATCTTCGGTCGGTACGGTCCGCTGGCCAGCATCAAGATCATGTGGCCTCGCTCCGACGAGGAGAAGGCGCGCGGCAGGAACTGCGGGTTCGTGGCGTTCATGTCCAGGAAGGACGGAGAGAGGGCGCTGAGGTGCATCAATGGAAAGGAGATAATGAACTATGAGATGAAGCTGGGTTGGGGCAAGGCGGTGGTGATTCCTCCGGTGCCTATATACATCCCGCCCTCCCTCCAGCAACCCTGCAAGCCTCCCCCGCCCTCCGGGCTGCCTTTCAACGCTCAGCCGCCCAGACACCTCGCCAACAAGATACCGAGGATCCGTCCGGGTGAATACTACCCGAGCGACTCCGGCGACAAACAACTGTACGATCAGATATTATCCCAATCCATAGTCAAAGTCGTCATTCCGACTGAAAGGAACATCCTGATGCTGATCCACCGTATGGTGGAGTTCGTGATCCGCGAGGGGCCGATGTTCGAGGCCATCATCATGAACAAGGAGATGAACAACCCCTACTTCCGGTTCCTGTTCGAGAACCAGTCGCCCGCACACGTGTACTACAGGTGGAAACTGTTCTCCATGCTGCAGGGGGACTCGCCCAAGTCCTGGAACCTCGAGGACTTCAGGATGTTCAAAGGTGGTTCCGTGTGGCGTCCGCCCGTCATGAACCTGTACACGGCCGGCATGCCCGACGAGCTCGTGGACGAAGAGGACGCCAAGGAAAACATCCGAGGAACGCTCTCCAACAATCAGCGCGACCGTCTGGAGGAGTTGATCCGCAACCTGTCGCCGGCCCGCCGCAGCGTGGGCGAGGCCATGGCGTGGTGCCTGGAGCACGCGGAGGCGGCGGGCGAGGTGCCGTGCTGCGTGAGCGAGGCGCTGTCCCAGCCGCGCACCACGCCCGCCCGCCGCGTGGCCCGCCTCTACCTGCTGTCCGACATCCTGCACAACGCCGGCGCCAAGCTCACTAACGCCAGCGCCTATCGCGGAGCGTTCCAATCTCGTCTGGTGGACATCATGCGCGAGTGTCGCGTGGCGTGGACCCGGATGTCGTCTCGCATGCAGCAGGAGGGCTTCCGGGCGCGCGTCACACGCATCCTGCAGGCCTGGGCCGACTGGGCCGTCTACCCCACAGACTTTCTGCTGCACATCAACGACGTCTTCCTCGGACAGAATAAGGAGGGCGAGACGCGTCCGACCCTTGAGGTCGATCGTGACGAGGGGGACGAGGACGGTAACGCCTCCCCGGGGTCCTCCGCCTCGGGCGCCTCTGGCGGCTCCGGGGGTTCCGGGGGCTCCGGCCCGCTGGACGGAGCCGCGCTCAGGAGACTCGCTGAACAGAGACCGCCACAACTTAATATATCTGGCCTTCAATTGCTGCTGTATGACGGTGTGCCGGTTGACGAAGACATAGACGGCGTTCCGCTGGACACGGAGGAGTGTTCTGCGGCCGAGGGCTC
GAGCGTCGGTCGCAGTACGGCCGCGTTCGTGCCCTCGCGCTGGGAGAGCGTCGACCCCGCGCCCTCCACACACTCCAGGGACGACTCGCCGCCCAAAGACAGCTCTGATGCGGAGCGTTCGAGTCAATTCACTAGCACGGGTGTAGGGGGGGTGGGGGGAGACTGCGGCGAGGGGGCGCTGAGGAGAGAAACACTGAGGGATATAGAGGTCCGGGTGTTGAAGTACGCGGACGAGTTGGAGGCGGGCCTCAGGCCTAGCAAGTCAGGCCTGCCACACGCGCAGCAGATACAACAACACAGGAAGAAACTCATCAGGAAGGCTCTCCGTGAGGCGAAGGAGGCTCGCGAGGCCGAGGAGCCCCTCTCCCCAGAGGACGACACATTCTCCACCAGCTCCCGGAAGTCGAAGAAGTCGCGGGAACCGTCCATGTCGCCGCCCACCAAGAGGAACAGGCCGGGAGATCGAAAATCAAGATCAAAGTCGAGATCACGCGAACGAGATAGAGAACGAGAGAGGTCTCGAGATAGAGATCGCGAGAGAGAAAGAGAAAGAGACAGAGATAGAGAACGCGACAGAGACAGAGATAGGGAGAGACGGAGGCGGTCGCCGAGCACGCCGCCCCATCACAGGAAACACGCGAAGCACAGTAAATATAAGTACTAG

Protein sequence:

>DPOGS215731-PA
MSSRGLSKKEIEELRKKEEEEAAAHVFKEFVETFQEVPSTTSKVWVKAGTYDAGARKEDTSERGKLYKPTSRLEEKRSASEADVVRSLARSDPPGRPKKKSGDKKKSNLELFKEELRQIQEERSERHKYKNVLRDRGVGVPEPVIDVIPDVGSYDTGDPNTTNLYLGNLNPKITEQQLMEIFGRYGPLASIKIMWPRSDEEKARGRNCGFVAFMSRKDGERALRCINGKEIMNYEMKLGWGKAVVIPPVPIYIPPSLQQPCKPPPPSGLPFNAQPPRHLANKIPRIRPGEYYPSDSGDKQLYDQILSQSIVKVVIPTERNILMLIHRMVEFVIREGPMFEAIIMNKEMNNPYFRFLFENQSPAHVYYRWKLFSMLQGDSPKSWNLEDFRMFKGGSVWRPPVMNLYTAGMPDELVDEEDAKENIRGTLSNNQRDRLEELIRNLSPARRSVGEAMAWCLEHAEAAGEVPCCVSEALSQPRTTPARRVARLYLLSDILHNAGAKLTNASAYRGAFQSRLVDIMRECRVAWTRMSSRMQQEGFRARVTRILQAWADWAVYPTDFLLHINDVFLGQNKEGETRPTLEVDRDEGDEDGNASPGSSASGASGGSGGSGGSGPLDGAALRRLAEQRPPQLNISGLQLLLYDGVPVDEDIDGVPLDTEECSAAEGSSVGRSTAAFVPSRWESVDPAPSTHSRDDSPPKDSSDAERSSQFTSTGVGGVGGDCGEGALRRETLRDIEVRVLKYADELEAGLRPSKSGLPHAQQIQQHRKKLIRKALREAKEAREAEEPLSPEDDTFSTSSRKSKKSREPSMSPPTKRNRPGDRKSRSKSRSRERDRERERSRDRDRERERERDRDRERDRDRDRERRRRSPSTPPHHRKHAKHSKYKY-
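
Note on cross-checking the two sequences: the record lists an 887 aa protein and a 2664 bp transcript, which is consistent with a coding sequence of 887 codons plus one stop codon (3 x 888 = 2664). The sketch below shows one way to verify this kind of record. It assumes the transcript is a plain coding sequence translated with the standard genetic code and that the trailing "-" in the protein marks the stop codon; neither is stated in the record itself. The helper names (read_fasta, translate, expected_cds_length) are hypothetical and not part of any MonarchBase tool.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
# Assumption: this record uses the standard code (not stated in the record).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def read_fasta(text):
    """Parse FASTA text into {record_id: sequence}, joining wrapped lines."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol (the record above uses "-");
    codons containing unexpected characters are written as "X".
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def expected_cds_length(protein_length):
    """Length in bp of a CDS encoding protein_length residues plus a stop codon."""
    return 3 * (protein_length + 1)
```

To check the full record, the two FASTA blocks can be passed through read_fasta and the translation of DPOGS215731-TA compared with DPOGS215731-PA. The first eight codons of the transcript (ATG AGT AGT CGT GGT TTA TCT AAA) translate to MSSRGLSK, which matches the start of the listed protein.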