Monarch geneset OGS2.0

DPOGS205919
Transcript        DPOGS205919-TA  1257 bp
Protein           DPOGS205919-PA  418 aa
Genomic position  DPSCF300089 + 406255-410027
RNAseq coverage   969x (Rank: top 13%)
Annotation
Heliconius      HMEL021237        0.0  100.00%
Bombyx          BGIBMGA007082-TA  0.0  98.43%
Drosophila      U2af50-PB         0.0  85.79%
EBI UniRef50    UniRef50_Q24562   0.0  85.79%  Splicing factor U2AF 50 kDa subunit n=26 Tax=Bilateria RepID=U2AF2_DROME
NCBI RefSeq     NP_001040494.1    0.0  98.43%  U2 small nuclear ribonucleoprotein auxiliary factor 2 [Bombyx mori]
NCBI nr blastp  gi|114052735      0.0  98.43%  U2 small nuclear ribonucleoprotein auxiliary factor 2 [Bombyx mori]
NCBI nr blastx  gi|114052735      0.0  97.61%  U2 small nuclear ribonucleoprotein auxiliary factor 2 [Bombyx mori]
Group
Gene Ontology    GO:0006397  3.7e-159  mRNA processing
                 GO:0005634  3.7e-159  nucleus
                 GO:0003723  3.7e-159  RNA binding
                 GO:0000166  2e-21     nucleotide binding
                 GO:0003676  1.2e-17   nucleic acid binding
KEGG pathway     tca:663317  0.0
                 K12837 (U2AF2)  maps-> Spliceosome
InterPro domain  [4-418]    IPR006529  3.7e-159  U2 snRNP auxilliary factor, large subunit, splicing factor
                 [206-287]  IPR012677  2e-21     Nucleotide-binding, alpha-beta plait
                 [211-284]  IPR000504  1.2e-17   RNA recognition motif domain
Orthology group  MCL12982  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS205919-TA
ATGGGTGAAGATAAAGTAGATCGCCGACGTAGGTCGCGGTCCCGAGAACGTCGCCGCTCACGTTCCCGGTCCAGAGAACGCCGTGAGAAAAGAAGAAGCAAGTCGAGGTCTCCTTCTAAAAGATCACGCAGACGCAAGCCCTCTCTATACTGGGATGTACCACCTCCCGGGTTCGAACACATTACACCATTACAATACAAGGCCATGCAGGCAGCAGGCCAAATCCCGGCAAACATAGTAGCAGATACACCACAGGCTGCTGTGCCAGTAGTAGGGTCCACAATAACTAGACAAGCAAGAAGATTATATGTAGGGAACATACCATTTGGTGTCACAGAAGAGGAAACAATGGAGTTCTTTAATCAGCAAATGCATCTTTCTGGTTTGGCTCAAGCTGCGGGCAACCCTGTTTTAGCGTGTCAGATAAACTTAGATAAAAACTTTGCATTCCTTGAGTTTAGATCTATTGATGAGACTACACAGGCCATGGCATTTGATGGCATTAATTTTAAGGGTCAAAGTTTGAAAATAAGGCGACCTCATGACTATCAACCAATGCCAGGAACCGAAAACCCAGCAATCAATGTACCTGCTGGTGTTATCAGTACTGTAGTTCCAGATTCACCCCATAAAATCTTTATTGGAGGTCTTCCTAACTATCTTAATGAAGATCAAGTGAAAGAACTTCTGATGTCATTTGGTCAGCTGCGAGCTTTCAACTTGGTGAAGGATTCTTCGACGGGCCTAAGCAAGGGTTATGCCTTTGCTGAATATGTTGACATTTCTATGACTGATCAGGCTATCGCTGGTTTGAATGGCATGCAGCTGGGTGACAAGAAACTCATTGTCCAACGGGCAAGCATTGGAGCAAAGAACTCGACATTAGCTATGACAGGGGCTGCTCCGGTGACTCTTCAAGTGGCAGGGCTGACATTGGCTGGTGCAGGCCCTGCCACAGAGGTACTCTGCCTCCTGAACATGGTTACACCGGATGAGCTTCGAGACGAAGAGGAGTATGAAGACATTTTGGAAGATATCAAAGAGGAATGCAACAAATATGGGTGTGTGCGTAGTATAGAAATTCCAAGGCCTATTGAAGGCGTCGAAGTGCCTGGTTGTGGAAAGGTATTTGTCGAATTTAACAGCATCGCAGATTGCCAGAAAGCTCAACAAACATTGACAGGCAGAAAATTCAGCAACCGCGTCGTAGTTACCTCTTACTTCGACCCCGACAAATATCACCGCAGAGAGTTTTAA

Protein sequence:

>DPOGS205919-PA
MGEDKVDRRRRSRSRERRRSRSRSRERREKRRSKSRSPSKRSRRRKPSLYWDVPPPGFEHITPLQYKAMQAAGQIPANIVADTPQAAVPVVGSTITRQARRLYVGNIPFGVTEEETMEFFNQQMHLSGLAQAAGNPVLACQINLDKNFAFLEFRSIDETTQAMAFDGINFKGQSLKIRRPHDYQPMPGTENPAINVPAGVISTVVPDSPHKIFIGGLPNYLNEDQVKELLMSFGQLRAFNLVKDSSTGLSKGYAFAEYVDISMTDQAIAGLNGMQLGDKKLIVQRASIGAKNSTLAMTGAAPVTLQVAGLTLAGAGPATEVLCLLNMVTPDELRDEEEYEDILEDIKEECNKYGCVRSIEIPRPIEGVEVPGCGKVFVEFNSIADCQKAQQTLTGRKFSNRVVVTSYFDPDKYHRREF-
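
The transcript and protein entries above are related by translation: 1257 bp is 419 codons, that is, 418 amino acids plus one stop codon, which the protein record writes as a trailing "-". A minimal sketch of that relationship follows, assuming the standard genetic code (NCBI translation table 1) applies to this nuclear gene. The names `CODON_TABLE` and `translate` are illustrative and not part of the database.

```python
from itertools import product

# Standard genetic code (NCBI table 1) in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are rendered as `stop_symbol` ("-" by default, matching
    the convention used in the protein record above). Codons containing
    anything other than A, C, G, T raise KeyError.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First eight codons and last three codons of DPOGS205919-TA.
print(translate("ATGGGTGAAGATAAAGTAGATCGC"))
print(translate("GAGTTTTAA"))
```

The two fragments are copied from the start and end of the nucleotide record; passing the full DPOGS205919-TA sequence to `translate` should give a string that can be compared directly with DPOGS205919-PA.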