Monarch geneset OGS2.0

DPOGS208546
Transcript         DPOGS208546-TA   684 bp
Protein            DPOGS208546-PA   227 aa
Genomic position   DPSCF300064 + 940871-942733
RNAseq coverage    932x (Rank: top 14%)
Annotation
Heliconius         HMEL008763         9e-133   99.56%
Bombyx             BGIBMGA010333-TA   3e-131   98.24%
Drosophila         U2af38-PA          7e-108   96.83%
EBI UniRef50       UniRef50_Q01081    6e-91    83.60%   Splicing factor U2AF 35 kDa subunit n=64 Tax=Opisthokonta RepID=U2AF1_HUMAN
NCBI RefSeq        XP_001606160.1     3e-113   83.13%   PREDICTED: similar to U2 snrnp auxiliary factor, small subunit [Nasonia vitripennis]
NCBI nr blastp     gi|156543322       6e-112   83.13%   PREDICTED: splicing factor U2af 38 kDa subunit-like [Nasonia vitripennis]
NCBI nr blastx     gi|91089827        2e-119   92.14%   PREDICTED: similar to AGAP002956-PA [Tribolium castaneum]
Group
Gene Ontology      GO:0005634   4.3e-170   nucleus
                   GO:0003723   4.3e-170   RNA binding
                   GO:0000166   1e-24      nucleotide binding
                   GO:0003676   6.1e-22    nucleic acid binding
                   GO:0008270   1.6e-07    zinc ion binding
KEGG pathway       nvi:100113925   9e-113
                   K12836 (U2AF1) maps to: Shigellosis; Spliceosome
InterPro domain    [1-227]    IPR009145   4.3e-170   U2 auxiliary factor small subunit
                   [43-148]   IPR012677   1e-24      Nucleotide-binding, alpha-beta plait
                   [69-146]   IPR003954   6.1e-22    RNA recognition motif domain, eukaryote
                   [15-38]    IPR000571   1.6e-07    Zinc finger, CCCH-type
                   [95-143]   IPR000504   1.4e-06    RNA recognition motif domain
Orthology group    MCL15143   Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS208546-TA
ATGGCCGAATATTTAGCGGCTATATTTGGTACTGAAAAGGATAAGGTAAATTGCTCATTCTATTTTAAAATAGGAGCTTGTCGACATGGTGATCGATGTTCGAGAATTCATAACAAACCAACTTTTTCGCAAACTGTGTTGCTACAAAATTTATATGTAAATCCACAAAACTCAGCCAAGTCTGCTGATGGCAGTCATTTGGTTGTGGCAAATGTATCGGACGAGGAGATGCAAGAGCACTATGATAACTTTTTTGAAGATGTTTTTGTTGAATGTGAAGATAAATATGGTGAAATTGAGGAAATGAATGTATGTGACAACCTCGGAGATCACTTAGTAGGCAATGTTTATATTAAGTTTCGCCGAGAGGAAGATGCCGAGAAGGCTGTAAATGACCTCAATAACCGCTGGTTTGGAGGCCGGCCTGTTTATGCTGAGTTGTCTCCTGTAACTGACTTCCGTGAAGCTTGCTGTCGTCAGTATGAAATGGGGGAATGTACCAGGAGTGGCTTTTGTAATTTCATGCACTTGAAACCAATTTCAAGAGAACTGAGGAGATACCTGTACGCTCGTCGTAAGGGCGGCCGCCGGTCCAGGTCCAGGTCTCGTGAGCGTCGCCGGCGATCGAGATCGCGCGACCGTCGCCGCGAGCCGCCTCGCAACTCGCGCTCTGGACGGTACTGA
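
The transcript and protein lengths in this record are consistent with a complete open reading frame: 684 bp is 228 codons, i.e. 227 residues plus one stop codon. The sketch below shows one way to check that relationship by translating the coding sequence. It assumes the standard genetic code (NCBI translation table 1) and that the transcript starts in frame at its first base; the names `CODON_TABLE` and `translate` are illustrative, not part of any database tooling. Stop codons are written as "-" to match the protein record.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate an in-frame coding sequence; stops are written as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))
    # Codons containing ambiguous bases (e.g. N) are reported as "X".
    protein = "".join(CODON_TABLE.get(codon, "X") for codon in codons)
    return protein.replace("*", stop_symbol)


# First 18 and last 15 nucleotides of DPOGS208546-TA, copied from this record.
print(translate("ATGGCCGAATATTTAGCG"))  # MAEYLA
print(translate("TCTGGACGGTACTGA"))     # SGRY-

# Length check: 684 bp = 228 codons = 227 residues plus one stop codon.
assert 684 == 3 * (227 + 1)
```

Passing the full DPOGS208546-TA sequence to `translate` should give a string that can be compared directly with the DPOGS208546-PA entry.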

Protein sequence:

>DPOGS208546-PA
MAEYLAAIFGTEKDKVNCSFYFKIGACRHGDRCSRIHNKPTFSQTVLLQNLYVNPQNSAKSADGSHLVVANVSDEEMQEHYDNFFEDVFVECEDKYGEIEEMNVCDNLGDHLVGNVYIKFRREEDAEKAVNDLNNRWFGGRPVYAELSPVTDFREACCRQYEMGECTRSGFCNFMHLKPISRELRRYLYARRKGGRRSRSRSRERRRRSRSRDRRREPPRNSRSGRY-
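
The InterPro domain ranges listed in this record can be mapped back onto the protein sequence. The sketch below extracts the residues for the CCCH-type zinc finger range [15-38], assuming the coordinates are 1-based and inclusive (the record does not state this explicitly). The helper name `domain_slice` is hypothetical, and only the first 40 residues of the protein are used to keep the example short.

```python
import re

# N-terminal 40 residues of DPOGS208546-PA, copied from this record.
N_TERM = "MAEYLAAIFGTEKDKVNCSFYFKIGACRHGDRCSRIHNKP"


def domain_slice(protein: str, start: int, end: int) -> str:
    """Return residues start..end, assuming 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(protein):
        raise ValueError("domain range falls outside the sequence")
    return protein[start - 1:end]


# Zinc finger, CCCH-type (IPR000571), annotated at [15-38].
zinc_finger = domain_slice(N_TERM, 15, 38)
print(zinc_finger)       # KVNCSFYFKIGACRHGDRCSRIHN
print(len(zinc_finger))  # 24

# The slice has a C-x8-C-x5-C-x3-H spacing of cysteines and histidine.
print(bool(re.search(r"C.{8}C.{5}C.{3}H", zinc_finger)))  # True
```

The three cysteines and one histidine in that spacing are consistent with the CCCH-type annotation, which supports reading the ranges as 1-based inclusive positions.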