Monarch geneset OGS2.0

DPOGS210363
Transcript: DPOGS210363-TA, 1233 bp
Protein: DPOGS210363-PA, 410 aa
Genomic position: DPSCF300025 + 462194-465737
RNAseq coverage: 187x (Rank: top 49%)
Annotation
Heliconius       HMEL013828         1e-158   93.19%
Bombyx           BGIBMGA011920-TA   0.0      83.29%
Drosophila       Clp-PA             4e-101   60.27%
EBI UniRef50     UniRef50_D6WBZ0    3e-98    65.45%   Putative uncharacterized protein n=1 Tax=Tribolium castaneum RepID=D6WBZ0_TRICA
NCBI RefSeq      NP_001040511.1     9e-151   89.89%   cleavage and polyadenylation specific factor 4 [Bombyx mori]
NCBI nr blastp   gi|114052376       2e-149   89.89%   cleavage and polyadenylation specific factor 4 [Bombyx mori]
NCBI nr blastx   gi|114052376       6e-158   88.97%   cleavage and polyadenylation specific factor 4 [Bombyx mori]
Group
Gene Ontology:
  GO:0003723   1.7e-26   RNA binding
  GO:0005730   1.7e-26   nucleolus
  GO:0008270   1.1e-07   zinc ion binding
  GO:0003676   1.1e-07   nucleic acid binding
KEGG pathway:
  cqu:CpipJ_CPIJ005818   2e-58
  K12845 (SNU13, NHP2L) maps-> Spliceosome
InterPro domain:
  [290-303]   IPR002415   1.7e-26   H/ACA ribonucleoprotein complex, subunit Nhp2, eukaryote
  [303-393]   IPR004038   4.6e-26   Ribosomal protein L7Ae/L30e/S12e/Gadd45
  [317-331]   IPR018492   6e-20     Ribosomal protein L7Ae/L8/Nhp2 family
  [63-89]     IPR000571   1.1e-07   Zinc finger, CCCH-type
  [244-265]   IPR013084   4.7e-06   Zinc finger, CCHC retroviral-type
Orthology group: MCL12264 (Single-copy universal gene)
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210363-TA
ATGGAAGTTATAGTGGCAAACGTGGATCATATAAAATTTGACATCGATTATGCTTTAGAGCAGCAGTATGGAGCCCTGCCTTTGCCCTTTCCTGGAATGGATAAATCTACTGCAGCAGTCTGCGAATTCTACAGCCAACCCGGGGGATGTGGTAATGGACCTCAGTGCCCTTATCGTCATGTGAGGGGAGACCGAACAGTAGTATGTAAGCATTGGCTGAGAGGTCTCTGCAAGAAAGGCGACCAGTGTGAATTTCTACACGAATATGACATGACCAAAATGCCTGAGTGCTATTTTTATGCTAGATTCAACGCTTGCCACAATAAGGAGTGTCCATTTCTTCATATAGATCCAGAAAGCAAAATTAAAGATTGTCCATGGTATGATCGTGGGTTTTGTAGACATGGTCCACATTGTAGACATCGCCATGTCAGAAGAGTTCTCTGTATAAATTACTTGGCAGGCTTCTGCCCTGATGGTGCAAACTGCAAGTACATGCACCCACGGTTTGAATTACCAGCTCCTCCAGAACAGACAAAGGATGCTAAAAGACTTCCCGTCTGCCACTACTGTTCAGAAGTAGGACATAAGGCTTCCACTTGTCATAAGATTCCTCCTGATCAAAGAGAAGTCGCCCAAAAACAGGAGGAGGCACGTTATAGAGCCTTGGGCTATGTCAAGCCTGCTGTAGATGGTGAAGAACTGAGATTACAAAGACTGATCCACAAACCTTTAGAGGAAGTGACTTGTTTCAAGTGCGGTACAAAGGGACATTATGCCAACAAGTGCCCCAAAGGTCACCTGGCCTTCCTATCAAATCAACCCCCTCCCGGCAACCCAAATGCTGAATCCGAAGCGTCTGTTAATCCCAAAGCCTATCCTTTAGCTGACACGGCTCTAACAGCTAAAATTTTAAACCTCGTGCAGCAAGCGGCTAACTACAAACAGTTGCGTAAAGGTGCCAATGAAGCCACCAAGACCTTGAACAGAGGACTGTCCGAGTTCGTCATTATGGCGGCGGACGCCGAACCACTGGAAATCGTTCTGCACATTCCAATTCTTTGCGAAGATAAGAATGTGCCTTATGTGTTTGTCAGATCCAAACAAGCTTTGGGTCGAGCCTGTGGAGTGTCCCGGCCGATAGTGGCGTGTTCCATCACTATCAATGAGGGATCACAACTGAAGCCGCAGATCCAAAGTATTCAGCAAGAGATAGAGAGACTCTTAGTGTGA

Protein sequence:

>DPOGS210363-PA
MEVIVANVDHIKFDIDYALEQQYGALPLPFPGMDKSTAAVCEFYSQPGGCGNGPQCPYRHVRGDRTVVCKHWLRGLCKKGDQCEFLHEYDMTKMPECYFYARFNACHNKECPFLHIDPESKIKDCPWYDRGFCRHGPHCRHRHVRRVLCINYLAGFCPDGANCKYMHPRFELPAPPEQTKDAKRLPVCHYCSEVGHKASTCHKIPPDQREVAQKQEEARYRALGYVKPAVDGEELRLQRLIHKPLEEVTCFKCGTKGHYANKCPKGHLAFLSNQPPPGNPNAESEASVNPKAYPLADTALTAKILNLVQQAANYKQLRKGANEATKTLNRGLSEFVIMAADAEPLEIVLHIPILCEDKNVPYVFVRSKQALGRACGVSRPIVACSITINEGSQLKPQIQSIQQEIERLLV-
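
The transcript and protein lengths listed in this record can be cross-checked: 410 amino acids plus one stop codon is 411 codons, or 1233 bp. The following is a minimal Python sketch of that check, assuming the standard genetic code applies and that the trailing "-" in the protein sequence marks the stop codon. The names `translate` and `expected_cds_length` are hypothetical helpers written for this illustration, not part of any OGS2.0 tooling, and the sketch writes the stop as `*` rather than `-`.

```python
from itertools import product

# Standard genetic code, listed in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame; stop codons become '*'."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def expected_cds_length(protein_length: int) -> int:
    """Length in bp of a CDS encoding protein_length residues plus one stop codon."""
    return 3 * (protein_length + 1)


# First 30 nt and last 9 nt of DPOGS210363-TA, copied from the record above.
print(translate("ATGGAAGTTATAGTGGCAAACGTGGATCAT"))
print(translate("TTAGTGTGA"))
print(expected_cds_length(410))
```

Passing the full nucleotide sequence to `translate` and comparing the result with the protein sequence (after replacing the final `*` with `-`) would extend the same check to the whole record.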