Monarch geneset OGS2.0

DPOGS201341
Transcript        DPOGS201341-TA  2991 bp
Protein           DPOGS201341-PA  996 aa
Genomic position  DPSCF300176 + 717358-721804
RNAseq coverage   489x (Rank: top 25%)
Annotation
Heliconius      HMEL012392        0.0  92.50%
Bombyx          BGIBMGA003125-TA  0.0  73.29%
Drosophila      Upf2-PA           0.0  42.19%
EBI UniRef50    UniRef50_F4X3S5   0.0  54.51%  Regulator of nonsense transcripts 2 n=5 Tax=Endopterygota RepID=F4X3S5_ACREC
NCBI RefSeq     XP_396597.2       0.0  55.74%  PREDICTED: similar to UPF2 regulator of nonsense transcripts homolog [Apis mellifera]
NCBI nr blastp  gi|328778114      0.0  54.93%  PREDICTED: regulator of nonsense transcripts 2 [Apis mellifera]
NCBI nr blastx  gi|322792405      0.0  55.80%  hypothetical protein SINV_11858 [Solenopsis invicta]
Group
Gene Ontology    GO:0016070  3.6e-74  RNA metabolic process
                 GO:0005488  3.3e-65  binding
                 GO:0005515  7.8e-35  protein binding
KEGG pathway 
InterPro domain  [524-779]  IPR016021  3.6e-74  MIF4-like, type 1/2/3
                 [525-772]  IPR016024  3.3e-65  Armadillo-type fold
                 [325-514]  IPR003890  7.8e-35  MIF4G-like, type 3
                 [783-943]  IPR007193  2.2e-21  Up-frameshift suppressor 2
Orthology group  MCL14441  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201341-TA
ATGTCAGATATATCAGCTGCAATCACTCTGTGTTCTACATTGCATCAAACATATTCGGAATTTAGTTCGTTCTTTTTTGAAAACTGGCAAAAGATACTATCCTTCAAAGCAACAGATAAGATAACCAATTCTTCAAAATTAAGAGTGGACTTAAGATTCTATGCAGAACTTATTGCTGTTGGTATATTTACCAACAAGACTGGTTTACCACTGTTGGGAATTGTGCTCACAGTTTTAATCAATATGGATAAAGAAGAGCATAACAATATTCCAATTTTGTTGTCATTTTGTAAACATTGTGGTGAAGACTATGCAGGATTAGTTCCAAAAAAAATTAAGGATACTGCTAAGAAATTTAATGTGACAGTACCAAGAAATACTTTTATACCAGCGGAGAAGCAAACAGCTGTGAGAAGTTTATTGAAAGACTATTTCTTATCATTGACAAAGCATTTACTGGCTGAACGAGCTCAGCTTCAGGCTCTGCATGCAGCCAATCAGAGAACCCTTCATACAAGAGGAGAGTTATCACAAGAGAGGAAAGACCAATTGGAACAACATCAGGCCACATATGATAAACTACTTGTAGGAGCTCAGAACTTTGCTGAAGTGTTAGGAGAAGACCTAGGTGAGGCTGGAGAACCTTTGACTTTGTCCATGACAGTTATTGAGACTCAGGGTACAGTAACCATTGGTGGGAATGATGAGATCATTATGCAAGCTGGTACTGACCCTTGGCAGGACGAAGACACGAGAACTTTCTATACCAGTCTTCCCGATTTGAAAGTTTTCATGCCTAATTACCAATTGAAAGAGGCTGTGAAAAATAAAACGGAAACTGTTACCGAAGAGATGTTAGACGAGGATCTCAAAGAGGATGAGCTCAGTGACAATGAGGAACCGGCACCTGTTGTTGCTGATGTGGAGCAAGAAGAGGCACAACCAGCTAATGTGTCTAATAAATACGCCCTTGATGCTTTCTTAAATGAATTGCCAAATTGTATCAATAGAGAGTTAATTGACAATGCGGCTGTAGACTTTGTTTTGAATTTGAACACTAAAAATAATAGAAAAAAATTAACACGGGTCCTATTTAGTGTTGCCAGAACAAGATTAGATCTATTACCATTTTACTCGAGATTCGCGTCCATACTGTATCCAGTTTTACCCGACGTGTGTGTTGATTTGTGTCAAATGTTAAAACAAGATTTTAAGTATCATGTCAGGAAGAAGGATCAAATTAACATTGAATCAAAGATAAAAGTGGTGAGGTTTATTGGAGAACTTGTTAAATTTGGTCTCTACTCCAAAATGGAAGCTTTGTACTGTCTGAAAGTTCTGTTACACGATTTTAAACATCATCACATTGAAATGGCTTGTAACTTATTGGAGACTTGCGGAAGGTATTTGTACTGCAATCCTGATACACACCAAAGAACGATGATATATTTACAGCAGATGATGAGAAAAAAGACTGTTTCTGCTCTCGATTCACGTTACGTGACCCAAATCGAAAATGCATTTTATTACGTATGTCCACCCGAAGCACCGGCACAACCGAAAGAGGAAGAGCCTCCCATGCACCAGTTCATTAGAAAAATTCTTCACGAAGATCTACAAAAAAGTAACGAAGAAAAAATTTTGAGGCTTATGAGGAAACTTAATTGGGATGACCCTGAAGTAGCGGCAGTGGCAATCCAACATCTGGCTGGCGGGTGGAGAGTCAGGGCGAGTGCGAGAAGGGCATTGGCTCGCTTAACAGCTGAACTGGCTGCCTGGCAAGAAAACGTTGCCCCCGCTGTTGTTGACACCATACTGGAGGAAATTAGAGTTACTATGGAAGACCCTCATCCAAAGTACAATCAGAGGAGAATAGCTAGTGTCCGATATCTTGGAGAACTCTATAATTACAAGCTCCTGGATTCCCGAGACGTTTTCACGGTTCTCTACTCTTTTATTACATTCGGTGTATCGAACGACCATTCTAACGTATCTCCACT
AGATCCGTCCGACAATGTCTTCAGGATAAGATTAGTTTGTGCTCTACTAGAGACTTGTGGCGCATATTTTAATAGTGGATCTAGTAAGAAACGACTGGATTACTTTTTGGTTTTCTTCCAAAATTACTATTGGTTTAAATACAGTGATCCTTACTGGACCGAGGAGAATAAATTTCCGATATACGTCAAATACATATATCAGGAATGTTTGAGCAGTTTGCGGCCCAAACTGACATTGTTTACTAGCTGGCAACAGTGTAAGGACGCTATAGAGGAGATAAGACAGACATTATACCCGGATTTGGGGGAAGACGAACACTTTGACAATGATGACCAGGGCGAGGATAGTGTTGCTGATGGTTTAGACACCATCATAGAGACGGATGATGAAACAGATAATCCACACATGCCAGAAGAAAGCTCTGACGAAGACCCCATCACGGAAAGTGCTGGAAATGACGAGAACGACGTGCAGACAGAAGACCTTCCCATCGAGCCGAGGCGTCCAGCTGTAAAACCTGTGGAAGATGTGGAGTTTGAATCGGCATTCGAGAAGATGGTTATGGAGAACATTGCGGAAAGACAACGTGAGAATAGACCACAGCAAAGAGATATAGCTGTGCCAATGACATGTAGACAAACTACTAAAAAAACTTATGAACAGTTACTGCAAGGTAAGGAAGGAGTAGAATTTGTGTTGATGGTGAGAAAAGGTATGAAACCACAGTACAAGTCGTTCAACGCGCCACCGGAGCTCGCGAGCAATTTACAACAACAAGCCCTAGCGGATAAACAGGAAATGGAAAGAGTTAAACGTTTAACATTAAACATTTCTGAACGCCAAGAAGAGGAAGAATATAGCGCGGAGAGTGGGGGAGGTTCTGGAGGAGGTGGCAACCCCAATAGAGGGCAGCACGTTCGACAAAAGTATCAACACCCTAAAGGGGCACCGGATGCAGATCTTATATTTGGACCTAAGAAATTCAAATAA

Protein sequence:

>DPOGS201341-PA
MSDISAAITLCSTLHQTYSEFSSFFFENWQKILSFKATDKITNSSKLRVDLRFYAELIAVGIFTNKTGLPLLGIVLTVLINMDKEEHNNIPILLSFCKHCGEDYAGLVPKKIKDTAKKFNVTVPRNTFIPAEKQTAVRSLLKDYFLSLTKHLLAERAQLQALHAANQRTLHTRGELSQERKDQLEQHQATYDKLLVGAQNFAEVLGEDLGEAGEPLTLSMTVIETQGTVTIGGNDEIIMQAGTDPWQDEDTRTFYTSLPDLKVFMPNYQLKEAVKNKTETVTEEMLDEDLKEDELSDNEEPAPVVADVEQEEAQPANVSNKYALDAFLNELPNCINRELIDNAAVDFVLNLNTKNNRKKLTRVLFSVARTRLDLLPFYSRFASILYPVLPDVCVDLCQMLKQDFKYHVRKKDQINIESKIKVVRFIGELVKFGLYSKMEALYCLKVLLHDFKHHHIEMACNLLETCGRYLYCNPDTHQRTMIYLQQMMRKKTVSALDSRYVTQIENAFYYVCPPEAPAQPKEEEPPMHQFIRKILHEDLQKSNEEKILRLMRKLNWDDPEVAAVAIQHLAGGWRVRASARRALARLTAELAAWQENVAPAVVDTILEEIRVTMEDPHPKYNQRRIASVRYLGELYNYKLLDSRDVFTVLYSFITFGVSNDHSNVSPLDPSDNVFRIRLVCALLETCGAYFNSGSSKKRLDYFLVFFQNYYWFKYSDPYWTEENKFPIYVKYIYQECLSSLRPKLTLFTSWQQCKDAIEEIRQTLYPDLGEDEHFDNDDQGEDSVADGLDTIIETDDETDNPHMPEESSDEDPITESAGNDENDVQTEDLPIEPRRPAVKPVEDVEFESAFEKMVMENIAERQRENRPQQRDIAVPMTCRQTTKKTYEQLLQGKEGVEFVLMVRKGMKPQYKSFNAPPELASNLQQQALADKQEMERVKRLTLNISERQEEEEYSAESGGGSGGGGNPNRGQHVRQKYQHPKGAPDADLIFGPKKFK-
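
The transcript and protein lengths in this record are consistent with a complete coding sequence: 2991 bp = 3 x (996 aa + 1 stop codon). The following is a minimal sketch of how that relationship, and the translation itself, could be checked. It assumes the standard genetic code and follows this record's convention of writing the stop codon as "-". The function names `translate` and `lengths_consistent` are illustrative and are not part of any database tooling.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop="-"):
    """Translate a coding sequence in frame 1.

    Whitespace and line breaks are ignored, unknown codons become 'X',
    and stop codons are written with the `stop` symbol.
    """
    cds = "".join(cds.split()).upper()
    usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
    residues = []
    for i in range(0, usable, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        residues.append(stop if aa == "*" else aa)
    return "".join(residues)


def lengths_consistent(transcript_bp, protein_aa):
    """True if the transcript is exactly the protein's codons plus one stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)


if __name__ == "__main__":
    # First seven codons of DPOGS201341-TA, copied from the record above.
    print(translate("ATGTCAGATATATCAGCTGCA"))
    print(lengths_consistent(2991, 996))
```

To check the full record, pass the complete DPOGS201341-TA sequence (with its line break) to `translate` and compare the result against DPOGS201341-PA.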