Monarch geneset OGS2.0

DPOGS211215
Transcript        DPOGS211215-TA  2667 bp
Protein           DPOGS211215-PA  888 aa
Genomic position  DPSCF300007 + 1098113-1101721
RNAseq coverage   167x (Rank: top 51%)
Annotation
Heliconius        HMEL012473              0.0  94.13%
Bombyx            BGIBMGA003195-TA        0.0  90.41%
Drosophila        l(2)37Cb-PA             0.0  80.45%
EBI UniRef50      UniRef50_UPI00022CA620  0.0  78.93%  UPI00022CA620 related cluster n=3 Tax=unknown RepID=UPI00022CA620
NCBI RefSeq       XP_971279.1             0.0  82.73%  PREDICTED: similar to pre-mRNA-splicing factor ATP-dependent RNA helicase prp22 [Tribolium castaneum]
NCBI nr blastp    gi|91082873             0.0  82.73%  PREDICTED: similar to pre-mRNA-splicing factor ATP-dependent RNA helicase prp22 [Tribolium castaneum]
NCBI nr blastx    gi|91082873             0.0  81.90%  PREDICTED: similar to pre-mRNA-splicing factor ATP-dependent RNA helicase prp22 [Tribolium castaneum]
Group
Gene Ontology     GO:0004386  1.2e-35  helicase activity
                  GO:0005524  9.3e-18  ATP binding
                  GO:0003676  9.3e-18  nucleic acid binding
                  GO:0008026  1e-06    ATP-dependent helicase activity
KEGG pathway      tca:659920  0.0
                  K12813 (DHX16) maps-> Spliceosome
InterPro domain   [638-729]  IPR007502  1.2e-35  Helicase-associated domain
                  [244-429]  IPR014001  7.7e-34  DEAD-like helicase
                  [764-863]  IPR011709  3.7e-32  Domain of unknown function DUF1605
                  [477-577]  IPR001650  9.3e-18  Helicase, C-terminal
                  [253-405]  IPR011545  1e-06    DNA/RNA helicase, DEAD/DEAH box type, N-terminal
Orthology group   MCL10030  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS211215-TA
ATGGAAAAACGTAGTAAGCGACATCACACTCCCTCGCCATCGGACTCGGACAGTGCTGAAGAAACTCGGGCTAAAGATATAAAAGAGCGAGATGAGTTCGCCAAAAGGCTACGAAAGAGAGATGAAGAAAAAGTTAAGAAAGTTTCTGAAAGTTCCAACAAACGGTCTTATGAAGAAGCAGCAAAGAGGTTAAAGTTAGAGGCGGAAGACAGAGACAAAATATTACCAAAACTTCGTATTCAATCCAGAAGAAAATACCTACAGAAAAGAAAAGATGATAAAGTTATTGAACTAGAAGATGATATTGCCGATGATGAATATCTTTTTGATGAGAGCATATTGACAGAGAGAGAGAAAAGGGAAAGGGAACATAAGAAAACTCTGTTACAGTTAGCTAAGGAACATGAAAAAGCTCGGGAACTTGAAAATGTTCAAAGATATCACATGCCTCAAGATCTGGGAAAAGGGGAAAAAGGAGAATATATTGAAGTTGATGAAAATGAAAAATTGCCAAATTCAGAACAAAGAAAATGGGAACAGGAACAAATCAAATCTGCCTTTTTTAAATTTGGTGCAAAGGATGCAAAGGCACAAGATGAATATGAATTGCTTTTGGACGAACAAATAGATTTCATTCAAGCTCTACAATTAGAAGGCAACCAAGAAAAAAAAGATGAGGAAAAGATATCAGAATATAAAAAAGCAAGATTGACCATTGAAGAGACAAAGAAATCACTACCAGTCTTTCCATTCAGAGATTCTTTGATAGAAGCTATAAAAAACTACCAAATATTAATTGTGGAAGGTGAAACAGGTTCCGGTAAGACCACTCAAATCCCTCAATATTTGCATGAGGCTGGATTCACTGATGATGGCAAGAAGATTGGTTGCACTCAACCCCGAAGAGTAGCAGCAATGTCTGTGGCGGCCAGGGTTGCCCAAGAAATGAATGTTAAATTAGGCAATGAAGTTGGTTATAGCATTAGATTTGAAGACTGTACCTCAGACAGAACTGTGATTAAATATATGACTGATGGCACCTTGCACAGAGAATTTCTATCAGAACCGGACTTGGCATCCTATAGTGTAATGATTATAGATGAAGCTCATGAAAGAACGTTACATACTGACATTCTTTTCGGTCTCGTAAAAGATATTACTAGATTTCGACCAGATCTTAAGTTATTAATATCCAGCGCTACGTTAGACGCCGAAAAGTTTTCTACGTTTTTCGACGATGCGCCGATATTTAGAATCCCCGGTCGAAGGTTTCCAGTGCATATATATTACACAAAAGCACCAGAAGCCGATTATATCGATGCCTGTGTTGTTACAGTTTTACAGATACATGCCACTCAACCGCTTGGCGATATATTAGTGTTTCTCACTGGACAAGAAGAAATAGAAACTTGCGTCGAAATGTTACAAGAGAGAACTAAGAAAATAGGAAAGAAATTAAAAGAGCTCATCATTTTACCCGTATATGCAAATCTACCCACTGACATGCAAGCAAAAATATTTGAACCAACTCCCGAAGGAGCTAGAAAAGTTGTTCTAGCCACTAATATCGCGGAGACATCTCTTACGATTGATAACATTATATATGTTATTGATCCAGGATTTGCTAAACAAAATAATTTCAATTCCAAAACTGGAATGGAAAGCTTGATGGTGGTACCAATATCTAAAGCATCAGCCAATCAAAGGGCTGGCAGGGCTGGGAGGGTAGCTGCTGGTAAATGCTTCAGACTGTACACGGCTTGGGCATACAAACATGAACTGGAAGATAATACTGTACCGGAAATTCAAAGGATAAACTTAGGAAATGCAGTGTTAACATTGAAAGCATTGGGCATTAACGATCTGATTCATTTCGATTTTCTAGATCCACCGCCACATGAGACTTTAGTGTTAGCTTTGGAGCAGTTATATGCTTTAGGAGCCCTAAATCATCATGGCGAATTGACAAAGGCTGGACGGAGAATGGCAGAATTTCCAAC
AGATCCAATGTTAGCGAAAATGTTACTTGCTAGTGAAAAGTACAAATGTTCCGAAGAAATCGTATCAATTGCGGCTATGTTGTCTGTAAATAGCTCTGTATTTTATAGACCAAAGGATAAAATAATACATGCCGACACAGCCAGGAAGAATTTTTTCCACCGTCATGGAGATCACTTAACCATAATGAACGTTTACAATCAGTGGGCTGACTCAGATTACTCCGTCCAATGGTGTTATGAGAATTTCATACAATATAGGTCGATGAAACGTGCTCGCGACGTCCGCGAGCAGCTGGTGGGTCTAATGGAAAGGGTTGAAATAGATATGGTATCAAGTATATCTGATGACACCAACATCCGCAAAGCCATCACTGCTGGATATTTCTACCATATTGCCAAATTCTCTAAAGGTGGCCATTACAAAACAGTAAAACATAACCAAACTGTTATGATACATCCAAACAGTGCTTTATTTGAAGAGCTACCAAGGTGGGTCATATACCATGAGCTGGTGTTCACTTCCAAGGAATTTATGCGACAAGTTACAGAAATTGAAAGCAAATGGTTACTAGAAGTGGCGCCCCATTATTATAAGTCTAAAGAATTAGAGGATTCTACAAATAAAAAAATGCCAAAAACAATTGGCAAATCTGCAAATAATTTTTAA

Protein sequence:

>DPOGS211215-PA
MEKRSKRHHTPSPSDSDSAEETRAKDIKERDEFAKRLRKRDEEKVKKVSESSNKRSYEEAAKRLKLEAEDRDKILPKLRIQSRRKYLQKRKDDKVIELEDDIADDEYLFDESILTEREKREREHKKTLLQLAKEHEKARELENVQRYHMPQDLGKGEKGEYIEVDENEKLPNSEQRKWEQEQIKSAFFKFGAKDAKAQDEYELLLDEQIDFIQALQLEGNQEKKDEEKISEYKKARLTIEETKKSLPVFPFRDSLIEAIKNYQILIVEGETGSGKTTQIPQYLHEAGFTDDGKKIGCTQPRRVAAMSVAARVAQEMNVKLGNEVGYSIRFEDCTSDRTVIKYMTDGTLHREFLSEPDLASYSVMIIDEAHERTLHTDILFGLVKDITRFRPDLKLLISSATLDAEKFSTFFDDAPIFRIPGRRFPVHIYYTKAPEADYIDACVVTVLQIHATQPLGDILVFLTGQEEIETCVEMLQERTKKIGKKLKELIILPVYANLPTDMQAKIFEPTPEGARKVVLATNIAETSLTIDNIIYVIDPGFAKQNNFNSKTGMESLMVVPISKASANQRAGRAGRVAAGKCFRLYTAWAYKHELEDNTVPEIQRINLGNAVLTLKALGINDLIHFDFLDPPPHETLVLALEQLYALGALNHHGELTKAGRRMAEFPTDPMLAKMLLASEKYKCSEEIVSIAAMLSVNSSVFYRPKDKIIHADTARKNFFHRHGDHLTIMNVYNQWADSDYSVQWCYENFIQYRSMKRARDVREQLVGLMERVEIDMVSSISDDTNIRKAITAGYFYHIAKFSKGGHYKTVKHNQTVMIHPNSALFEELPRWVIYHELVFTSKEFMRQVTEIESKWLLEVAPHYYKSKELEDSTNKKMPKTIGKSANNF-
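
Consistency check (illustrative):

The length fields in this record can be cross-checked against each other and against the two sequences. The sketch below is not part of the original record. It assumes that the transcript entry holds only the coding sequence, that the standard genetic code applies, and that the genomic coordinates are 1-based and inclusive. The function names `translate` and `length_summary` are hypothetical helpers. The code marks the stop codon as `*`, whereas the record writes it as `-`.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), listed in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt):
    """Translate a nucleotide string in frame 1; '*' marks a stop codon."""
    usable = len(nt) - len(nt) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


def length_summary(transcript_bp, protein_aa, genomic_start, genomic_end):
    """Compare the record's length fields (coordinates assumed 1-based, inclusive)."""
    codons, remainder = divmod(transcript_bp, 3)
    genomic_span = genomic_end - genomic_start + 1
    return {
        "in_frame": remainder == 0,
        "codons": codons,
        "protein_plus_stop": codons == protein_aa + 1,
        "genomic_span_bp": genomic_span,
        "span_minus_transcript_bp": genomic_span - transcript_bp,
    }


# Values copied from the record above.
summary = length_summary(2667, 888, 1098113, 1101721)

# First 30 and last 15 nucleotides of DPOGS211215-TA.
first_aa = translate("ATGGAAAAACGTAGTAAGCGACATCACACT")
last_aa = translate("GCAAATAATTTTTAA")

print(summary)
print(first_aa, last_aa)
```

Under these assumptions, 2667 bp corresponds to 889 codons, that is 888 residues plus one stop codon, and the genomic interval is 942 bp longer than the transcript. The translated ends agree with the start (`MEKRSKRHHT`) and end (`ANNF`) of DPOGS211215-PA.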