Monarch geneset OGS2.0

DPOGS214059
Transcript: DPOGS214059-TA, 2922 bp
Protein: DPOGS214059-PA, 973 aa
Genomic position: DPSCF300171 - 60758-70426
RNAseq coverage: 110x (Rank: top 59%)
Annotation
Heliconius      HMEL011086        1e-112  36.31%
Bombyx          BGIBMGA012404-TA  1e-152  48.06%
Drosophila      mus309-PA         5e-173  42.51%
EBI UniRef50    UniRef50_B0WJH9   0.0     43.75%  ATP-dependent DNA helicase hus2 n=3 Tax=cellular organisms RepID=B0WJH9_CULQU
NCBI RefSeq     XP_001848863.1    0.0     43.75%  ATP-dependent DNA helicase hus2 [Culex quinquefasciatus]
NCBI nr blastp  gi|170042287      2e-180  43.75%  ATP-dependent DNA helicase hus2 [Culex quinquefasciatus]
NCBI nr blastx  gi|380011350      1e-178  37.68%  PREDICTED: Bloom syndrome protein homolog [Apis florea]
Group
Gene Ontology   GO:0008026  3e-254   ATP-dependent helicase activity
                GO:0006310  3e-254   DNA recombination
                GO:0005524  1.9e-22  ATP binding
                GO:0004386  1.9e-22  helicase activity
                GO:0003676  1.9e-22  nucleic acid binding
                GO:0003824  3.6e-11  catalytic activity
                GO:0000166  3.6e-11  nucleotide binding
                GO:0044237  3.6e-11  cellular metabolic process
                GO:0005622  7e-10    intracellular
KEGG pathway    cqu:CpipJ_CPIJ007633  0.0
                K10901 (BLM, RECQL3, SGS1) maps-> Homologous recombination
InterPro domain [107-953]  IPR004589  3e-254   DNA helicase, ATP-dependent, RecQ type
                [314-517]  IPR014001  1.2e-27  DEAD-like helicase
                [552-633]  IPR001650  1.9e-22  Helicase, C-terminal
                [320-487]  IPR011545  9e-21    DNA/RNA helicase, DEAD/DEAH box type, N-terminal
                [816-892]  IPR010997  3.6e-11  HRDC-like
                [820-884]  IPR002121  7e-10    Helicase/RNase D C-terminal, HRDC domain
Orthology group MCL11057  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214059-TA
ATGGATAACAACCCATTAACAGACTTATCAGATTTTAACAAAAAAATACTTTCACATCCTCTCTATTTAAAGATTAGAGAAGGCAGGGTGTACACTTTTAATGAAGCAAATGAATTCAAAAAACTGTATATAGAAGTATTAGAGAAATTGAGCGATGTGTTATATGTTTTAATCAACAAATTACCTGATTATGAAAGAAAAAGCTACACTTCAATTTTTACTGTGAAAGAGAAACTCCGAAATATTTCAGCACAAGAAAATGTTGATTTTAATGACAAGTTATCACCTGAGAGTCGGAACATTTTGGACTGTATAGATGATATGACACTCGCACCTAGAAATATAACAGAAAATAAGGAGAATGTGCAGAACAATAGTGATGTGAAAATTCAGACGACTATATTACCTGATTCACTAAACCATGAGCAGAAAATCACAAAAAACTGTACTGAATCATATAATTCGGATGATATCGATCGAACCAACACTATAGACTACCAGACTGAGACAACCAGACACATTGAGACTGAAGCTTTAAATGAATCAAAAGATTTAAACACCACCACCATTAACCAAGCCAACCCAGAGTCCAGAGTAAATAATCCTAACAATATATGTAACAATTCAGATATAGATTTCGATCACTTTGAAGAATTTGAGGACATTAATTTCCAAGACGATTGGTCTGATCATTTCAAAGAACCTTTGGAAATAGGAGATAGTAGTTTGTCGAACAATGAAGAGTCAAATACAACAATTAATGTAAACTATGAAGATTCTATTGTAACATCCAAGCAACCTTTGCCTGTACCAAGAAATATTCATAACAACAGTATAGGGGACTTCGGTAATACTAGAAATGATGGAGTCACAGGTGAATTTGACGGTGATGATTATCCGCACTCGATCCCTATGATGGAGACTTTGAAAGAGAAATTTGGTCTCACATCTTTCAGACCAAATCAGAAGCAAGTCATTAACGCAACGCTTTTAGGCCACGACTGTTTTGTTCTCATGCCGACCGGCGGGGGAAAATCACTATGCTACCAGCTACCAGCGATACTGACCCCTGGCGTCACCATAGTCATATCACCACTGAGATCATTGATGTTGGATCAAGTCAACAAGTTATTAGCGCTGGATATTCCGGCCGCACATTTAGGCAGTGACGTCACTGAGGCGAAGAGCAATTACGTGTATGACGATCTCAACCAACAGGAGCCCACTATCAAGCTCCTCTACGTCACACCCGAAAAAATACAATCGTCTCCAAAATTTCAAGAAACTTTAACGAGACTCTACGAGAAGCAAAAGATTTCCAGGTTCGTGATCGACGAAGCTCACTGTGTGTCGCAATGGGGTCACGACTTCCGTCCGGATTATCAAAAATTGAATTTACTAAGGAAAAAGTTCCCTAATGTGACGCTAATGGCATTAACTGCGACGGCCACGAAGCGAGTGCGCACCGACATATTGTATCAGCTGAAGGTGCGTGAATGCAAATGGTTCCTGAGCAGTTTCAACCGTCCAAATCTCACGTACACGATTTTAACGAAGAAACAGAATCTGATAAATAAAGATATAGCTGAATTGATCAAGACGAAATATTATCAACAGTGCGGTATCGTGTATTGCCTGTCGTGCAAGGACTGCGACAGCATGGCTAACGCTTTAAAAGAGATGAAAGTATCGTCTAAGGCGTACCACGGAAAACTCGTAGACGCCGAGAGAGTCAACGTACAGACGCAGTGGCTCGCAGGAACCATTAAAGTGATCTGTGCAACGCTTGCTTTCGGAATGGGCGTGGACAAAGGCGACGTCAGATTCGTCATCCACCACAGTGTACCGAAGTCTATAGAAGCATACTACCAAGAGACGGGTCGGGCTGGGCGGGACGGCAAACCGGCTGATTGTATCTTATTCTACTGTTACAGAGACATTGTGCGCCAACGCAACCTCTATTATCGTGATAGAAGTTTGACGGAGAACTCGAAGAACGT
CCACGATGATAACCTTACACGTATCAATGAGCTGTGTGAGGACGTGCTTGAATGTCGACGGACCTTCGTACTGAGATACTTGGGGGAGACATTCCAAAGCGACAACTGCGGTCCCATGCTCTGCGACAACTGTCAACGAAGACCGACAAACGAGTTTATAGACGTGACGGACGTGTGTCGGGAGATCTATTTTCTCAGATGCAATCAGTGGGGGAAGGGTGACGCGGTCCGGCTGTTACAACTCTTGCTAATGAAGAAAATCCTCGCTGAGAAAACCCGTATGAACAAGGACATTGCCAATAATTACCTAATTCGTGGAGCTGATGTATATAAGCTGTCGTCGAAGTCTGAGCCGATTATATTCTACAAGCGCCCGCACACTACTAAGAAGCCGGCGACAGCTGTAGCAGCGCCGCTCGTACAAGACGTTGACCAACAAATTAAACAAGTCGAGGATAAGGCTTACGAAGAGCTCGTTGAGGAAATAAAGAACATAGCAAAGGAGTCAGACGTGGCGCTTTGGACGCTGTACCCACAGATGGCGCTACGTTATATGGCGGAGAAGCTACCGGAAACGGCGGAGGAAATGCTCAAGATACCTCACGTTACCAACGCCAACTATAATAAGTATGGCTTCCGGTTGCTACCCATCACCCTCAAGTACTCCATGGAGAGGCTTAAACTCGAAATGACTTTGCAAGACCAGGAAATTAGCGAAGCCTTCGACGACGAGGAGCCCTCCGCGGGACCCTCCACGGTCAGCCCTCGAGTGTCCTTCAGAAATAATAGGAGCTACCGATCAAGAAAATCGAAATCGGCCAGAGCTGGTGTGAAAAAACCATATAAAAAAGACAATAAAGCGTTCAAGAAGTTCAAAAAAGGTGCAACAGGAACAATGCCCCGGCCGGGGACATTTTTATAA

Protein sequence:

>DPOGS214059-PA
MDNNPLTDLSDFNKKILSHPLYLKIREGRVYTFNEANEFKKLYIEVLEKLSDVLYVLINKLPDYERKSYTSIFTVKEKLRNISAQENVDFNDKLSPESRNILDCIDDMTLAPRNITENKENVQNNSDVKIQTTILPDSLNHEQKITKNCTESYNSDDIDRTNTIDYQTETTRHIETEALNESKDLNTTTINQANPESRVNNPNNICNNSDIDFDHFEEFEDINFQDDWSDHFKEPLEIGDSSLSNNEESNTTINVNYEDSIVTSKQPLPVPRNIHNNSIGDFGNTRNDGVTGEFDGDDYPHSIPMMETLKEKFGLTSFRPNQKQVINATLLGHDCFVLMPTGGGKSLCYQLPAILTPGVTIVISPLRSLMLDQVNKLLALDIPAAHLGSDVTEAKSNYVYDDLNQQEPTIKLLYVTPEKIQSSPKFQETLTRLYEKQKISRFVIDEAHCVSQWGHDFRPDYQKLNLLRKKFPNVTLMALTATATKRVRTDILYQLKVRECKWFLSSFNRPNLTYTILTKKQNLINKDIAELIKTKYYQQCGIVYCLSCKDCDSMANALKEMKVSSKAYHGKLVDAERVNVQTQWLAGTIKVICATLAFGMGVDKGDVRFVIHHSVPKSIEAYYQETGRAGRDGKPADCILFYCYRDIVRQRNLYYRDRSLTENSKNVHDDNLTRINELCEDVLECRRTFVLRYLGETFQSDNCGPMLCDNCQRRPTNEFIDVTDVCREIYFLRCNQWGKGDAVRLLQLLLMKKILAEKTRMNKDIANNYLIRGADVYKLSSKSEPIIFYKRPHTTKKPATAVAAPLVQDVDQQIKQVEDKAYEELVEEIKNIAKESDVALWTLYPQMALRYMAEKLPETAEEMLKIPHVTNANYNKYGFRLLPITLKYSMERLKLEMTLQDQEISEAFDDEEPSAGPSTVSPRVSFRNNRSYRSRKSKSARAGVKKPYKKDNKAFKKFKKGATGTMPRPGTFL-
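
The lengths reported in this record agree with each other: 973 residues plus one stop codon make 974 codons, or 2922 bp. The following is a minimal sketch of how the transcript could be checked against the protein record. It is not part of any database tooling. The helper names read_fasta and translate are hypothetical. The sketch assumes the standard nuclear genetic code and that the trailing "-" in the protein record marks the stop codon. The demo input is the first 24 bases of DPOGS214059-TA with a TAA stop appended for illustration, wrapped over two lines the way the full sequence is.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def read_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped lines."""
    parts = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            parts[header] = []
        elif header is not None:
            parts[header].append(line)
    return {name: "".join(chunks) for name, chunks in parts.items()}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence; stops are written as stop_symbol.

    Codons containing characters other than A, C, G, T become "X".
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# Illustrative input only: start of the transcript plus an appended stop codon.
demo = ">demo-TA\nATGGATAACAAC\nCCATTAACAGACTAA\n"
print(translate(read_fasta(demo)["demo-TA"]))  # MDNNPLTD-
```

Running the same two functions on the full nucleotide record and comparing the result with the DPOGS214059-PA string is a quick way to confirm that a copied sequence was not truncated or split at a line wrap.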