Monarch geneset OGS2.0

DPOGS213517
Transcript        DPOGS213517-TA  3192 bp
Protein           DPOGS213517-PA  1063 aa
Genomic position  DPSCF300033 - 879087-889366
RNAseq coverage   321x (Rank: top 36%)
Annotation
Heliconius      HMEL013694        0.0  79.16%
Bombyx          BGIBMGA011797-TA  0.0  74.29%
Drosophila      hay-PA            0.0  73.09%
EBI UniRef50    UniRef50_P19447   0.0  68.21%  TFIIH basal transcription factor complex helicase XPB subunit n=153 Tax=Eukaryota RepID=ERCC3_HUMAN
NCBI RefSeq     XP_002030004.1    0.0  72.89%  GM25215 [Drosophila sechellia]
NCBI nr blastp  gi|195326581      0.0  72.89%  GM25215 [Drosophila sechellia]
NCBI nr blastx  gi|91078404       0.0  75.83%  PREDICTED: similar to rad25/xp-b DNA repair helicase [Tribolium castaneum]
Group
Gene Ontology    GO:0003677  1.9e-286  DNA binding
                 GO:0005524  1.9e-286  ATP binding
                 GO:0006289  1.9e-286  nucleotide-excision repair
                 GO:0004003  1.9e-286  ATP-dependent DNA helicase activity
                 GO:0016787  1.8e-14   hydrolase activity
                 GO:0004386  2.1e-10   helicase activity
                 GO:0003676  2.1e-10   nucleic acid binding
KEGG pathway     dse:Dsec_GM25215  0.0
                 K10843 (ERCC3, XPB)  maps-> Nucleotide excision repair
InterPro domain  [256-757]  IPR001161  1.9e-286  Xeroderma pigmentosum group B protein (XP-B)
                 [290-483]  IPR014001  4.6e-18   DEAD-like helicase
                 [766-855]  IPR018607  1.5e-15   Chromosome transmission fidelity protein 8
                 [294-448]  IPR006935  1.8e-14   UvrABC complex, subunit B
                 [557-625]  IPR001650  2.1e-10   Helicase, C-terminal
Orthology group  MCL11765  Single-copy universal gene

Nucleotide sequence:

>DPOGS213517-TA
ATGGGACCCCCTAAAAAGTTCAGGAAATATGACTCCAAAGGTGGAAGCGACAGATCTGGTAAAAAGAAAAAAGTAGATGAGGAAGTGACAATTGATTTAGTGGATGATGATAACCCTGAAAGTTCTGGAGTGCCGGGAGCGGCCCTGCAGGATGCTGAGAAAAACGACCAGGTCCCTGAAGATGAGTTTGGAGCTAAAGATTATAGAAATCAAATGGAACTCAAACCTGACAATGCCAGTCGTCCGCTGTGGGTGGCTCCTAATGGTCATATATTTCTCGAGTCATTTTCGCCAGTCTATAAACATGCTCATGACTTTTTAATTGCTATCGCTGAGCCAGTGTCAAGGCCTCAGCACATTCATGAGTATAAATTAACAGCGTACAGTTTATATGCAGCAGTTTCTGTGGGTTTGCAAACAAATGACATAATTGAATATTTACAACGTCTCAGTAAGTGTAACGTTCCGGCCGGTATCATAGAATTCATCACACTTTGTACTTTGTCTTATGGCAAAGTTAAGCTTGTACTGAAACATAACAGATATCTAGTGGAGAGTAAGCACGTGGATGTTCTTCAGAAGCTCCTCAAGGATCCCGTGATCCAGCAGTGTAGACTGAGACGAGACGGAGATGAAGAACTGGTTACCTCCGCTCTACCTACTACGGCGCCCGCACCGCCCGGGACCGCAGTTAAGACTGGTCGGGTACTTGGACCATACAAACATTTACCATCCGCATCACAAAACCATCATATGTACTTTCTAAGAGTGATACAAAAACGCTGCATCGAGCTGGAGTATCCGCTGCTGGCGGAGTACGACTTCCGCAACGACGCCGTCAACCCTGACATTAACATTGATCTAAAGCCTACAGCAGTGCTGCGACCCTACCAGGAGAAGAGTCTCAGGAAGATGTTCGGAAACGGCCGAGCGAGGTCAGGTGTGATAGTGTTGCCGTGCGGCGCGGGCAAGTCCTTGGTGGGCGTGACGGCCGTGTGCACGGTCCGCAAGAGGGCGCTGGTGCTGTGCAACTCAGGAGTCTCCGTGGAACAATGGAAACAGCAGTTCAAGTGTTGGTCCACCGCCGACGACAGTATGATATGCAGGTTCACGTCGGAGGCCAAGGACAAGCCGATGGGCGCCGGCATCCTGATCACGACCTACTCCATGATAACGCACGGCCAGCGCCGCTCGTGGGAGGCCGAGCAGACCATGAAGTGGCTACAGGCGCAGGAGTGGGGGCTCGTGGTGCTGGACGAGGTGCACACCATCCCCGCCAAGATGTTCCGCCGGGTGCTCACCATAGTGCACTCACACGCCAAGCTGGGTTTGACGGCGACGCTACTCCGCGAGGACGACAAGATAGCCGACCTGAACTTCCTGATCGGCCCCAAGCTGTACGAAGCCAACTGGTTGGAGCTGCAAGCCAACGGCTACATCGCCAGGGTCCAGTGCGCCGAGGTCTGGTGTCCCATGACGCCCGAGTTCTACCGGGAGTACCTCGTGCAGAAGATCAATAAGAAAATGTTGCTGTATGTAATGAACCCGTCCAAATTCCGCGCCTGCCAGTTCCTCGTCCGCTACCACGAACGCCGCGGGGACAAGACCATAGTGTTCTCGGACAACGTGTTCGCCCTCAGACACTACGCCGTCAAAATGAACAAGCCCTACATCTACGGCCCGACGTCCCAGAACGAGAGGATACAGATCCTTCAAAACTTCAAGTTCAACCCTAAAGTTAACACGATTTTCGTCAGCAAAGTCGCCGACACCAGCTTCGACCTGCCCGAGGCGAACGTTCTCATACAAATCTCCTCGCACGGCGGCTCTAGGAGACAAGAAGCGCAACGTTTGGGTCGTATATTAAGAGCCAAGAAGGGTGCACTAGCGGAGGAGTACAACGCATTTTTCTACACACTAGTATCACAGGATACTTTGGAGATGGCGTACAGTCGCAAGAGACAGCGGTTCCTTGTGAACCAGGGTTACAGTTACAAGGC
ACGTTCGGTTATTACAGAATTGAAGGGCATGGACCAGGAGCCCGATCTGTTGTACGGAACTCGGGAGGAACAAGGGATGCTGCTGCAACAAGTTCTCGCGGCGTCAGAGACGGACTGCGAGGAGGAGAGGGAGGGTGGAGCGGGCGGTGCGGGGAGCGCGGGCGGGGCGAGGCGCACCGCGGGGTCATTGGCCTCGCTGGCGGGCGCCGACGACGCTCTGTACCTGGAACACAGGCGCTCCTCGCACCACAACAAGCACCCACTGTTCAAGAACGAGAACGAGACGGGCGGAATATCAGAGTGGGCGATAGTGGAACTGCAGGGTCTCGTGCAGGTGGAGGGAGACGATCGCGGCGGACCCGCGGTGGTGGGGGACCTGCATTACTTCAAAAGAAACCGACATCCCGTGCTCGTGCTCGGCCACCACGTGCTCACCGGCAAGGAGGTCAAGCTGGAGCAACCTATGGCAGTCATGGAGAAGACTGTGGACGGAGGTCAAACTTCGTACAGAGTCAAGGCGATCGTTAGAAAGAAACTACTCTTTAAATCGAGACCTAAACCCATCATATCAAACGAGGAAAGTGAGCGAAGTCTTCCCCGGTGCGAGGACGGCGAGGCGTGCTCGGTGTTGCTGCGCCGCTACTGGCGCCCCCCGGCCCTGGTGCGGCTCTGCCGGTGCTCCCGACGCACGCGCTGCGATAAGATCGCGTCAGGAGACAGACTGGTGGAACTGAACAACCGATCGGACTTGCAGTTCTGCAATCCGGTCACAGAATGGCCGGAGTGTTCCATCAACGAAGCGCCTCTCAAGATCGAAACAGCGTACGAGCGCATGAGTCCCGATGAGATCGAACTATTGCACCGCCAGAGCATACAGCTCGCGCCGCCGAGGATACGCCTCCGGTGCCTTTGCCCGAAACCGAACTATTGGAAATTAAAAACCGAAGACAGCGACACAAACCTAACGTATCGCTGCTCGTCTCTGCCGCTCTGTAAAACCGGTGACGTCTGCGGGAACGTGGACGACGTCCTGCTATCTCTGTATCAGTCGTGCCTGTGTCCCAAAAACCACATCTGCGTGCACAGCGGCGGAAGAACGCAGATCCAGATCTCGGAGCCGCTGTATCGAGGGAGGGGCTGGCGTGCCCGCTGTCAAGCTCTAAGTGACGAGGATAGCTACGAGGATTACTGA

Protein sequence:

>DPOGS213517-PA
MGPPKKFRKYDSKGGSDRSGKKKKVDEEVTIDLVDDDNPESSGVPGAALQDAEKNDQVPEDEFGAKDYRNQMELKPDNASRPLWVAPNGHIFLESFSPVYKHAHDFLIAIAEPVSRPQHIHEYKLTAYSLYAAVSVGLQTNDIIEYLQRLSKCNVPAGIIEFITLCTLSYGKVKLVLKHNRYLVESKHVDVLQKLLKDPVIQQCRLRRDGDEELVTSALPTTAPAPPGTAVKTGRVLGPYKHLPSASQNHHMYFLRVIQKRCIELEYPLLAEYDFRNDAVNPDINIDLKPTAVLRPYQEKSLRKMFGNGRARSGVIVLPCGAGKSLVGVTAVCTVRKRALVLCNSGVSVEQWKQQFKCWSTADDSMICRFTSEAKDKPMGAGILITTYSMITHGQRRSWEAEQTMKWLQAQEWGLVVLDEVHTIPAKMFRRVLTIVHSHAKLGLTATLLREDDKIADLNFLIGPKLYEANWLELQANGYIARVQCAEVWCPMTPEFYREYLVQKINKKMLLYVMNPSKFRACQFLVRYHERRGDKTIVFSDNVFALRHYAVKMNKPYIYGPTSQNERIQILQNFKFNPKVNTIFVSKVADTSFDLPEANVLIQISSHGGSRRQEAQRLGRILRAKKGALAEEYNAFFYTLVSQDTLEMAYSRKRQRFLVNQGYSYKARSVITELKGMDQEPDLLYGTREEQGMLLQQVLAASETDCEEEREGGAGGAGSAGGARRTAGSLASLAGADDALYLEHRRSSHHNKHPLFKNENETGGISEWAIVELQGLVQVEGDDRGGPAVVGDLHYFKRNRHPVLVLGHHVLTGKEVKLEQPMAVMEKTVDGGQTSYRVKAIVRKKLLFKSRPKPIISNEESERSLPRCEDGEACSVLLRRYWRPPALVRLCRCSRRTRCDKIASGDRLVELNNRSDLQFCNPVTEWPECSINEAPLKIETAYERMSPDEIELLHRQSIQLAPPRIRLRCLCPKPNYWKLKTEDSDTNLTYRCSSLPLCKTGDVCGNVDDVLLSLYQSCLCPKNHICVHSGGRTQIQISEPLYRGRGWRARCQALSDEDSYEDY-
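
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, not part of the original record, assuming that the transcript sequence is pure coding sequence (start codon through stop codon, no UTRs), that the standard genetic code applies, and that the trailing "-" in the protein sequence marks the stop codon. The function names translate and lengths_consistent are hypothetical helpers introduced for illustration.

```python
# Standard genetic code (NCBI translation table 1), with codon positions
# ordered T, C, A, G. Each row covers one first base (T, C, A, G in turn).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol (assumed here to be "-", as in
    the protein record). Raises ValueError if the length is not a multiple
    of three or if a base other than A, C, G, T is present.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        b1, b2, b3 = (BASES.index(base) for base in cds[i:i + 3])
        aa = AMINO_ACIDS[16 * b1 + 4 * b2 + b3]
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


def lengths_consistent(transcript_bp, protein_aa):
    """Check that a CDS length equals three bases per residue plus a stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)
```

With the values from this record, lengths_consistent(3192, 1063) holds, since 3 x (1063 + 1) = 3192. Translating the first 21 bases of DPOGS213517-TA, translate("ATGGGACCCCCTAAAAAGTTC"), gives MGPPKKF, the first seven residues of DPOGS213517-PA.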