Monarch geneset OGS2.0

DPOGS201030
Transcript: DPOGS201030-TA (2703 bp)
Protein: DPOGS201030-PA (900 aa)
Genomic position: DPSCF300147 + 474761-486717
RNAseq coverage: 149x (Rank: top 53%)
Annotation
  Heliconius       HMEL002379        3e-160  58.11%
  Bombyx           BGIBMGA009112-TA  7e-147  63.57%
  Drosophila       Marcal1-PA        1e-96   38.87%
  EBI UniRef50     UniRef50_D6WYV9   1e-109  51.66%  Putative uncharacterized protein n=1 Tax=Tribolium castaneum RepID=D6WYV9_TRICA
  NCBI RefSeq      XP_967843.1       2e-110  51.66%  PREDICTED: similar to conserved hypothetical protein [Tribolium castaneum]
  NCBI nr blastp   gi|91089449       4e-109  51.66%  PREDICTED: similar to conserved hypothetical protein [Tribolium castaneum]
  NCBI nr blastx   gi|91089449       2e-107  51.66%  PREDICTED: similar to conserved hypothetical protein [Tribolium castaneum]
Group
Gene Ontology
  GO:0003677  3.3e-29  DNA binding
  GO:0005524  3.3e-29  ATP binding
  GO:0005634  3.1e-17  nucleus
  GO:0016818  3.1e-17  hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides
  GO:0004386  3.1e-17  helicase activity
  GO:0016568  3.1e-17  chromatin modification
  GO:0003676  3.7e-14  nucleic acid binding
KEGG pathway 
InterPro domain
  [180-399]  IPR000330  3.3e-29  SNF2-related
  [173-352]  IPR014001  4.6e-18  DEAD-like helicase
  [90-146]   IPR010003  3.1e-17  HepA-related
  [676-759]  IPR001650  3.7e-14  Helicase, C-terminal
Orthology group: MCL12515 Single-copy universal gene

Nucleotide sequence:

>DPOGS201030-TA
ATGGAATGTACGAAGGAACAAATTGAACAGAAACGGTTAGCAGCTTTGCAGAAAAGATTATCCAAAAATAATAACCACGTCCCTCATGCGCCCCAATGCACGAAACCGTCAGATCAACCAGAACATTTAAAACAGGGGCAGTCAAAAAACAATTCTGCTCATACTTTTCATCCATACTCAAGACCAAATAGTAGCAAACAATTTACTTCGACGGTTCCAGTATCTAATGTTGTCACAGGATCCGTATATTTGATATCTGAAGACAGATTTGAAGTAAACATGTCAGAATTTTGCCCTCCACTTATAAATATCTTCAAAACTTTAAGATCAAGGTCATATGATTCAAACACTAAGTTGTGGAATTTTTTAATGAGTGATTATGAGAATCTGATGTCCAAAGTTACACCGCTCGCGCCTCACATTGTTATAGGTCCCCTACCAGCATTTGTTTTGAAGATTCTGAATGATCCTCCCATTGATCATAGTGCTGTGGATTTAACACCAATCGAAGCAACACTAAGAAATAAACTGTTACCATTCCAGGAGGAAGGTGTCAGATTTGGAATAGCGCGTAAGGGTCGATGTCTTATAGCTGACGACATGGGACTCGGGAAGACGTTTCAAGCTCTTGCGATAGCTAGTTACTACCGGCATGACTGGCCCTTACTGATTGTCACTACTTCTTCAATGAGAGAAACATGGCAAAACAAAATCAGCGAGCTTCTGCCATCGGTGCCGTTGGTAAACGTGGCTACTCTAACAAGCAACAAAGACGTCAATTTCGTTTCCGATAGACAGGTCGAGGTTGTCATAGTTAGCTATAAGATAATAAGTCTGCACACGGAGCTGCTGAGGCAGAAGAGGTTTGGATTCGTCATAGTTGACGAGTCTCACCACCTGAAGTCACCCAAGGCTCAGTGTACGAGTGCCCTGTTCAAGCTATGTGGCCAGGGTCGTGCAGTACTCCTCAGCGGGACCCCCGCCCTCAGTCGACCCGTCGAACTCTACACCCAGCTGTCTCTACTAGAGCCGAGACTTTTCACTTACACGGAATACGGTAAGAGATATTGCGACGCGAAACAAACAAACTTCGGTTGGGACATGACCGGCAAGTCGAATTTGGCGGAGCTGCTCGTTATATTACAAAGGAGATTCCTCATCCGCCGCACCAAAGAACAGGTCCTCAATCTAGAAGAGAAGACTCGATGTATGTACGTGCCATCACAAGTTGTGATCGCTTTCCTTTCTTCTTCTGGGCTTGATGGGTTCCCAAACAGAGAAACATGGCAAAACAAAATCAGCGAGCTTCTGCCATCGGTGCCGTTGGTAAACGTGGCTACTCTGACAAGCAACAAAGATGTCAATTTCGTTTCCGATAGACAGGTCGAGGTTGTCATAGTTAGCTATAAGATAATAAGTCTGCACACGGAGCTGCTGAGGCAGAAGAGGTTTGGATTCGTCATAGTTGACGAGTCTCACCACCTGAAGTCACCCAAGGCTCAGTGTACGAGTGCCCTGTTCAAGCTATGTGGCCAGGGTCGTGCAGTACTCCTTAGCGGGACCCCCGCCCTCAGTCGACCCGTGGAACTCTACACCCAGCTGTCTCTACTAGAGCCGAGACTTTTCACTTACACGGAATACGGTAAGAGATATTGCGACGCGAAACAAACAAACTTCGGTTGGGACATGACCGGCAAGTCGAATTTGGCGGAGCTGCTCGTTATATTACAAAGGAGGTTCCTCATCCGCCGCACCAAAGAACAGGTCCTCAATCTGGAAGAGAAGACTCGTGAGACTGTTATCTTGGACCAGTCGCTGTTGAATTATTCTAAGGAGGACCAGCACGGGCTCACCCAAATGGCCGAGAAGTTCAGGAACTCTAAATCATCAGAGAGACACGTCGCTATGATACAATACTTTAATGAATCAGCGGCCATAAAGACGCCGGCTGTGTGCAAGTACATCAGACAGTTACTGAGCGGCAGCCAGAAGTTCCTGGT
GTTCGCTCATCATAAGAATGTCATCGACTCCATCTGTGACACGCTGGATGAGCAACGCAAAAACTATATCAGAATCGTCGGATCTACCCCCACACACATACGAACGGAGCTCGTGGACAAGTTCCAACACAGCGAGTCGTGTCGTTGCGCCGTGTTGTCCATCACCGCCGCTAACTCTGGCCTGACACTGACAGCAGCAGACCTCGTCATATTCGCTGAACTACACTGGAACCCCGGCATTTTGATCCAGGCGGAGTCCCGTGCTCATCGCCTGGGTCGCGCAGGCTCCGTGTGTGTGAGGTATCTTCTAGCGCGAGGAACAGCTGACGATCACATGTGGCCGCTGCTACAGACCAAGCTTAATGTACTCAACGATGTGGGTCTGAGTGGAGACAATTTCGAAGACACGAAAATGAAGCACCAGGACACCAAAAACAACATAACACATTACATGTCACCGGTGAGAAACAAAAACGACTACATACAAGGTACGAACATAAAGAAAATATCACAGGAACAAACGGAAAATGTAAAATCGCAGTCGGACTCTCAAACTACCCTCCACGGCATTAGCAAACTAACTTTACTGAGTCCAGAGAAAAATAGTTCAGACTTGACCACTGACGGTGTAGACTCGGATGACAGGTTCTTGGAGAACGACGAGGACGACGAGATCCTGGCCAGTATAGATCTGGATATGTAA

Protein sequence:

>DPOGS201030-PA
MECTKEQIEQKRLAALQKRLSKNNNHVPHAPQCTKPSDQPEHLKQGQSKNNSAHTFHPYSRPNSSKQFTSTVPVSNVVTGSVYLISEDRFEVNMSEFCPPLINIFKTLRSRSYDSNTKLWNFLMSDYENLMSKVTPLAPHIVIGPLPAFVLKILNDPPIDHSAVDLTPIEATLRNKLLPFQEEGVRFGIARKGRCLIADDMGLGKTFQALAIASYYRHDWPLLIVTTSSMRETWQNKISELLPSVPLVNVATLTSNKDVNFVSDRQVEVVIVSYKIISLHTELLRQKRFGFVIVDESHHLKSPKAQCTSALFKLCGQGRAVLLSGTPALSRPVELYTQLSLLEPRLFTYTEYGKRYCDAKQTNFGWDMTGKSNLAELLVILQRRFLIRRTKEQVLNLEEKTRCMYVPSQVVIAFLSSSGLDGFPNRETWQNKISELLPSVPLVNVATLTSNKDVNFVSDRQVEVVIVSYKIISLHTELLRQKRFGFVIVDESHHLKSPKAQCTSALFKLCGQGRAVLLSGTPALSRPVELYTQLSLLEPRLFTYTEYGKRYCDAKQTNFGWDMTGKSNLAELLVILQRRFLIRRTKEQVLNLEEKTRETVILDQSLLNYSKEDQHGLTQMAEKFRNSKSSERHVAMIQYFNESAAIKTPAVCKYIRQLLSGSQKFLVFAHHKNVIDSICDTLDEQRKNYIRIVGSTPTHIRTELVDKFQHSESCRCAVLSITAANSGLTLTAADLVIFAELHWNPGILIQAESRAHRLGRAGSVCVRYLLARGTADDHMWPLLQTKLNVLNDVGLSGDNFEDTKMKHQDTKNNITHYMSPVRNKNDYIQGTNIKKISQEQTENVKSQSDSQTTLHGISKLTLLSPEKNSSDLTTDGVDSDDRFLENDEDDEILASIDLDM-
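
The stated lengths in this record fit a simple relation: a 2703 bp coding sequence is 901 codons, that is, 900 residues plus one stop codon. The sketch below shows one way to check that relation against the FASTA records above and to pull out a residue range such as the InterPro interval [180-399]. It is a minimal illustration, not part of the database: the function names are hypothetical, and it assumes that the trailing "-" in the protein record marks the stop codon and that domain ranges are 1-based and inclusive.

```python
def read_fasta(text):
    """Parse FASTA text into {record_id: sequence}, joining wrapped lines."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the record ID.
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def protein_length(protein):
    """Count residues, ignoring a trailing stop marker ('-' or '*').

    Treating '-' as a stop marker is an assumption about this record.
    """
    return len(protein.rstrip("-*"))


def lengths_consistent(cds, protein):
    """True if the CDS has one codon per residue plus one stop codon."""
    return len(cds) == 3 * (protein_length(protein) + 1)


def domain(protein, start, end):
    """Slice a residue range, assuming 1-based inclusive coordinates."""
    return protein[start - 1:end]


# Toy records (not the real sequences) in the same layout as the page.
toy = ">toy-TA\nATGGAA\nTGTTAA\n\n>toy-PA\nMEC-\n"
toy_records = read_fasta(toy)
toy_ok = lengths_consistent(toy_records["toy-TA"], toy_records["toy-PA"])
```

To apply it to this record, paste the two FASTA blocks into one string, call read_fasta, and pass the DPOGS201030-TA and DPOGS201030-PA entries to lengths_consistent. A False result would suggest that the displayed sequences and the stated lengths disagree.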