Monarch geneset OGS2.0

DPOGS204678
Transcript: DPOGS204678-TA, 2871 bp
Protein: DPOGS204678-PA, 956 aa
Genomic position: DPSCF300170 - 41835-53020
RNAseq coverage: 695x (Rank: top 18%)
Annotation
Heliconius      HMEL017601        0.0  85.66%
Bombyx          BGIBMGA010249-TA  0.0  76.11%
Drosophila      mor-PA            0.0  55.61%
EBI UniRef50    UniRef50_F4WR49   0.0  56.80%  SWI/SNF complex subunit SMARCC2 n=8 Tax=Endopterygota RepID=F4WR49_ACREC
NCBI RefSeq     XP_001649009.1    0.0  56.75%  hypothetical protein AaeL_AAEL004358 [Aedes aegypti]
NCBI nr blastp  gi|157105746      0.0  56.75%  hypothetical protein AaeL_AAEL004358 [Aedes aegypti]
NCBI nr blastx  gi|157105746      0.0  56.57%  hypothetical protein AaeL_AAEL004358 [Aedes aegypti]
Group
Gene Ontology
GO:0005515  2.8e-37  protein binding
GO:0005622  3.9e-13  intracellular
GO:0003677  7.1e-11  DNA binding
GO:0006355  6.1e-06  regulation of transcription, DNA-dependent
KEGG pathway 
InterPro domain
[424-528]  IPR011991  3.9e-48  Winged helix-turn-helix transcription repressor DNA-binding
[421-529]  IPR009057  2.8e-37  Homeodomain-like
[427-514]  IPR007526  1.7e-35  SWIRM
[140-273]  IPR001357  3.9e-13  BRCT
[605-653]  IPR001005  7.1e-11  SANT domain, DNA binding
[608-649]  IPR014778  8.3e-09  Myb, DNA-binding
[608-648]  IPR012287  6.1e-06  Homeodomain-related
Orthology group: MCL11359 Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS204678-TA
ATGGCTGCGCTTAGTCCTAAGAAGGATGGAGGCCCGAATATAGAGTTTTTCCAATCACCGGAGTCCTTAGCTCAGTTTGATCAAATCCGTGTTTGGTTACAAAAAAACTGCAAAAAGCATGTACAAACTGATCCACCAACAAAAGAAGGCTTGGCACAACTTGTCATTCAGCTCATACAGTATCAAGAGAACAAATTGGGAAAGAATGCCACTGATCCCCCTTTTATGAGGCTTCCAATGAAAGTGTTCATGGACATGAAAGCCGGTGGTTCGTTGTGCACAGTGCTAGCCACCATGTTCCGTTTCAAGTCGGAGCAGCGGTGGCGCAAGTTCGACTTCCAGGTCGGTAAGAACCCGTCCCGCAAGGATCTTAACGTGCAGATGATGATGGAAATAGAGTCGGCTCTGCTAACAGCTGAATTACTCCGCTCCCCCTGCATCTACATCCGCCCCGACGTCGACAAAGCTACGGCGAACAAAATCAAAGATATCATCGTGAATCACCAGGGAGAGATATGTGAAGACGAAGAGGATGCCACTCATATAATATACCCAGCTGTTGATCCTCTAGAGGAAGAATACGCCAGGCCGGTGTTCAGGAGGGGAAATAATGTCCTGGTGCATTGGTACTACTTACCAGACAGTCACGACACTTGGGCTCAAGCCGACCTCCCCGTGGATGTTCCGGAGACAGCCAACTGGGACTGTAATAGATCGGAGCCGTGGCGAGTGTCGGCCACGTGGGCGCTGGATCTGACCCAGTACAACGAGTGGATGAACGAGGAGGACTACGAGTTAGACCAGCATGGAAAGAAGAAGGTCCACAAACTGCGGTTGTCCGTGGACGAGCTGATGCCGGGAGCGGAGAGTTCAGGGAAAAGTAAGAAGAGCAAGAGGAAGAGGTCGCCGTCACCCCCTCCACAGAAACATGGGAAGAGAAAGAGTCGAGTTGCTAAACGTCGAGACAACGACGGCGATGACGAGGACGATAACGCGTCCAGAGACAACACGGACGTGGCGCCCGCCACCGACAGCGAGCGCTCCACCGAGGCACCTGTGTCTGTACCGTCTGCGGGTCCGTCGGGTGCGGGTGGGAGCGGCAGTGGTGGGGGCAGCGGCGGCGGTGGTTGTAGCGAGGTGGTGCAGGAAGCCCCCGCCACGCCCGCCCCGGCCATGGACGCGCACGACGACTCACAAGGGAAGCACAGTGATTCCAATACACAGGAGATGACTAAGGAAGAGCTGGAAGACAACGTAACAGACCAAACCCATCACATAGTGGTGCCGTCTTACTCCGCCTGGTTTGATTACAACTCGATACACACCATAGAGAAGAGGGCCTTGCCGGAGTTCTTCAATAATAAGAATAAGTCCAAAACACCGGAGATATATCTGGCTTACAGAAATTTCATGCTAGACACGTATCGTTTGAACCCTACTGAATATTTAACAAGCACGGCCTGTCGGCGGAACCTCGCCGGGGACGTGTGCGCCATCATGAGGGTACATGGATTCCTGGAACAATGGGGACTTATTAATTATCAGGTGGAGGCGGAAGCTCGTCCGACCGCGATGGGTCCTCCTCCGACATCTCACTTCCACGTGCTCTCAGACACTCCCTCCGGGCTGCAGCCGCTCCAGGCGAGGTCCACCCAACAGAGACCAGCAGAGAACGCGGCGGTGCCGAAGATCGAGGCCGGTCTGCCGAACGGCACTGAGGCACCCATCAAGGCCGAGCCCAGTGTTAAGACTGAACCCATAGAGCTGGGGACGGCCCCTGGGCTTAAAATGGATCAGTACCGCGGCGGTGCGAGGGGTCGCGAGTGGACGGAACAGGAGACGCTTCTGCTGCTGGAAGCTCTGGAACTCCACCGGGACGACTGGAACAGGGTTGCAGCACACGTCGGCTCCAGGACACACGACGAGTGCATCCTACACTTCCTCAGGCTACCCATCGAGGATCCCTACCTAAACGACACATCCGCGGGTGGAGTTTTGGG
TCCGCTAGCCTACCAGCCTGTGCCGTTCAGTAAGGCCGGTAACCCTGTCATGAGTACAGTAGCCTTCCTCGCCTCGGTCGTTGACCCCCGAATCGCCTCTAAAGCTACAAGGGCCGCTATGGATGAATTCGCTGCTATTAAGGATGAAGTTCCGGCGGCCATGATGGAGGCTCACGTGAAGGCAGCCGGCGCCCACGGACCCGCCGCCGCCCTAGCAGCCACCGGCATAGCGGGGACTGCGCCCCCCGCACCCCCTGCCGGGGACACGCCCAGCGCCGGCGAAAAGAAAGAAGGAGGCAGCGATGTTAAGACTGAGGCGATGGAGGTTGATAACGAAGAGGCCAAGGTGAAGGAGGAACCGGCTGAGGCAGAGGAGGCTAAGGACTCCAAAGAAGAAGACACAAGCACGCCAGAGACCCCAGCTGTAGTGGACGCCAAGCTGCAATCAGCTGCAGCGGCAGCACTAGCAGCTGCAGCTGTTAAAGCGAAACACCTGGCGGGGGTCGAGGAGAGAAAGATCAAATCCCTGGTGGCATTACTGGTGGAGACACAGATGAAGAAGCTGGAGATCAAGCTGCGGCACTTCGAAGAGCTGGAGGCTACCATGGAGAGGGAAAGAGAGGGTCTAGAATATCAACGGCAGCAGTTGATTCAGGAACGGCAGCAGTTCCACCTGGAACAGCTGAAGGCAGCTGAATTCCGAGCGAGGAACCACGCCATCCAGAGATTACAGGCCGAGAGCGGTGGAGTGGTGGGCGTCGTGCCTGGCGTGGTGGGCGTCCCGGGGGGCGGACCGCCCCTCGCAGCCGGCGGACCTGCCATGGAAGCCCCTCAGGAGCCCCCGCCACAACCCGCGCCGCATCACGCATAA

Protein sequence:

>DPOGS204678-PA
MAALSPKKDGGPNIEFFQSPESLAQFDQIRVWLQKNCKKHVQTDPPTKEGLAQLVIQLIQYQENKLGKNATDPPFMRLPMKVFMDMKAGGSLCTVLATMFRFKSEQRWRKFDFQVGKNPSRKDLNVQMMMEIESALLTAELLRSPCIYIRPDVDKATANKIKDIIVNHQGEICEDEEDATHIIYPAVDPLEEEYARPVFRRGNNVLVHWYYLPDSHDTWAQADLPVDVPETANWDCNRSEPWRVSATWALDLTQYNEWMNEEDYELDQHGKKKVHKLRLSVDELMPGAESSGKSKKSKRKRSPSPPPQKHGKRKSRVAKRRDNDGDDEDDNASRDNTDVAPATDSERSTEAPVSVPSAGPSGAGGSGSGGGSGGGGCSEVVQEAPATPAPAMDAHDDSQGKHSDSNTQEMTKEELEDNVTDQTHHIVVPSYSAWFDYNSIHTIEKRALPEFFNNKNKSKTPEIYLAYRNFMLDTYRLNPTEYLTSTACRRNLAGDVCAIMRVHGFLEQWGLINYQVEAEARPTAMGPPPTSHFHVLSDTPSGLQPLQARSTQQRPAENAAVPKIEAGLPNGTEAPIKAEPSVKTEPIELGTAPGLKMDQYRGGARGREWTEQETLLLLEALELHRDDWNRVAAHVGSRTHDECILHFLRLPIEDPYLNDTSAGGVLGPLAYQPVPFSKAGNPVMSTVAFLASVVDPRIASKATRAAMDEFAAIKDEVPAAMMEAHVKAAGAHGPAAALAATGIAGTAPPAPPAGDTPSAGEKKEGGSDVKTEAMEVDNEEAKVKEEPAEAEEAKDSKEEDTSTPETPAVVDAKLQSAAAAALAAAAVKAKHLAGVEERKIKSLVALLVETQMKKLEIKLRHFEELEATMEREREGLEYQRQQLIQERQQFHLEQLKAAEFRARNHAIQRLQAESGGVVGVVPGVVGVPGGGPPLAAGGPAMEAPQEPPPQPAPHHA-
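
The transcript and protein lengths in this record are consistent with each other: 956 codons plus one stop codon give 3 x 957 = 2871 bp. The Python sketch below illustrates that check and translates the first and last few codons of the transcript. It assumes the standard genetic code (NCBI translation table 1) and renders the stop codon as "-" to match the trailing "-" in the protein sequence above. The names `translate` and `expected_cds_length` are hypothetical helpers introduced for illustration, not part of any database tooling.

```python
# Standard genetic code (NCBI translation table 1), assumed for this record.
# Codons are enumerated in T, C, A, G order to line up with the AMINO string.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon; stops become stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    residues = (CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    return "".join(r.replace("*", stop_symbol) for r in residues)


def expected_cds_length(protein_aa):
    """Length in bp of a CDS encoding protein_aa residues plus one stop codon."""
    return 3 * (protein_aa + 1)


# First ten codons and last four codons of DPOGS204678-TA, copied from above.
first_codons = "ATGGCTGCGCTTAGTCCTAAGAAGGATGGA"
last_codons = "CATCACGCATAA"

print(translate(first_codons))
print(translate(last_codons))
print(expected_cds_length(956))
```

Applying `translate` to the full nucleotide sequence (with the line break removed) and comparing the result to the DPOGS204678-PA entry is a simple way to confirm that the two FASTA records were copied intact.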