Monarch geneset OGS2.0

DPOGS210071
Transcript         DPOGS210071-TA   2121 bp
Protein            DPOGS210071-PA   706 aa
Genomic position   DPSCF300017 - 534931-547303
RNAseq coverage    372x (Rank: top 32%)
Annotation
Heliconius       HMEL015479         7e-81    61.46%
Bombyx           BGIBMGA012712-TA   0.0      65.11%
Drosophila       Rfx-PE             1e-77    45.75%
EBI UniRef50     UniRef50_F4X7G1    1e-118   48.22%   Transcription factor RFX3 n=10 Tax=Coelomata RepID=F4X7G1_ACREC
NCBI RefSeq      XP_975182.2        1e-123   41.43%   PREDICTED: similar to GA19507-PA [Tribolium castaneum]
NCBI nr blastp   gi|189241300       3e-122   41.43%   PREDICTED: similar to GA19507-PA [Tribolium castaneum]
NCBI nr blastx   gi|270013165       6e-117   40.86%   hypothetical protein TcasGA2_TC011734 [Tribolium castaneum]
Group
Gene Ontology     GO:0003677   2.2e-28   DNA binding
                  GO:0006355   2.2e-28   regulation of transcription, DNA-dependent
KEGG pathway
InterPro domain   [208-282]   IPR011991   3.1e-36   Winged helix-turn-helix transcription repressor DNA-binding
                  [208-273]   IPR003150   2.2e-28   DNA-binding RFX
Orthology group   MCL10717   Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210071-TA
ATGTCGGAGATGGGATTTGAGAACGTTTTCTACTTGGAGAGTGAGTATTGTGGGATGAGGGATGGTGTTGCCAGGTTACACTCCCCAGATTTTGAGTGCGCGGGCGACGAAGTGTTGGTGGAGTCGTCTCCGCCCGCTTCACCAGTCATGGCGGCGCGACTAGCGGCGCCCTCACAGGGCTCGGGCGCGGGCGGCGCCTCGCCGGACGCTGTGCGTGAGTTGATCGTTATACCGGAGCTCCCTAATTCAATCCACCTGCAACACGCCATACAGCAGGTGTCGAGTACGGTGGTAGAAGTCAACGGAGACAGTTCAGGACACTCCAGTCCCACGACGGAGGCACAGCACACATACATCGTCACAAGCGAGGGCGGGAACGGCGTCAACTATCACGTCCAGTATGTGGAGCCGCAGGAGATATACGCACAGACAGGACATCAGACACATATGGAGACCCTCCGCTCATATCCCGTGTACGGCGTGGCGAGCGTGCCCGCGGACGGCGCGGTGACGGCCGTGACCGCGGTCACAGCGGTCACCGCCTCCGATGACACCTGGCCGGCTGAGTTCACCTTCGAACAACCGGCCTCGCCAGCACCGGCAGCTCGTATGCCGCCGGCCACCGTGCAGTGGCTGCTGGATCACTACGAGACGGCCGAAGGTGTGTCTCTCCCGCGTTCGACGCTGTACGCCCATTACCTCCGTCACTGTTCCACACATCGCCTGGAGCCGGTGAACGCGGCGTCTTTCGGGAAGCTCATCAGATCGGTGTTCGTGGGTCTGAGGACCCGCCGGCTCGGGACCCGGGGAAACTCCAAGTATCACTACTACGGCATCAGGGCGAAGCATTCCGCGCCCCGAGACCTGCCGCCCACCGTACAGAAGATAGACGAGGAACCGCACTCGTCAGACGAATCCCGTCCCCGTGAGCCGGAGAGTCCCGTGGGTCTGTCTGGTATCGCTCACAGACAGTACTTGGGCTCGGTGAGCGCCCCTGACCCGCCGCCGCTGCAGCTAGACGACCCACCGCCAGACGTGACGCCTGAAGCGATGCAGCAGTTCAGGGATCACCACAGGCAACACGGGGTGGAGTTCCTCGAGGCCGTGGCGTCCCTGGACACGGGAGCTGCGGAGCGCTCTCGTCGGTGGTTCTGGAGGCGCGTGGGCAGGAGCGGGGCCCGCCTGGCCGGTCGCAGGGACGTGTGCACCTGGCTCAGGAGGGCCGAGCTCGAGCTGCACCAGCGAGCCGTGGACCTCCTGCTGCCCGACGTACTCAGGCCCATACCCTCACAACTCACACAGGCCATCCGTAACTTCGCCAAGAGCCTGGAGGGCGCGCTGTCGTCGGGGTCCTCCGGAGCCCCGGCCCCAGCGGCGCGCGCTCAGGCGTTGGCTGCGGGGGCTCTGTCGGCCGCCCTCAGGCGCTACACCTCCCTCAACCACCTGGCGCAGGCCGCGCGGGCCGTCCTCAACAACCACCATCAGATCCAGCAGATGTTGTCGGACCTGAACCGCGTGGACTTCCGCGTGGTGCGCGAGCAGGCGGCCTGGGCCTGCGCCTGTGGCAGTGCGGCCACCGCGCACCGCCTCGAGGCTGACTTTAAAGCCCGCCTCGGTCGCGGGTCGTCGCTGGAGTCGTGGGCGTCGTGGCTGGAGAGCTGCGTCCGCGCCGCGTTGGCCCCGCACGAGCGCCGCGCCGACTACACGCCGCGTGCGCGACGACTGCTGCTCGACTGGTCCTTCTACTCCTCGCTCGTCATCAGGGAACTCACGCTCAGGTCGGCGGCGTCGTTCGGGTCGTTCCACTTGATCCGCCTGCTGTACGACGAGTACGTCTCCTTCCTCATAGAGCGGCGCGTGGCCGAGCACCGCCAGGAGCCGCCCATAGCTGTGATGCAGCGAGCGATGGATGACGACGATGAACTGCCGGAGGAGGTTCCCCGCGACGACGACGACATGAACGGAGAGATGGTGGACGAGGGGCTCGACCACGGGGAGGG
AGAGGGGGACGGAGACGGCGAGGGGAACGGAGAGGAGGGGGAGGGGGAGTGGGAGTGGGAGGACGACGACGACGAGCACGAGGAGAGGGAGCAGAAGAGGGCCCGCCTGGACCGAGGCTAA

Protein sequence:

>DPOGS210071-PA
MSEMGFENVFYLESEYCGMRDGVARLHSPDFECAGDEVLVESSPPASPVMAARLAAPSQGSGAGGASPDAVRELIVIPELPNSIHLQHAIQQVSSTVVEVNGDSSGHSSPTTEAQHTYIVTSEGGNGVNYHVQYVEPQEIYAQTGHQTHMETLRSYPVYGVASVPADGAVTAVTAVTAVTASDDTWPAEFTFEQPASPAPAARMPPATVQWLLDHYETAEGVSLPRSTLYAHYLRHCSTHRLEPVNAASFGKLIRSVFVGLRTRRLGTRGNSKYHYYGIRAKHSAPRDLPPTVQKIDEEPHSSDESRPREPESPVGLSGIAHRQYLGSVSAPDPPPLQLDDPPPDVTPEAMQQFRDHHRQHGVEFLEAVASLDTGAAERSRRWFWRRVGRSGARLAGRRDVCTWLRRAELELHQRAVDLLLPDVLRPIPSQLTQAIRNFAKSLEGALSSGSSGAPAPAARAQALAAGALSAALRRYTSLNHLAQAARAVLNNHHQIQQMLSDLNRVDFRVVREQAAWACACGSAATAHRLEADFKARLGRGSSLESWASWLESCVRAALAPHERRADYTPRARRLLLDWSFYSSLVIRELTLRSAASFGSFHLIRLLYDEYVSFLIERRVAEHRQEPPIAVMQRAMDDDDELPEEVPRDDDDMNGEMVDEGLDHGEGEGDGDGEGNGEEGEGEWEWEDDDDEHEEREQKRARLDRG-
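
Length consistency check:

The transcript and protein lengths listed in this record can be cross-checked with simple arithmetic. The sketch below assumes the transcript is a pure coding sequence (no UTRs) that ends in one stop codon, which is consistent with the nucleotide sequence starting with ATG and ending with TAA. The function name `expected_cds_length` is hypothetical and is not part of any database tooling.

```python
def expected_cds_length(protein_length: int, include_stop: bool = True) -> int:
    """Return the coding-sequence length in bp for a protein of the given length.

    Each amino acid is encoded by one codon (3 bp); the stop codon adds
    one more codon when include_stop is True.
    """
    codons = protein_length + (1 if include_stop else 0)
    return 3 * codons


# 706 aa plus one stop codon gives 2121 bp, the transcript length in this record.
print(expected_cds_length(706))
```

Under these assumptions, 706 aa corresponds to the 2121 bp reported for DPOGS210071-TA. A mismatch for another record would suggest UTRs, a partial gene model, or a missing stop codon.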