Monarch geneset OGS2.0

DPOGS202607
Transcript: DPOGS202607-TA (1794 bp)
Protein: DPOGS202607-PA (597 aa)
Genomic position: DPSCF300140 - 80612-85900
RNAseq coverage: 475x (Rank: top 26%)
Annotation
Heliconius: HMEL010437, E-value 0.0, 79.53%
Bombyx: BGIBMGA012517-TA, E-value 0.0, 76.07%
Drosophila: mamo-PG, E-value 9e-56, 84.87%
EBI UniRef50: UniRef50_UPI0002063B44, E-value 2e-70, 61.04%, UPI0002063B44 related cluster n=2 Tax=unknown RepID=UPI0002063B44
NCBI RefSeq: XP_971723.2, E-value 1e-80, 58.39%, PREDICTED: similar to CG34346 CG34346-PC [Tribolium castaneum]
NCBI nr blastp: gi|383859047, E-value 3e-131, 49.67%, PREDICTED: uncharacterized protein LOC100879930 [Megachile rotundata]
NCBI nr blastx: gi|345481332, E-value 3e-149, 49.18%, PREDICTED: hypothetical protein LOC100119619 isoform 2 [Nasonia vitripennis]
Group:
Gene Ontology: GO:0005515, E-value 5.9e-24, protein binding
               GO:0003676, E-value 7.2e-11, nucleic acid binding
KEGG pathway:
InterPro domain: [3-118] IPR011333, E-value 8.6e-30, BTB/POZ fold
                 [21-118] IPR013069, E-value 5.9e-24, BTB/POZ
                 [31-127] IPR000210, E-value 7.8e-20, BTB/POZ-like
                 [544-574] IPR013087, E-value 7.2e-11, Zinc finger, C2H2-type/integrase, DNA-binding
Orthology group: MCL15210, Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202607-TA
ATGGGAAGTGAGCACTATTGCCTGAGGTGGAACAATCATCAGAGCAACTTGCTCGGCGTTTTCAGCCAACTGTTGCACGACGAGAGCTTGGTGGACGTCACGCTCGCGTGTTCCGAAGGCGCCTCCATAAGGGCTCATAAGGTCGTTTTATCAGCGTGTTCATCCTACTTCCGTTCGCTGTTCGTGGACCATCCCTCACGCCATCCCATAGTGATCCTCAAAGACGTGGGTTTGGAGGAGCTCAGAACACTTGTCGACTTCATGTACAAGGGTGAGGTCAACGTCCAATACTGCCAACTACCTGCGCTCCTGAAAACTGCTGAGAGCCTGCAGGTCAAAGGATTAGCTGAAATGACGACGCTAAGTGCAGCCGGTATTGATACAAGAAACGTTCCAGAACCAATGGATGAATGTCAACAGAGAGATACAAGAGAATGCGCTGAGTCGCATGAGACGTTAGACGATCAGGAAGAACGAAGAGAGGCAAAGGAATTCCGTGAGCTACGTGAAAGGGAGTCCAAAGACAACAGGGAAAGCCATAGAGAAATTCGCGATTCGCGAGAGTTGCGAGAGCACAGAGAAAGGGAACGAGACAGAGAATCTCGAGAGTTGATTCGGGAAATAAGAGACCCCCGAGACCCTTGTAGGGATCATAGAGAATTGCGCGAAGTAAAAGATAGCAGAGATCCTCGTGATTTACGAGATACTAGGGATCCCAGAGAGCTCAGAGATTCTCTTGACCCACGAGAGGCAACAGAAGCCTCTCCTCCCAGAATATCCCCATTATTATCTGTACGACGATTTAGGCCCGATATTTCAATTGATACACCAACTTTGACGCATCCACCGATACCTAAAGACGAGCCTCCTGACGAACCCATACGTCCAACGTCACCGGAAGACGACACAATATCAATAAGATCAAATGGAGCAAGTGAGAATATGGGTATAAATATGACCATAAACAGTCATGGCATGGGACCAAGATACTCTCCAGTTGAACAAAGGCTGAGTGTTTTGAACACCTTGCCACATCCCGGCTTGACTCATCCATCCCACAGCTCGTTATCCAGTCCAAGAAATGAACCTATAGCTGGACCATCTGGTTTGCCACCAGTACAACAAGTTCCTTTGTCACTAAAGAAAGAAGCGGATTGGGATAGGGGTAACGATGAAAAGGTTGGGGAGACAGACTACCGTATGCCACATGAATCGGAGTACGAAGGGTCCGGTAGGGGGAGAATTGAGGAGGGGGAAAGGGCATATCCTTGCATACATTGTGGAGCAGCATTCCCACACCAGAGCAAACTAACGAGGCACATACTGACCACGCATACCCTCGACACTCTAAAGTATCGGGACGCCATATTGGGCCGGCCGTTGGGTTTGCCCATGATAGGACAATTCAGTGAACCGACTTACATGATGCCCACTGAAGAATCGCCCCTAGATCTGGACATCGGCCCAGTTGAGCCGGGCAACGTAGTGTTGTGCAAATTCTGCGGCAAGAGTTTCCCTGATGTATCATCGTTAATAGCCCACTTACCGGTTCATACGGGCGACCGACCTTTCAAATGTGAATTTTGTGGCAAGGCGTTCAAACTACGACATCATATGAAAGACCACTGTAGAGTCCACACAGGCGAACGACCGTTTCGTTGTGTTCTATGTGGTAAAACATTCTCACGATCAACTATACTGAAAGCACACGAGAAAACTCACTATCCCAAGTACGCGAGGAAATTCCTCTCGCCGAGTCCCGTCGACACCGAGGAAGAGAGTCCACATCAATGA

Protein sequence:

>DPOGS202607-PA
MGSEHYCLRWNNHQSNLLGVFSQLLHDESLVDVTLACSEGASIRAHKVVLSACSSYFRSLFVDHPSRHPIVILKDVGLEELRTLVDFMYKGEVNVQYCQLPALLKTAESLQVKGLAEMTTLSAAGIDTRNVPEPMDECQQRDTRECAESHETLDDQEERREAKEFRELRERESKDNRESHREIRDSRELREHRERERDRESRELIREIRDPRDPCRDHRELREVKDSRDPRDLRDTRDPRELRDSLDPREATEASPPRISPLLSVRRFRPDISIDTPTLTHPPIPKDEPPDEPIRPTSPEDDTISIRSNGASENMGINMTINSHGMGPRYSPVEQRLSVLNTLPHPGLTHPSHSSLSSPRNEPIAGPSGLPPVQQVPLSLKKEADWDRGNDEKVGETDYRMPHESEYEGSGRGRIEEGERAYPCIHCGAAFPHQSKLTRHILTTHTLDTLKYRDAILGRPLGLPMIGQFSEPTYMMPTEESPLDLDIGPVEPGNVVLCKFCGKSFPDVSSLIAHLPVHTGDRPFKCEFCGKAFKLRHHMKDHCRVHTGERPFRCVLCGKTFSRSTILKAHEKTHYPKYARKFLSPSPVDTEEESPHQ-
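
The transcript and protein entries above can be cross-checked against each other: 1794 bp is 598 codons, or 597 residues plus a stop codon. The following is a minimal sketch of that check, assuming the standard genetic code applies to this transcript. The helper name `translate` and the two short excerpts (the first 30 and last 12 nucleotides of DPOGS202607-TA) are chosen here for illustration only. The sketch prints the stop codon as `*`, whereas the record marks it with a trailing `-`.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds):
    """Translate an in-frame coding sequence; stop codons become '*'.

    Hypothetical helper: it only handles A/C/G/T and raises KeyError
    on ambiguous bases such as N.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# Excerpts copied from the DPOGS202607-TA record above.
first_30_nt = "ATGGGAAGTGAGCACTATTGCCTGAGGTGG"
last_12_nt = "CCACATCAATGA"

print(translate(first_30_nt))  # MGSEHYCLRW
print(translate(last_12_nt))   # PHQ*

# Length check: 1794 bp -> 598 codons -> 597 residues plus one stop codon.
print(1794 // 3 - 1)           # 597
```

Passing the full nucleotide sequence to `translate` in place of the excerpts would allow a comparison against the whole DPOGS202607-PA entry.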