Monarch geneset OGS2.0

DPOGS202302
Transcript: DPOGS202302-TA; 993 bp
Protein: DPOGS202302-PA; 330 aa
Genomic position: DPSCF300032 + 228869-237672
RNAseq coverage: 1523x (Rank: top 8%)
Annotation
Heliconius: HMEL004734; 3e-145; 77.40%
Bombyx: BGIBMGA004974-TA; 8e-81; 69.44%
Drosophila: pnt-PB; 4e-57; 88.89%
EBI UniRef50: UniRef50_E2AGM7; 2e-62; 90.00%; Transforming protein p54/c-ets-1 n=4 Tax=Formicidae RepID=E2AGM7_CAMFO
NCBI RefSeq: XP_396368.3; 4e-63; 83.82%; PREDICTED: similar to ETS-like protein pointed, isoform P1 (D-ETS-2) [Apis mellifera]
NCBI nr blastp: gi|328785918; 3e-62; 83.82%; PREDICTED: hypothetical protein LOC412916 [Apis mellifera]
NCBI nr blastx: gi|307212300; 3e-64; 49.31%; Transforming protein p54/c-ets-1 [Harpegnathos saltator]
Group
Gene Ontology: GO:0006355; 1.2e-57; regulation of transcription, DNA-dependent
GO:0043565; 1.2e-57; sequence-specific DNA binding
GO:0003700; 1.2e-57; sequence-specific DNA binding transcription factor activity
KEGG pathway: ame:412916; 1e-62
K02678 (ETS, pnt) maps-> MAPK signaling pathway - fly; Dorso-ventral axis formation
InterPro domain: [221-306] IPR000418; 1.2e-57; Ets
[216-325] IPR011991; 4.1e-51; Winged helix-turn-helix transcription repressor DNA-binding
Orthology group: MCL13065 Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202302-TA
ATGCTGTTAGTGGAGCCCCGCGAAGCCGCCGGGCCTCCGTCGCTGCCCATGTCAGCCCTAACAAAACTGAACTTCACGGACTACAGTGTTAAAGTGAAACCGTACGACGAGGACAATAGATTGAACTATGAACAATCGCTAGGAGTTTCCAGACTCGCTTATGATGACACCAGCTCCGGCTCACATAAAATACAATATGATCAGAGAATCGCTTTTGAAACTGTTGATAACGTGTGCCAGGAGCCGTCCTACGTCCTGCCAGGCGGCTACACGACCGGCGGTCCGCGGGAAACGAACACCTACTACAGCCTCCGCTCGCCCCCGCGTCGTGGTCTCGCTCCCCCCGCCACCTCCCCCCTCGACGAGTACTACCAGCCGGAGTACCCGCCGGCGCCGGCGCAGCCCTACCTCGAGGACTACTATCACCACGCGCACGCACACACACAACCGATACTCACACACGAACACAAATATCAGCCGCATTCATACACCAAGCCTTATCCCAGAACTGCCGGTCGGTACGGAGGTGAGGTTTATGGTGAGGGATACTCCACGGAGTACGGAGGATGTGCTGAGTGGGCCGAGTGGCCGGAGCAACACGCGCTGCCCCAGGACAAGCTGCCCTACCCCGCGGCCGGACCCTGCTTCACTGGTTCAGGGCCGATCCAGCTGTGGCAGTTCCTGCTGGAGCTCCTTACAGACAAGAGCTGTCAGGGGTTCATCTCGTGGACCGGGGATGGATGGGAATTCAAACTCACCGACCCTGATGAGGTGGCTCGTCGTTGGGGGATCCGCAAGAACAAGCCCAAGATGAACTACGAGAAGCTGTCCCGCGGTCTACGATACTACTACGACAAGAACATCATACACAAGACGGCCGGCAAGCGCTACGTCTACCGCTTCGTCTGCGATCTGCAGAACCTGCTCGGCATCTCCCCGGAGCAGTTGCACGCGATGTACGAGCCCAAAACTGAGAAGAAGGACGACGACTGA

Protein sequence:

>DPOGS202302-PA
MLLVEPREAAGPPSLPMSALTKLNFTDYSVKVKPYDEDNRLNYEQSLGVSRLAYDDTSSGSHKIQYDQRIAFETVDNVCQEPSYVLPGGYTTGGPRETNTYYSLRSPPRRGLAPPATSPLDEYYQPEYPPAPAQPYLEDYYHHAHAHTQPILTHEHKYQPHSYTKPYPRTAGRYGGEVYGEGYSTEYGGCAEWAEWPEQHALPQDKLPYPAAGPCFTGSGPIQLWQFLLELLTDKSCQGFISWTGDGWEFKLTDPDEVARRWGIRKNKPKMNYEKLSRGLRYYYDKNIIHKTAGKRYVYRFVCDLQNLLGISPEQLHAMYEPKTEKKDDD-
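
The lengths and coordinates listed in this record can be cross-checked with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, it assumes the 993 bp transcript is a complete coding sequence ending in a stop codon (the nucleotide sequence ends in TGA), and it assumes the genomic coordinates are 1-based and inclusive, which the record does not state.

```python
def expected_protein_length(cds_bp: int, has_stop_codon: bool = True) -> int:
    """Number of amino acids encoded by a coding sequence of cds_bp bases.

    Assumes the sequence is in frame from its first base; the stop codon,
    if present, is not counted as a residue.
    """
    if cds_bp % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if has_stop_codon else codons


def genomic_span(start: int, end: int) -> int:
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end coordinate precedes start coordinate")
    return end - start + 1


# Values taken from the record: 993 bp transcript, positions 228869-237672.
protein_aa = expected_protein_length(993)
span_bp = genomic_span(228869, 237672)
print(protein_aa)  # 330
print(span_bp)     # 8804
```

Under these assumptions the 993 bp transcript corresponds to the listed 330 aa protein, and the gene spans 8804 bp on the scaffold, considerably longer than the transcript itself.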