Monarch geneset OGS2.0

DPOGS215998
Transcript: DPOGS215998-TA  1476 bp
Protein: DPOGS215998-PA  491 aa
Genomic position: DPSCF300078 + 230372-258250
RNAseq coverage: 276x (Rank: top 39%)
Annotation
Heliconius: HMEL010417  1e-177  84.38%
Bombyx: BGIBMGA013358-TA  1e-24  50.41%
Drosophila: eyg-PA  2e-99  65.13%
EBI UniRef50: UniRef50_B4LGP0  9e-98  65.67%  GJ13265 n=5 Tax=Eukaryota RepID=B4LGP0_DROVI
NCBI RefSeq: XP_002021174.1  7e-100  65.25%  GL24956 [Drosophila persimilis]
NCBI nr blastp: gi|195160623  1e-98  65.25%  GL24956 [Drosophila persimilis]
NCBI nr blastx: gi|167234388  9e-121  55.24%  eyegone [Tribolium castaneum]
Group
Gene Ontology:
  GO:0003677  8.4e-32  DNA binding
  GO:0006355  8.4e-32  regulation of transcription, DNA-dependent
  GO:0005515  3.5e-28  protein binding
  GO:0043565  2.1e-27  sequence-specific DNA binding
  GO:0003700  2.1e-27  sequence-specific DNA binding transcription factor activity
KEGG pathway: rno:25509  2e-54
  K08031 (PAX6)  maps-> Maturity onset diabetes of the young
InterPro domain:
  [110-172]  IPR011991  2.5e-33  Winged helix-turn-helix transcription repressor DNA-binding
  [48-166]  IPR001523  8.4e-32  Paired box protein, N-terminal
  [108-168]  IPR009057  3.5e-28  Homeodomain-like
  [258-320]  IPR001356  2.1e-27  Homeobox
  [255-321]  IPR012287  2.1e-27  Homeodomain-related
Orthology group: MCL17059  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS215998-TA
ATGTTGATATCTGCGGGCAACGGCCCCATACCCGAATTCTCTCCTTCCTCGTGCTGCGCGTCCTGGAAGATGGAGCCAGCGCCTGCTGCCTCCATAACACAGCCATTGAATATATCGCCCAACGTGCCAGTGTCACCATCATCATCAGGAGGTACAAACACGCTGTACGCGGGCGCGTCACACCCATTGTCGTCTTTGTTGTCACAACAGCGGTTACTCGAACTGTCTCGCTTCGGTTTGCGTCACTACGATATAGCTCATCACGTGCTGTCTCAGCAAGGTGCTGTTACGAAGCTGCTTGGTACACTCAGGCCACCTGGTCTCATCGGTGGCAGCAAGCCAAAGGTCGCTACGCCGGCAGTAGTTTCAAAGATAGAACAATACAAAAGAGAGAACCCTACCATTTTTGCTTGGGAGATACGAGAGCGGCTTATTTCTGAAGGCGTCTGTACTAATGCAACAGCACCGAGTGTATCTTCAATCAATCGGATACTCAGAAACAGAGCCGCGGAAAGGGCTGCAGCAGAGTTTGCTAGAGCGGCGGGATATGGTCTGTATGCTGCACCACCTCCATATGGCGGGTTCCCGTGGGCCAGCGGTGGTGGTGTATGGCCACCTGGAAGCCTACCCCTACCTCCAGGAGTACCGCCTTCTTCAGTTGGAGTACCTCATCCGGATGCTGTTAAACAAGGCTTTTTATCATCGTCGGGTCGTAGCTTGATTGATGTGGATGGAGACGATTCGGGATCACTAGATGGGGAACAACCAAAGTTTAGACGAAATCGAACAACATTTAGTCCTGATCAACTAGAAGAGTTAGAAAAAGAATTTGAAAAATCACATTATCCCTGCGTTTCAACACGAGAACGATTAGCCTCTAAGACTTCTTTATCCGAGGCAAGGGTTCAGGTTTGGTTTTCCAACAGACGAGCCAAATGGAGGCGCCACCAGCGCATGAATCTCCTCAAACGCGGTGGCTCTCCTTCGCACCGCCTCCCCCACTCTCCCTCCCGTTCCCGTTCACGTTCTTTATCCCCCACCCGAATTCCCTACCACGCTCCTCAAATGGGCGGTGAAAATAGTGCTTTCAAAGCTTTAGGCCATCAAGATACTAATACGCTGAAAGCCCTCACGCATCAAAGCCAATTCGAAACGAACGCACTAAAAGCTTTATCCCAACAGACAACATTTGATAGCAATCCTTTTAAATCACATCCAGCTCTAGAGAATAGTGCATTTAAAGCGCTCGTACCAAACTCAGCGGCAGCGGCGTTATTGGCCGCACAATCGATACAATTAGCCCGCGGATATGAATCTCATTCGGATTCCGATGAGGAAATAAACGTTCATGATGAGAGTGAAGACGAGGCCGAGAAACAGATAAATGCGATGAGATCTAGATCACCGAGCCCGAGCCGACATAGAATGACGACAACCAATGACGTGCCGCTGCAATTAACTAAGCATGACCGTTGA

Protein sequence:

>DPOGS215998-PA
MLISAGNGPIPEFSPSSCCASWKMEPAPAASITQPLNISPNVPVSPSSSGGTNTLYAGASHPLSSLLSQQRLLELSRFGLRHYDIAHHVLSQQGAVTKLLGTLRPPGLIGGSKPKVATPAVVSKIEQYKRENPTIFAWEIRERLISEGVCTNATAPSVSSINRILRNRAAERAAAEFARAAGYGLYAAPPPYGGFPWASGGGVWPPGSLPLPPGVPPSSVGVPHPDAVKQGFLSSSGRSLIDVDGDDSGSLDGEQPKFRRNRTTFSPDQLEELEKEFEKSHYPCVSTRERLASKTSLSEARVQVWFSNRRAKWRRHQRMNLLKRGGSPSHRLPHSPSRSRSRSLSPTRIPYHAPQMGGENSAFKALGHQDTNTLKALTHQSQFETNALKALSQQTTFDSNPFKSHPALENSAFKALVPNSAAAALLAAQSIQLARGYESHSDSDEEINVHDESEDEAEKQINAMRSRSPSPSRHRMTTTNDVPLQLTKHDR-
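
The lengths reported in this record can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the database entry. The function names `expected_cds_length` and `span_length` are hypothetical. It assumes the listed transcript is the coding sequence including its stop codon (the nucleotide sequence above ends in TGA and the protein ends in the `-` stop marker), and that the genomic coordinates are 1-based and inclusive.

```python
def expected_cds_length(protein_length: int, has_stop_codon: bool = True) -> int:
    """Nucleotides needed to encode `protein_length` residues, plus an optional stop codon."""
    codons = protein_length + (1 if has_stop_codon else 0)
    return 3 * codons


def span_length(start: int, end: int) -> int:
    """Length of a coordinate range, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Values copied from the record above.
transcript_bp = 1476
protein_aa = 491

# 491 residues + 1 stop codon = 492 codons = 1476 nt
print(expected_cds_length(protein_aa) == transcript_bp)  # True

# Genomic extent of the gene model on scaffold DPSCF300078
print(span_length(230372, 258250))  # 27879
```

Under these assumptions the gene model spans 27879 bp of scaffold for a 1476 bp coding sequence, so most of the locus would be intronic.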