Monarch geneset OGS2.0

DPOGS213104
Transcript: DPOGS213104-TA; 1197 bp
Protein: DPOGS213104-PA; 398 aa
Genomic position: DPSCF300016 + 112211-113653
RNAseq coverage: 73x (Rank: top 66%)
Annotation
Heliconius: HMEL006553; 2e-142; 67.92%
Bombyx: BGIBMGA007864-TA; 4e-131; 59.32%
Drosophila: sug-PA; 5e-61; 67.74%
EBI UniRef50: UniRef50_F4X810; 4e-66; 73.08%; Zinc finger protein GLIS2 n=5 Tax=Myrmicinae RepID=F4X810_ACREC
NCBI RefSeq: XP_969964.1; 3e-66; 67.42%; PREDICTED: similar to AGAP006736-PA [Tribolium castaneum]
NCBI nr blastp: gi|383848424; 1e-68; 66.49%; PREDICTED: uncharacterized protein LOC100875925 [Megachile rotundata]
NCBI nr blastx: gi|383848424; 2e-71; 66.49%; PREDICTED: uncharacterized protein LOC100875925 [Megachile rotundata]
Group
Gene Ontology: GO:0003676; 1.4e-15; nucleic acid binding
GO:0008270; 1.4e-05; zinc ion binding
GO:0005622; 1.4e-05; intracellular
KEGG pathway:
InterPro domain: [151-183] IPR013087; 1.4e-15; Zinc finger, C2H2-type/integrase, DNA-binding
Orthology group: MCL17711; Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS213104-TA
ATGTTAGTGTTTCCGCATTTGAGGTCAGATATACGTAACGAGGGGTGCAGGACGTCAAGATGGATGCCTGCCGAGAGATTCGCTCCTTATTTCGCGAGACCAACTATAATCAAACATTTGCCCCCGCAGGACGAGGATGACGACCTGTCTGGTGGATCATCCCCAGAGCATGTAACTGAGTCAATGGACAGTATTGGTGTATGCGGTTGGCGGGGGTGTATGGCACGGTTTGCGAGCATCCCCCGCCTTTCAGCACACATCGCGCGAGTGCACGCGCATGCACACAGAGACGGATTGTTCTATTGCAACTGGAAGGGTTGTTCTAGACCGCAACGGGGTTTTAACGCCAGGTACAAAATGTTGGTGCATGTAAGGACTCACACAAACGAACGTCCACATACCTGTAATCAGTGTCAAAAAAGCTTTTCAAGAGCTGAAAATCTTAAAATACATCTCCGTTCACATTCCGGAGAAAAACCATACGTTTGCCCTTATGAGGGATGCGGGAAAGCGTACTCAAATTCAAGCGACAGGTTCAAGCACACACGTACGCACACCGTGGATAAGCCTTACTGCTGCAAAATACCGGGATGCAACAAACGTTACACGGATCCGTCCAGCTTAAGGAAACATGTCAAGACTTACAAACACTTCGTCAATGATAAGGACCTGCCGAAACATGAAATGACGGAAGAAAGAGCGAGTTCTGATAGCTCTTCACCGCTTCGGGAAAACAGCGGATACGGCCAGCCCTCACCTCGTATGTACATCCCAAATTGTATACCCCCATCAAAATCCTCATACAATACCTACTTGCCGTATTCGAACCCCGCGTTAGGAGCGCCTTTAGGTTACTTACCAGTGTTAGAACAAGTTCCGTGTGTGCGGCCATTAGAAAGGGTTTTTCCCATGAGAGTCCCGCCCCACGACATGCCGTATACAGACTACCACCTCTACGAGGATCATTACCGGAACGTCTACCCTTACGGCCTAAATTATTCACCCTTAAGGCCGAATGTCAATGAAAAAGTATTATATGAAGAGCAGGAAACAATCGAGGAGATCCAAGTTGTTGATACTGAAAAACACGTAGAAGTACCTTTAAACTTAATATGTAGCAAACGACTGGAATTCAAGCCAGTCGACGATCTAGTCAGACACACAGACTTGCCTCTAGATTTAAGTACTAAGAGTTAA

Protein sequence:

>DPOGS213104-PA
MLVFPHLRSDIRNEGCRTSRWMPAERFAPYFARPTIIKHLPPQDEDDDLSGGSSPEHVTESMDSIGVCGWRGCMARFASIPRLSAHIARVHAHAHRDGLFYCNWKGCSRPQRGFNARYKMLVHVRTHTNERPHTCNQCQKSFSRAENLKIHLRSHSGEKPYVCPYEGCGKAYSNSSDRFKHTRTHTVDKPYCCKIPGCNKRYTDPSSLRKHVKTYKHFVNDKDLPKHEMTEERASSDSSSPLRENSGYGQPSPRMYIPNCIPPSKSSYNTYLPYSNPALGAPLGYLPVLEQVPCVRPLERVFPMRVPPHDMPYTDYHLYEDHYRNVYPYGLNYSPLRPNVNEKVLYEEQETIEEIQVVDTEKHVEVPLNLICSKRLEFKPVDDLVRHTDLPLDLSTKS-
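
The transcript and protein lengths in this record are consistent with each other: 398 codons plus one stop codon give 3 x (398 + 1) = 1197 bp. The sketch below shows one way to check that a coding sequence translates to the listed protein, assuming the standard genetic code and assuming the trailing "-" in the protein sequence marks the stop codon. The names `CODON_TABLE` and `translate` are illustrative and are not part of the database; only the first and last codons of DPOGS213104-TA are used here to keep the example short.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), written in the usual
# compact form: codons ordered with bases T, C, A, G, third position fastest.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon (hypothetical helper).

    Stop codons are written as stop_symbol; "-" is assumed here to match
    the convention used in the protein sequence of this record.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    return protein.replace("*", stop_symbol)


# First eight codons and last six codons of DPOGS213104-TA, copied from above.
first_codons = "ATGTTAGTGTTTCCGCATTTGAGG"
last_codons = "TTAAGTACTAAGAGTTAA"

print(translate(first_codons))
print(translate(last_codons))

# Length check: 398 residues plus one stop codon should span 1197 bp.
print(3 * (398 + 1) == 1197)
```

Passing the full nucleotide sequence to `translate` and comparing the result with the DPOGS213104-PA sequence would extend the same check to the whole record.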