Monarch geneset OGS2.0

DPOGS200422
Transcript: DPOGS200422-TA (858 bp)
Protein: DPOGS200422-PA (285 aa)
Genomic position: DPSCF300236 - 137373-149513
RNAseq coverage: 216x (Rank: top 45%)
Annotation
Heliconius      HMEL002495        2e-112  94.21%
Bombyx          BGIBMGA008996-TA  4e-54   48.80%
Drosophila      CG10348-PB        2e-50   65.75%
EBI UniRef50    UniRef50_D6WWQ2   9e-53   46.08%  Putative uncharacterized protein n=2 Tax=Tribolium castaneum RepID=D6WWQ2_TRICA
NCBI RefSeq     XP_001815516.1    1e-53   46.08%  PREDICTED: similar to CG10348 CG10348-PA [Tribolium castaneum]
NCBI nr blastp  gi|189240185      2e-52   46.08%  PREDICTED: similar to CG10348 CG10348-PA [Tribolium castaneum]
NCBI nr blastx  gi|270012296      9e-57   47.08%  hypothetical protein TcasGA2_TC006419 [Tribolium castaneum]
Group
Gene Ontology:
    GO:0003676  1.3e-12  nucleic acid binding
    GO:0008270  1.7e-08  zinc ion binding
    GO:0005622  1.7e-08  intracellular
KEGG pathway: spu:593671  6e-50
    K04462 (EVI1) maps to:
        Pathways in cancer
        MAPK signaling pathway
        Chronic myeloid leukemia
InterPro domain:
    [161-188]  IPR013087  1.3e-12  Zinc finger, C2H2-type/integrase, DNA-binding
    [162-184]  IPR007087  1.7e-08  Zinc finger, C2H2
Orthology group: MCL34354 (Lepidoptera specific)
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200422-TA
ATGGGTGGTAACAAGCACATCCCTTTGAATAAGTGCACATACGGTAGTGATGCTGTACGTAAAACAGGTCGGTGGCGCACGAGCCTGATGAGTTCGTGCGCGGCGGCCGCCATGCGCGAAAAGCACGCGGCGCCGCGCCGCCTCGCCTCTCCTCCGACCTCGCCCCTCCAGCGACACCCTAGCCCTCTTGGTTCACCTCGCCCTGACGAGCCCCTTGACCTCAGAGTCACACACAAGCGACCGCAACGGCTGGAGGATGAGAACTGTAATCTCATTCCCTCACCGCCACCTCACCCTACTCATCCAGCACATCCAGCGCATCCAGCTCTGCTTCAGTTCTGCCGACGATTACCCTTGGCTCTACCGGCGTCTTTCGGCCGTTACCCCTTCTTACCCGCAGCAGCTGCGACCCTTCTAGCGCCAGGGGCGCCCAGAGCTCCTCCCGTTCCCCAAAATGCCGGAGTGAATAGAGCCCGGGATCGGTACACGTGCTCCTATTGTGGAAAGGCGTTCCCGCGAAGTGCTAATCTGACGAGACATTTACGGACACATACAGGCGAACAGCCTTATCGCTGCAAATATTGTGAACGATCATTCTCAATATCTTCGAACCTGCAAAGACACGTGAGGAATATTCATAACAAGGAACGGCCGTTTCGATGTCGTCACTGCGACCGCTGTTTCGGCCAGCAGACAAATCTGGATAGGCATCTCAAAAAGCACGAAGCCGAGAACGGGGACGATAATCGTCGCAGATCGCCAGAAGAGACCTACTTTGAAGAAATAAGGTCGTTCATGGGACGTGTGGCGCCAGCAAGGCGGACTGCCGCCGCCGCAAGTATCGCAGACCACACCTGA

Protein sequence:

>DPOGS200422-PA
MGGNKHIPLNKCTYGSDAVRKTGRWRTSLMSSCAAAAMREKHAAPRRLASPPTSPLQRHPSPLGSPRPDEPLDLRVTHKRPQRLEDENCNLIPSPPPHPTHPAHPAHPALLQFCRRLPLALPASFGRYPFLPAAAATLLAPGAPRAPPVPQNAGVNRARDRYTCSYCGKAFPRSANLTRHLRTHTGEQPYRCKYCERSFSISSNLQRHVRNIHNKERPFRCRHCDRCFGQQTNLDRHLKKHEAENGDDNRRRSPEETYFEEIRSFMGRVAPARRTAAAASIADHT-
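
Consistency check (illustrative):

The transcript and protein lengths above can be cross-checked against each other: 858 bp is 286 codons, which is 285 amino acids plus one stop codon. The sketch below is a minimal illustration of that check, assuming the standard genetic code (NCBI translation table 1) and writing the stop codon as "-", as in the protein record above. The names CODON_TABLE, translate, and expected_protein_length are hypothetical helpers, not part of any database tooling.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered
# with bases T, C, A, G; "*" marks a stop codon. This is an assumption:
# the record does not state which translation table was used.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon, in frame 1."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        amino = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if amino == "*" else amino)
    return "".join(residues)


def expected_protein_length(cds_length):
    """Amino acids encoded by a CDS that ends in a single stop codon."""
    return cds_length // 3 - 1


# First six and last four codons of DPOGS200422-TA, copied from the record.
print(translate("ATGGGTGGTAACAAGCAC"))   # start of the protein
print(translate("GACCACACCTGA"))         # end of the protein, with stop
print(expected_protein_length(858))      # length implied by 858 bp
```

Passing the full DPOGS200422-TA sequence to translate should reproduce the DPOGS200422-PA record if the assumed translation table matches the one used for the annotation.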