Monarch geneset OGS2.0

DPOGS212323
Transcript        DPOGS212323-TA  675 bp
Protein           DPOGS212323-PA  224 aa
Genomic position  DPSCF300019 - 940352-941303
RNAseq coverage   111x (Rank: top 59%)
Annotation
Heliconius        HMEL013370        1e-79  83.59%
Bombyx            BGIBMGA012004-TA  4e-56  61.72%
Drosophila        dm-PA             1e-20  59.09%
EBI UniRef50      UniRef50_F8WT28   7e-54  61.72%  C-myc n=2 Tax=Obtectomera RepID=F8WT28_BOMMO
NCBI RefSeq       XP_001853504.1    2e-23  55.21%  conserved hypothetical protein [Culex quinquefasciatus]
NCBI nr blastp    gi|379698932      2e-53  61.72%  c-myc [Bombyx mori]
NCBI nr blastx    gi|379698932      4e-62  66.50%  c-myc [Bombyx mori]
Group
Gene Ontology     GO:0005634  9.4e-11  nucleus
                  GO:0006355  9.4e-11  regulation of transcription, DNA-dependent
                  GO:0003700  5.8e-05  sequence-specific DNA binding transcription factor activity
KEGG pathway      rno:24577  1e-08
                  K04377 (MYC) maps->
                      Colorectal cancer
                      Thyroid cancer
                      MAPK signaling pathway
                      Small cell lung cancer
                      Pathways in cancer
                      Wnt signaling pathway
                      Acute myeloid leukemia
                      Endometrial cancer
                      Bladder cancer
                      TGF-beta signaling pathway
                      ErbB signaling pathway
                      Cell cycle
                      Jak-STAT signaling pathway
                      Chronic myeloid leukemia
InterPro domain   [127-207]  IPR011598  9.4e-11  Helix-loop-helix DNA-binding
                  [138-189]  IPR001092  4.2e-09  Helix-loop-helix DNA-binding domain
Orthology group   MCL26771  Patchy

Nucleotide sequence:

>DPOGS212323-TA
ATGTCGATAGACGCTGCGGTAGTACGACGACGGGGTCGACGCCCTCCCCCCTTCCCCCTCATACCCCCACCCTCTCACCCACCGCCCCCCGACTACTGCAATCGCATCCCACCACGGGTACACGAGGTCCAGGAGGAAATTGATGTGGTATCGTTGGGGACATCCCAGCAGCCGCTGGCCTCAACGTCGCCCGTGAGAGTCGACTCGCTGCCCCGAGCACCTTCCGTTCAGGAGCGTCACCACATACAGCGCACTGTGGAGAACGTTATAACGAGACCCCCAGCCAGGAAACGGCTGGTGCCACCTACCAGCATACCCACGGCCTCCGTGCCACGGCGGAGGAGAGGTCCCGGCAGACGGGGGCGCTCCAACACAGACACGGACTCGGAGGCCGAGTCTCCGGAGATCGAACGCAGATCCATACACAATGATATGGAACGGTTGAGACGCATCGGCCTAAAGAACCTCTTCGACGAGCTCAAGAAGCAAATACCCGCAACCAGAGATAAGGAGCGAGCGCCGAAGGTCGTGATACTGCGGGAGGCGGCGGCCCTGTGCCGCAAGCTGCGAGAGGAGGACCTCGAGCGGGAGTACCTTAAAAAACAACAGAATAAATTGATGACCAAACTGAAAAAAATGCGTACGATGCTCGCGGCACGATACCGGCCGTATTGA
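
The lengths listed in this record (675 bp transcript, 224 aa protein) can be cross-checked by translating the transcript. The Python sketch below is a minimal illustration, assuming that DPOGS212323-TA is a complete coding sequence read in frame from its first base and that the standard genetic code applies. The names `CODON_TABLE` and `translate` are hypothetical helpers, not part of the database. Only the first and last codons of the transcript are used here to keep the example short.

```python
from itertools import product

# Standard genetic code, laid out in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence; stop codons become '*'."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First six and last seven codons of DPOGS212323-TA, copied from the record.
head = "ATGTCGATAGACGCTGCG"
tail = "GCACGATACCGGCCGTATTGA"

print(translate(head))   # MSIDAA
print(translate(tail))   # ARYRPY*

# 675 bp = 225 codons; dropping the terminal stop codon leaves 224 residues.
print(675 // 3 - 1)      # 224
```

The record writes the terminal stop as `-`, whereas this sketch writes it as `*`; both mark the TGA codon at the end of the transcript.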

Protein sequence:

>DPOGS212323-PA
MSIDAAVVRRRGRRPPPFPLIPPPSHPPPPDYCNRIPPRVHEVQEEIDVVSLGTSQQPLASTSPVRVDSLPRAPSVQERHHIQRTVENVITRPPARKRLVPPTSIPTASVPRRRRGPGRRGRSNTDTDSEAESPEIERRSIHNDMERLRRIGLKNLFDELKKQIPATRDKERAPKVVILREAAALCRKLREEDLEREYLKKQQNKLMTKLKKMRTMLAARYRPY-
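
The InterPro coordinates in the annotation ([127-207] for IPR011598 and [138-189] for IPR001092) can be used to pull the helix-loop-helix region out of the protein sequence. The sketch below is a minimal Python illustration, assuming the coordinates are 1-based, inclusive residue positions on DPOGS212323-PA; the record itself does not state the convention. `PROTEIN` and `domain` are hypothetical names introduced for this example.

```python
# DPOGS212323-PA, copied from the record without the trailing '-' stop marker.
PROTEIN = (
    "MSIDAAVVRRRGRRPPPFPLIPPPSHPPPPDYCNRIPPRVHEVQEEIDVVSLGTSQQPLA"
    "STSPVRVDSLPRAPSVQERHHIQRTVENVITRPPARKRLVPPTSIPTASVPRRRRGPGRR"
    "GRSNTDTDSEAESPEIERRSIHNDMERLRRIGLKNLFDELKKQIPATRDKERAPKVVILR"
    "EAAALCRKLREEDLEREYLKKQQNKLMTKLKKMRTMLAARYRPY"
)


def domain(seq: str, start: int, end: int) -> str:
    """Return residues start..end, assuming 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


hlh_outer = domain(PROTEIN, 127, 207)   # IPR011598 span
hlh_inner = domain(PROTEIN, 138, 189)   # IPR001092 span

print(len(PROTEIN), len(hlh_outer), len(hlh_inner))   # 224 81 52
print(hlh_inner)
```

Under this assumption the shorter span lies entirely inside the longer one, which is consistent with both entries describing the same helix-loop-helix region near the C-terminus.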