Monarch geneset OGS2.0

DPOGS213556
Transcript        DPOGS213556-TA  897 bp
Protein           DPOGS213556-PA  298 aa
Genomic position  DPSCF300033 - 152189-154105
RNAseq coverage   237x (Rank: top 43%)
Annotation
Heliconius        HMEL021711        1e-61   63.78%
Bombyx            BGIBMGA011837-TA  2e-107  71.00%
Drosophila        CG11294-PA        4e-51   94.23%
EBI UniRef50      UniRef50_B4L3B2   1e-50   50.39%  GI15515 n=5 Tax=Endopterygota RepID=B4L3B2_DROMO
NCBI RefSeq       XP_001991987.1    2e-54   50.77%  GH24457 [Drosophila grimshawi]
NCBI nr blastp    gi|195045512      3e-53   50.77%  GH24457 [Drosophila grimshawi]
NCBI nr blastx    gi|195045512      1e-56   51.81%  GH24457 [Drosophila grimshawi]
Group
Gene Ontology     GO:0003677  9e-26    DNA binding
                  GO:0006355  9e-26    regulation of transcription, DNA-dependent
                  GO:0043565  1.5e-25  sequence-specific DNA binding
                  GO:0003700  1.5e-25  sequence-specific DNA binding transcription factor activity
                  GO:0005515  1.1e-23  protein binding
KEGG pathway      nvi:100119475  3e-45
                  K12373 (HEX) maps->
                      Lysosome
                      Glycosaminoglycan degradation
                      Amino sugar and nucleotide sugar metabolism
                      Glycosphingolipid biosynthesis - globo series
                      Other glycan degradation
                      Glycosphingolipid biosynthesis - ganglio series
InterPro domain   [21-87]  IPR012287  9e-26    Homeodomain-related
                  [24-86]  IPR001356  1.5e-25  Homeobox
                  [15-93]  IPR009057  1.1e-23  Homeodomain-like
Orthology group   MCL23640  Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS213556-TA
ATGGCGGGACCGTATTTGTCACCGTTGCCGGGGGACCTTCTGACCGAGTACATGTTCGGCAGACGACGGCAGAGACGGAACAGGACGACTTTCACTCCGCAGCAGTTGAGCGAGCTGGAGTCATTATTCCAGAAGACTCATTACCCTGATGTGTTCCTCAGGGAAGAAGTCGCGTTGAGAATCAGTCTATCTGAAGCGAGAGTGCAAGTCTGGTTTCAAAATCGACGAGCTAAATGGCGTAAGCAAGCTCGCCTCCAGCTTTTGCAGGACGCCTGGAGAATGAGGTGTTTGGGGCTCGGGACACCTCCTCTTGGACTGCCAGGTCACACAAAACCACCTGAATCGTCACCAGAAGGAGGGGAATCGCCGCCACCCAAAGACAATTTAGAAACACAGTCTAATGAAAACAGTGCTGGTATGAAATCACCCCAACAGGAGACTTACCACCATAACGGAAATATGTCCGACAATCTACCCCTGCCCCATGACATACCACCCCCCATGCCGTACGGCAACGACGACAGGATACTTCAACAGCCACCGTTCCCGTTCCCGCCATTACTGATGCCCCAGCAGATGGACAACAACGAAATAGCGCCAACTGACTTGAGAATGGCGAAGAGCAATCCCTGTCATTGCACAGGAGAAGAAATAAATGAGCCGGAAAACCTAGCATACCGGAAACGAAGCGAGGACGTCAGTGACAGCGAAAGCGATACGGAAATAGATCTCACGTCACACTCCAAAGAGGATGATTTAAGAGAATCAGAAAGACTCGCAAAGAGGATAGCAACAAATATGTTCAGAAGTGACACCGTGCTGCACGACCTGTTACCTAACGATCTGACGATGGCCGTTAAGAATAACGATAGGAATGGTAGAAATGTAACGATGTAA

Protein sequence:

>DPOGS213556-PA
MAGPYLSPLPGDLLTEYMFGRRRQRRNRTTFTPQQLSELESLFQKTHYPDVFLREEVALRISLSEARVQVWFQNRRAKWRKQARLQLLQDAWRMRCLGLGTPPLGLPGHTKPPESSPEGGESPPPKDNLETQSNENSAGMKSPQQETYHHNGNMSDNLPLPHDIPPPMPYGNDDRILQQPPFPFPPLLMPQQMDNNEIAPTDLRMAKSNPCHCTGEEINEPENLAYRKRSEDVSDSESDTEIDLTSHSKEDDLRESERLAKRIATNMFRSDTVLHDLLPNDLTMAVKNNDRNGRNVTM-
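
The record lists the transcript at 897 bp and the protein at 298 aa. These agree if the transcript is read as 299 codons, the last being the stop. The following is a minimal sketch of that check, not part of the original record. It assumes the standard genetic code (NCBI translation table 1) and that the listed transcript is coding sequence only, with no untranslated regions. The names `CODON_TABLE`, `translate`, and `expected_protein_length` are hypothetical helpers introduced for illustration. The sketch writes the stop codon as `*`, whereas the protein sequence above ends in `-`.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
# Assumption: the record uses the standard code; this is not stated in the source.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; stop codons become '*'."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def expected_protein_length(transcript_bp: int) -> int:
    """Residue count for a CDS of the given length, excluding the stop codon.

    Assumption: the whole transcript is coding sequence ending in one stop codon.
    """
    return transcript_bp // 3 - 1


# First 30 nt of DPOGS213556-TA, copied from the nucleotide sequence above.
first_codons = "ATGGCGGGACCGTATTTGTCACCGTTGCCG"
print(translate(first_codons))
print(expected_protein_length(897))
```

Run as written, this prints `MAGPYLSPLP`, which matches the first ten residues of DPOGS213556-PA, and then `298`, which matches the listed protein length.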