Monarch geneset OGS2.0

DPOGS209032
Transcript        DPOGS209032-TA  909 bp
Protein           DPOGS209032-PA  302 aa
Genomic position  DPSCF300102 - 212059-218632
RNAseq coverage   195x (Rank: top 48%)
Annotation
Heliconius        HMEL004647        3e-107  64.92%
Bombyx            BGIBMGA010033-TA  9e-38   46.95%
Drosophila        NK7.1-PA          7e-17   79.59%
EBI UniRef50      UniRef50_D6W700   2e-25   42.52%  Nk homeobox 7 n=1 Tax=Tribolium castaneum RepID=D6W700_TRICA
NCBI RefSeq       XP_001656642.1    1e-23   37.56%  hypothetical protein AaeL_AAEL003277 [Aedes aegypti]
NCBI nr blastp    gi|270015153      7e-25   42.52%  nk homeobox 7 [Tribolium castaneum]
NCBI nr blastx    gi|157135404      9e-26   39.45%  hypothetical protein AaeL_AAEL003277 [Aedes aegypti]
Group
Gene Ontology     GO:0003677  3.1e-12  DNA binding
                  GO:0006355  3.1e-12  regulation of transcription, DNA-dependent
                  GO:0005515  7.5e-11  protein binding
                  GO:0043565  2.7e-10  sequence-specific DNA binding
                  GO:0003700  2.7e-10  sequence-specific DNA binding transcription factor activity
KEGG pathway 
InterPro domain   [232-302]  IPR012287  3.1e-12  Homeodomain-related
                  [232-302]  IPR009057  7.5e-11  Homeodomain-like
                  [259-302]  IPR001356  2.7e-10  Homeobox
Orthology group   MCL20333  Insect specific

Nucleotide sequence:

>DPOGS209032-TA
ATGAGCGCGGAAAGAGACGCGGCCGAATGTGAACCAGCCACTGGAACCGCCAGCGGAGAGATGCAGGCCGTCGTCTCCGGCTATCTGACGCAGAGTCTCCTCATCACACACGGCGGACACTCCTACCCGGACTACAGCTACGGCGACGACGACAGCCTGACGCCGGAGGAGAAGATCACCGTAGACGACAAGGAAGAAGAACTGGTCATCGCCGACGATGATGAAGGCCGAATCGACGTGGAGGACACCGGACATGACGGCGGGCGGGCGGAGGACAGCGGGGAGCTGTCCTGGCATCCGCACGTATATGGAAAACCGCCCAAGAAACCGACTCCACATACCATAGAATACATATTAGGATTATCGAAGAGCAATGAAGCGAGGGAGGAGAAAAGGTCCGCGGTCAGCCAACTCATGAACGTCAAGAGGAACTTCGATAAGAAGACGTTCGGGTGTCAGGAAAAGGGGGTTCAGGTGCAAGAGGGGGGCGAGCGGAAGATGAGTGTACACAAGAATAGGCTTCAGGAGCAACTGCTGCAGAGAGGGGCGCGGACCGGAGACGGAGCATACGACAAGCTGCACGGGAAGACCGACGAGCCCCTCAACCTGTCCGTCAACAAGGCCAAGGACTCGCCCACGTGGACTGGCGACGACGAAGACAAATATGGCAGAGACCCCAAACTAAAACGAAAGAAGTCGACGGAGGACGTTCGGTCTCCTCTGGGGGGCGAGGGCTCCGTGACCAGCGAGGAGGGCGAGGAGGCCGGCAGGAGGAAGAAAGCCAGGACCACTTTCACCGGCAGACAGATCTTCGAGCTGGAGAAGCTGTTCGAGGTCAAGAAGTACCTCTCGTCCGGGGAACGAGCCGATATGGCTAAACTGCTGAACGTCACTGAGACTCAGGTATGA

Protein sequence:

>DPOGS209032-PA
MSAERDAAECEPATGTASGEMQAVVSGYLTQSLLITHGGHSYPDYSYGDDDSLTPEEKITVDDKEEELVIADDDEGRIDVEDTGHDGGRAEDSGELSWHPHVYGKPPKKPTPHTIEYILGLSKSNEAREEKRSAVSQLMNVKRNFDKKTFGCQEKGVQVQEGGERKMSVHKNRLQEQLLQRGARTGDGAYDKLHGKTDEPLNLSVNKAKDSPTWTGDDEDKYGRDPKLKRKKSTEDVRSPLGGEGSVTSEEGEEAGRRKKARTTFTGRQIFELEKLFEVKKYLSSGERADMAKLLNVTETQV-
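
The transcript and protein records above are consistent in length: 302 codons plus one stop codon give 3 x (302 + 1) = 909 bp. The sketch below shows one way to check that a coding sequence translates to the listed protein. It assumes the standard genetic code and that the trailing "-" in the protein record marks the stop codon; the function name translate and its stop_symbol parameter are illustrative and not part of the record.

```python
# Minimal sketch: translate a coding sequence with the standard genetic code.
# Assumption: the record writes the stop codon as "-" at the end of the protein.

BASES = "TCAG"
# Amino acids for all 64 codons, ordered T, C, A, G at each codon position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a DNA coding sequence, writing stop codons as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for pos in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3]]
        residues.append(stop_symbol if amino_acid == "*" else amino_acid)
    return "".join(residues)


# First 21 and last 12 nucleotides of DPOGS209032-TA, copied from the record.
print(translate("ATGAGCGCGGAAAGAGACGCG"))  # MSAERDA
print(translate("ACTCAGGTATGA"))           # TQV-
```

Passing the full DPOGS209032-TA sequence to translate and comparing the result with the DPOGS209032-PA record extends the same check to the whole gene.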