Monarch geneset OGS2.0

DPOGS206628
Transcript | DPOGS206628-TA | 972 bp
Protein | DPOGS206628-PA | 323 aa
Genomic position | DPSCF300048 - 729779-739602
RNAseq coverage | 37x (Rank: top 73%)
Annotation
Heliconius | HMEL012305 | 1e-151 | 83.38%
Bombyx | BGIBMGA010706-TA | 3e-22 | 53.04%
Drosophila | vnd-PA | 9e-40 | 71.05%
EBI UniRef50 | UniRef50_D2A2S4 | 9e-48 | 48.79% | Putative uncharacterized protein GLEAN_07014 n=1 Tax=Tribolium castaneum RepID=D2A2S4_TRICA
NCBI RefSeq | XP_967738.1 | 2e-48 | 48.79% | PREDICTED: ventral nervous system defective [Tribolium castaneum]
NCBI nr blastp | gi|91081857 | 3e-47 | 48.79% | PREDICTED: ventral nervous system defective [Tribolium castaneum]
NCBI nr blastx | gi|91081857 | 2e-51 | 48.10% | PREDICTED: ventral nervous system defective [Tribolium castaneum]
Group
Gene Ontology | GO:0003677 | 1e-21 | DNA binding
              | GO:0006355 | 1e-21 | regulation of transcription, DNA-dependent
              | GO:0043565 | 1.7e-21 | sequence-specific DNA binding
              | GO:0003700 | 1.7e-21 | sequence-specific DNA binding transcription factor activity
              | GO:0005515 | 3.1e-21 | protein binding
KEGG pathway | tca:656094 | 5e-48
             | K08029 (NKX2-2) | maps-> Maturity onset diabetes of the young
InterPro domain | [176-254] | IPR012287 | 1e-21 | Homeodomain-related
                | [196-258] | IPR001356 | 1.7e-21 | Homeobox
                | [195-269] | IPR009057 | 3.1e-21 | Homeodomain-like
Orthology group | MCL17511 | Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS206628-TA
ATGAAAATGATTGGCGACGGTGAAATGGGTTATGGGGACTATTATAATTATGAAAATAACTGGTCGGTGGTGCCAACGGATTATGGGGCTGAATCGTGTCAGTACAGACAGCCACAAGTCGAAGACGACTACCGATATGGAACTTACGCATTGCTTGACAGCATGAAAATGCCCGGTCAGCGCCCCGGTTTCCAAATATCCGATATTCTCGGCTTAAATGAGGCTAAAGGATTGGAACCACCTCCTCAAGGAGGACTAAGCGGCCTGGAGTTACCTCCGTATGCTCCTCCGCATCACAACTACCCCCATGAACTACTCCGACATCATCAGCCTTGGTTATCATTAGATCAGCACGATGGCACAGGTATGCTTGGACAGCAGGCGAGTCCTGACAGTACTTCCAGAGCATCTGAATTGTCATACGTTGGTCCGTCAGCAGCTTCTCCCACAGTGACTGACCCGCGCCACGACCACGACCTGGAACAGGAACATGATCATGACATCCACGATCACAGCCTGGAGTTAGACGACGATAACGACAACGATCAACCTAACACAGCCTCCGAGTCAAATCTATCGCACAAGAAACGCAAACGCAGAGTACTCTTCTCCAAAGCCCAAACATACGAGTTGGAACGACGTTTCAGACAACAGAGATACCTCTCAGCGCCTGAGCGGGAACACTTGGCTAGTTTAATACGCCTGACGCCGACGCAAGTAAAAATCTGGTTTCAGAACCACAGATACAAGACAAAACGTGCAGTTCAAGAGAAAGGCGCCCATGATTTGAACGTGGGCGGTCTTAATTCACCGCGCCGTGTGGCGGTGCCGGTGTTGGTGAAGGACGGTAGGCCGTGTATCGGAAAGCCCGACGGCCTACCACCGTTGGGGATGACGCTGCCACCGTACCAGCCGATGCATCACCAGCCCCCCGTCACGGGTCACGGACCTCAACCAGGTTGTTGGTGGTGA

Protein sequence:

>DPOGS206628-PA
MKMIGDGEMGYGDYYNYENNWSVVPTDYGAESCQYRQPQVEDDYRYGTYALLDSMKMPGQRPGFQISDILGLNEAKGLEPPPQGGLSGLELPPYAPPHHNYPHELLRHHQPWLSLDQHDGTGMLGQQASPDSTSRASELSYVGPSAASPTVTDPRHDHDLEQEHDHDIHDHSLELDDDNDNDQPNTASESNLSHKKRKRRVLFSKAQTYELERRFRQQRYLSAPEREHLASLIRLTPTQVKIWFQNHRYKTKRAVQEKGAHDLNVGGLNSPRRVAVPVLVKDGRPCIGKPDGLPPLGMTLPPYQPMHHQPPVTGHGPQPGCWW-
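
Consistency check (illustrative): the 972 bp transcript corresponds to 324 codons, that is, 323 amino acids plus one stop codon, which agrees with the 323 aa protein listed above. The following is a minimal Python sketch of how the nucleotide record could be translated and compared with the protein record. It assumes the standard genetic code and that the transcript is a complete coding sequence in frame 1; the function name `translate` and the use of "-" for the stop codon (chosen to match the protein record) are conventions of this sketch, not part of the database.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 1 (hypothetical helper).

    Stop codons are written as `stop_symbol`; codons containing
    characters other than A, C, G, T are written as "X".
    """
    cds = cds.upper().replace("\n", "").replace(" ", "")
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


# First seven and last four codons of DPOGS206628-TA, copied from the record above.
print(translate("ATGAAAATGATTGGCGACGGT"))
print(translate("TGTTGGTGGTGA"))
```

Passing the full DPOGS206628-TA sequence to `translate` and comparing the result with the DPOGS206628-PA record would complete the check.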