Monarch geneset OGS2.0

DPOGS211229
Transcript: DPOGS211229-TA  1149 bp
Protein: DPOGS211229-PA  382 aa
Genomic position: DPSCF300007 + 1400287-1408098
RNAseq coverage: 30x (Rank: top 76%)
Annotation
Heliconius: HMEL009385  0.0  93.94%
Bombyx: BGIBMGA003208-TA  9e-143  88.26%
Drosophila: Lim3-PB  3e-133  66.02%
EBI UniRef50: UniRef50_G6DG54  0.0  100.00%  Lim homeobox protein n=26 Tax=Eumetazoa RepID=G6DG54_DANPL
NCBI RefSeq: XP_001863808.1  3e-150  71.39%  lim homeobox protein [Culex quinquefasciatus]
NCBI nr blastp: gi|312374311  2e-149  70.70%  hypothetical protein AND_16082 [Anopheles darlingi]
NCBI nr blastx: gi|170055943  4e-150  71.39%  lim homeobox protein [Culex quinquefasciatus]
Group
Gene Ontology:
  GO:0006355  9.1e-23  regulation of transcription, DNA-dependent
  GO:0043565  9.1e-23  sequence-specific DNA binding
  GO:0003700  9.1e-23  sequence-specific DNA binding transcription factor activity
  GO:0005515  1.7e-20  protein binding
  GO:0003677  1.8e-20  DNA binding
  GO:0008270  1.4e-17  zinc ion binding
KEGG pathway 
InterPro domain:
  [168-230]  IPR001356  9.1e-23  Homeobox
  [152-227]  IPR009057  1.7e-20  Homeodomain-like
  [156-228]  IPR012287  1.8e-20  Homeodomain-related
  [96-153]  IPR001781  1.4e-17  Zinc finger, LIM-type
Orthology group: MCL10910  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS211229-TA
ATGCTGGGCTCCATGATGTACCCCGGCGCGGAAGACGAGCTCGACATGCGCGTACCACCCATCCAACTAGAACACCTACCTGAAGTGTTCCTATCCAGCATACCAAAATGTGGAGGCTGTCATGAGATGATAGTAGATCGATACGTGCTAAAAGTGTCAGACAGGACTTGGCACGCTGGCTGTTTGAGGTGTGTCGAGTGTCGAGCCATGCTGTCAGGAAAGTGCTTCGCTAGAAATAACCAGCTCTACTGCACCGAAGACTTCTTCAAGCGTTACGGCACTAAGTGCGCGGGCTGCGGGCAGGGCATCCCGCCGACACAGGTCGTACGGCGGGCCCAGGCTCACGTTTACCATCTACGGTGTTTCGCATGTGCTGCCTGTGCACGCACACTTAATACCGGAGACGAGTTTTATCTAATGGAGGATGGCAAGCTTGTTTGCAAACCGGACTATGAAGCTGCGAGAGCGAAAGGTGAAGGTTCGTTGGATGGAGACGCAGCTAGCAAGAGACCGCGAACTACAATTACTGCAAAACAGCTTGAAACTCTTAAAAGTGCTTATAGCAGCAGTCCAAAGCCAGCTCGCCACGTGAGGGAACAGCTTGCTCAAGATACAGGCTTAGATATGAGGGTCGTTCAAGTTTGGTTTCAAAATCGGAGGGCAAAAGAGAAGCGACTAAAGAAGGACGCGGGTCGAACAAGATGGTCACAATACTTCAGATCTATGAAGGGCGGAGGAAGTGGCTCACCGCGTCACGATCGTCTTCTGGATAAAGACGAACTGAAAATAGATTTAGACTCTTTCAGTCATCATGAGCTAAGCAACGATAGTTATAGCACTGCTGCGCTGGGTGGTGAGGAGGGATCACCAGCCGGCGGCGCGGCGGGAGGGACTCGGTTTGGTACTACGCCACCATATCTTCGTGCTCATTCACCCCCCCACGCACATTATCATTATCCCCCCGATCACCTCGTATATACCAATATCGGTCAATCAATGGGTGGCGCTGGTTTAGGCGTGGGGGCGGGGGGTGCGTCTGATATGAGCAGCTCTTCTTCTCCTGCGGCTGGTGGTTACCCTGACTTTCCTCCGTCTCCGGACTCTTGGCTCGGCGAACCACATCACTATTCACCCCGAGGCTACCCCTAG

Protein sequence:

>DPOGS211229-PA
MLGSMMYPGAEDELDMRVPPIQLEHLPEVFLSSIPKCGGCHEMIVDRYVLKVSDRTWHAGCLRCVECRAMLSGKCFARNNQLYCTEDFFKRYGTKCAGCGQGIPPTQVVRRAQAHVYHLRCFACAACARTLNTGDEFYLMEDGKLVCKPDYEAARAKGEGSLDGDAASKRPRTTITAKQLETLKSAYSSSPKPARHVREQLAQDTGLDMRVVQVWFQNRRAKEKRLKKDAGRTRWSQYFRSMKGGGSGSPRHDRLLDKDELKIDLDSFSHHELSNDSYSTAALGGEEGSPAGGAAGGTRFGTTPPYLRAHSPPHAHYHYPPDHLVYTNIGQSMGGAGLGVGAGGASDMSSSSSPAAGGYPDFPPSPDSWLGEPHHYSPRGYP-
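
Length check:

The stated transcript length (1149 bp) and protein length (382 aa) can be cross-checked with a little arithmetic. The sketch below assumes the transcript is coding sequence only (no UTRs) and ends in a stop codon, which fits the ATG start and TAG end of the nucleotide sequence above but is not stated by the record itself. The function name `cds_length_consistent` is hypothetical and chosen for illustration.

```python
def cds_length_consistent(transcript_bp: int, protein_aa: int,
                          has_stop_codon: bool = True) -> bool:
    """Return True if a CDS of transcript_bp bases can encode protein_aa residues.

    Assumption: the transcript contains only coding sequence (no UTRs).
    """
    # One codon per residue, plus one for the stop codon if it is included.
    codons = protein_aa + (1 if has_stop_codon else 0)
    return transcript_bp == 3 * codons


# Lengths as given in the Transcript and Protein fields of this record.
print(cds_length_consistent(1149, 382))  # True: 3 * (382 + 1) == 1149
```

This compares only the reported lengths. It does not translate the sequence or verify the residues themselves.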