Monarch geneset OGS2.0

DPOGS206411
Transcript        DPOGS206411-TA   1278 bp
Protein           DPOGS206411-PA   425 aa
Genomic position  DPSCF300181 - 204556-223231
RNAseq coverage   56x (Rank: top 69%)
Annotation
Heliconius        HMEL006896         2e-70    97.16%
Bombyx            BGIBMGA013872-TA   2e-99    76.43%
Drosophila        tup-PA             4e-111   61.67%
EBI UniRef50      UniRef50_P92031    6e-108   61.11%   LIM homeobox protein n=36 Tax=Coelomata RepID=P92031_DROME
NCBI RefSeq       XP_319393.4        8e-134   56.68%   AGAP010209-PA [Anopheles gambiae str. PEST]
NCBI nr blastp    gi|158299279       1e-132   56.68%   AGAP010209-PA [Anopheles gambiae str. PEST]
NCBI nr blastx    gi|157119485       2e-141   61.81%   insulinprotein enhancer protein isl [Aedes aegypti]
Group
Gene Ontology     GO:0008270   7.7e-22   zinc ion binding
                  GO:0006355   1.2e-19   regulation of transcription, DNA-dependent
                  GO:0043565   1.2e-19   sequence-specific DNA binding
                  GO:0003700   1.2e-19   sequence-specific DNA binding transcription factor activity
                  GO:0005515   2e-18     protein binding
                  GO:0003677   1.2e-17   DNA binding
KEGG pathway 
InterPro domain   [8-107]     IPR001781   7.7e-22   Zinc finger, LIM-type
                  [258-320]   IPR001356   1.2e-19   Homeobox
                  [256-331]   IPR009057   2e-18     Homeodomain-like
                  [231-316]   IPR012287   1.2e-17   Homeodomain-related
Orthology group   MCL15245   Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS206411-TA
ATGATAGAAAAGCTCCGGCTGTCTCTGTGTGTGGGCTGCGGCGGCCAGATCCACGACCAGTACATCCTGAGGGTGGCCCCGGATTTGGAGTGGCACGCCGCATGCCTCAAGTGTCAGGAGTGCAGGCAGTTCCTCGACGAGTCCTGCACATGCTTTGTCAGGGATGGAAAGACTTATTGCAAGAGGGATTATACCAGATTATTCGGGACCAAGTGTGATAAATGCGGTTCATCATTCAGCAAGAACGACTTCGTGATGAGAGCAAAGACGAAGATATATCATATAGACTGCTTCAGATGCTGCGCTTGCGCACGACAACTTATACCCGTCATTCTGGACTACGCAACTGTTTCAAGATTCGATTCCCGGCCGGGTATAGCCGTAAATCTAGAAATAAAATTATTCGGGACCAAGTGTGATAAATGCGGTTCATCATTCAGCAAGAACGACTTCGTGATGAGAGCAAAGACGAAGATATATCATATAGACTGCTTCAGATGCTGCGCTTGCGCACGACAACTTATACCCGGTGACGAGTTCGCGTTGAGAGAAGGCGGAGCTTTATATTGTAGAGAAGATCACGATGTATTAGAAAAGAGCGCTAACACAAGCGGCAGCAGCGCCGGCAACGCCGAGAGCAACAACAACACAACACTCAGCAACAACAATTCGCATCACCCGCACGAGTTAGGATCTATGTCGGATTCAGGAAGTGAGTCTGGCTCGCATAAGAGTGGAAGAGCCAGGGCTGGCGCTGCGGCTGATGGTAAACCCACCAGGGTGAGGACTGTCCTCAATGAGAAACAATTACACACACTAAGAACCTGTTATGCTGCGAATCCTAGACCTGACGCTCTCATGAAGGAACAGCTGGTTGAAATGACAGGTCTTAGTCCTCGAGTGATAAGAGTGTGGTTCCAGAACAAGAGATGCAAAGACAAGAAGAAGACTATACAGCTGAAGATGCAGATGCAGCAAGAGAAGGAAGGCCGCCGTTTGGGCTATATGTCTATGGGAGTGCCGTTAGTGGCCGGTTCGCCTGTAAGACATGAGGCTGGGTCTCTAGCTCTAGAGGTGACGGCGTATCAGCCGCCGTGGAAGGCCCTCAGCGACTTCGCACTCCACGCGGACCTTGACAGGCCTCAACACAGCGCCGCCTTCCAACAGCTCGTGAACCAGATGCACGGTTACGACATCCCCTCTCTGCCCCCTCCACGTCACGAGGACAACTACGTCACCTATCTCGAGAGTGACGACAGTCTGCCGCCGTCACCCTAG

Protein sequence:

>DPOGS206411-PA
MIEKLRLSLCVGCGGQIHDQYILRVAPDLEWHAACLKCQECRQFLDESCTCFVRDGKTYCKRDYTRLFGTKCDKCGSSFSKNDFVMRAKTKIYHIDCFRCCACARQLIPVILDYATVSRFDSRPGIAVNLEIKLFGTKCDKCGSSFSKNDFVMRAKTKIYHIDCFRCCACARQLIPGDEFALREGGALYCREDHDVLEKSANTSGSSAGNAESNNNTTLSNNNSHHPHELGSMSDSGSESGSHKSGRARAGAAADGKPTRVRTVLNEKQLHTLRTCYAANPRPDALMKEQLVEMTGLSPRVIRVWFQNKRCKDKKKTIQLKMQMQQEKEGRRLGYMSMGVPLVAGSPVRHEAGSLALEVTAYQPPWKALSDFALHADLDRPQHSAAFQQLVNQMHGYDIPSLPPPRHEDNYVTYLESDDSLPPSP-
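
The transcript and protein lengths listed in this record can be cross-checked with simple codon arithmetic: a 1278 bp coding sequence holds 426 codons, which is 425 residues plus one stop codon (shown as the trailing "-" in the protein sequence). The sketch below is a minimal illustration, assuming the transcript is coding sequence only and ends in a stop codon. The function names expected_protein_length and domain_within_protein are hypothetical helpers, not part of any MonarchBase tool.

```python
def expected_protein_length(cds_bp: int, has_stop_codon: bool = True) -> int:
    """Return the residue count implied by a coding sequence length in bp.

    Assumes the sequence is CDS only (no UTRs); that is an assumption
    about this record, not something the record states.
    """
    if cds_bp <= 0 or cds_bp % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    codons = cds_bp // 3
    # The stop codon is counted in the bp length but encodes no residue.
    return codons - 1 if has_stop_codon else codons


def domain_within_protein(start: int, end: int, protein_len: int) -> bool:
    """Check that a 1-based, inclusive domain range fits inside the protein."""
    return 1 <= start <= end <= protein_len


# Lengths and domain ranges copied from the record above.
protein_len = expected_protein_length(1278)
print(protein_len)

interpro_ranges = [(8, 107), (258, 320), (256, 331), (231, 316)]
print(all(domain_within_protein(s, e, protein_len) for s, e in interpro_ranges))
```

With the values from this record, the first call gives 425, matching the listed protein length, and all four InterPro ranges fall inside the protein.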