Monarch geneset OGS2.0

DPOGS208615
Transcript: DPOGS208615-TA; 1500 bp
Protein: DPOGS208615-PA; 499 aa
Genomic position: DPSCF300052 + 554251-563969
RNAseq coverage: 485x (Rank: top 26%)
Annotation
Heliconius: HMEL016588; 1e-110; 87.45%
Bombyx: BGIBMGA005720-TA; 1e-42; 81.74%
Drosophila: pan-PJ; 6e-48; 64.74%
EBI UniRef50: UniRef50_UPI00020627AC; 7e-74; 64.71%; UPI00020627AC related cluster n=3 Tax=unknown RepID=UPI00020627AC
NCBI RefSeq: XP_002059670.1; 7e-79; 61.57%; GJ21981 [Drosophila virilis]
NCBI nr blastp: gi|195402146; 1e-77; 61.57%; GJ21981 [Drosophila virilis]
NCBI nr blastx: gi|195402146; 5e-81; 61.07%; GJ21981 [Drosophila virilis]
Group
Gene Ontology: GO:0003677; 1.7e-28; DNA binding
               GO:0005515; 1.6e-21; protein binding
KEGG pathway: dre:30523; 1e-46
K04490 (TCF7L1) maps to:
    Basal cell carcinoma
    Colorectal cancer
    Prostate cancer
    Thyroid cancer
    Adherens junction
    Arrhythmogenic right ventricular cardiomyopathy (ARVC)
    Pathways in cancer
    Wnt signaling pathway
    Acute myeloid leukemia
    Endometrial cancer
    Melanogenesis
InterPro domain: [167-249] IPR000910; 1.7e-28; High mobility group, HMG1/HMG2
                 [167-236] IPR009071; 1.6e-21; High mobility group, superfamily
Orthology group: MCL11553; Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS208615-TA
ATGGGTCTACCAGCTGGAGGCAGCCTCTCCTTATTCTGTCCCGGAGGAGACTTGGGCCAACCTCCACCAGCACACATGGGCATCCCCCCTCCATACCAGCTCGATTCAAAATTAGCGGCCGGTCTGGCTCGTACGCCAATGTACACCTTCCCCGGGGGAACGTATCCCTATCCCATGCTCAGTCCCGAGATGTCGCAAGTGGCTTCCTGGCATACTCCCAGTATGTATCCGATATCATCAGCGGCCAGTGGATTCCGCAGTCCATACCCTACCACGCTGCCAATCAGCACATCTAGTCTGTCAAGCGAGCTGTACCGGTTTTCACCGACCTTGGGCGGGAGTCTGGGTCTGGGTTTGAGTCCAGCTCTGGTGCCGCCGCCCCCATCAAAGAGTGATCTCTTCACTCACCATTCGAGGTCACAGGACAAGGGCACGAGTAGTAGTAGCGTGTCTGATAAACAGTCGGACAGTTCTAATAGTAAGGAACAGAACAAGAAGCCGCACATCAAGAAGCCGTTGAATGCCTTCATGTTGTACATGAAGGAGATGAGGGCGAAGGTGGTCGCGGAGTGCACCCTCAAGGAGTCGGCGGCCATCAACCAGATACTGGGAAGACGGTGGCATTCCTTGAGCCGCGAGGAGCAGGCAAAGTACTATGAGAAAGCGAGACAGGAGAGACAGCTCCATATGCAGCTGTACCCGGGCTGGAGCGCCAGAGACAACTACGGTTACGGTTCCAAAAAGAAAAAGAGAAAAAAGGACCGCGGTCCAGCCGAACTGGGAGGTTCAAAATTGTCCAGGGGTACGCTCACTGCGGGCTGTTTTTCTAGCACCAGAGCTGTAGGGCGCGGTCGCGAACCACCGTCGAAGTGCATTTCTCTTTTGTTTCCATATTGCGGCGTGTTCGGTACACAGTGGCATGCGCTCGGACGCGAGGAGCAAGCCAAGTATTACGAGTTGGCGCGACGTGAGCGTCAGCTGCACATGCAGCTCTATCCCGATTGGTCGTCTAGGGCTAATACGCAAAGGGGCAAGAAGAGGAAACGAAAACAGGAGACAACCGATGGAGGGAATAATTTGAAGAAATGCCGTGCGAGGTATGGCCTGGACCAACAGAACCAGTGGTGTAAACCTTGCAGGCGGAAGAAAAAGTGCGTCCGTTACATGGAAGCCCTAGCGGCGGCTACGGGGACGGTCAGCCTGCCGATGGCGACTCCGCAGTCCCCGTGTTCCGATGAGGACGTGAAGCTGGAGGAGCTCTCGGACGCCTCCAGTGACGAGGCAGACACGCCTCTCACCTCCGCCTCGAGCCCGGGGGGCTTGAGCGCCCTGTCCTCTCTGGCGTCGCCGGCCTCTGACCCTCCACCGCCACCTCGCAACCCCGTGGGCACTAACCCCCGCGACGCCAATAACCCCCTCTCCGTGGGTCAGCTGACCTCCCAGTCGTGGCCCCGAGCCGCCCAGCCGAGGCCCGGACACGAGGTCATCTCCGTGTCATAG

Protein sequence:

>DPOGS208615-PA
MGLPAGGSLSLFCPGGDLGQPPPAHMGIPPPYQLDSKLAAGLARTPMYTFPGGTYPYPMLSPEMSQVASWHTPSMYPISSAASGFRSPYPTTLPISTSSLSSELYRFSPTLGGSLGLGLSPALVPPPPSKSDLFTHHSRSQDKGTSSSSVSDKQSDSSNSKEQNKKPHIKKPLNAFMLYMKEMRAKVVAECTLKESAAINQILGRRWHSLSREEQAKYYEKARQERQLHMQLYPGWSARDNYGYGSKKKKRKKDRGPAELGGSKLSRGTLTAGCFSSTRAVGRGREPPSKCISLLFPYCGVFGTQWHALGREEQAKYYELARRERQLHMQLYPDWSSRANTQRGKKRKRKQETTDGGNNLKKCRARYGLDQQNQWCKPCRRKKKCVRYMEALAAATGTVSLPMATPQSPCSDEDVKLEELSDASSDEADTPLTSASSPGGLSALSSLASPASDPPPPPRNPVGTNPRDANNPLSVGQLTSQSWPRAAQPRPGHEVISVS-
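
The transcript and protein records above can be cross-checked against each other: 1500 bp corresponds to 499 codons plus one stop codon, and the protein record ends in "-" where the transcript ends in TAG. The following is a minimal sketch of that check, assuming the standard genetic code (NCBI translation table 1) applies to this nuclear gene. The function name `translate` and the two short test fragments (the first and last 21 bases of DPOGS208615-TA) are illustrative choices, not part of the database.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with bases ordered T, C, A, G.
# Assumption: this table is appropriate for the record; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence codon by codon, ending at the first stop codon.

    The stop codon is written as stop_symbol, matching the trailing "-"
    used in the protein record above.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            protein.append(stop_symbol)
            break
        protein.append(aa)
    return "".join(protein)


# First and last 21 bases of DPOGS208615-TA, copied from the nucleotide record.
cds_start = "ATGGGTCTACCAGCTGGAGGC"
cds_end = "GAGGTCATCTCCGTGTCATAG"

print(translate(cds_start))  # MGLPAGG
print(translate(cds_end))    # EVISVS-
```

Passing the full DPOGS208615-TA sequence to `translate` in the same way should reproduce the DPOGS208615-PA record, including the trailing "-".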