Monarch geneset OGS2.0

DPOGS216173
Transcript        DPOGS216173-TA  3357 bp
Protein           DPOGS216173-PA  1118 aa
Genomic position  DPSCF300155 + 284753-295879
RNAseq coverage   673x (Rank: top 19%)
Annotation
Heliconius      HMEL016561              3e-173  71.59%
Bombyx          BGIBMGA014159-TA        4e-169  58.16%
Drosophila      CG2247-PB               2e-20   24.75%
EBI UniRef50    UniRef50_UPI00022C9585  2e-104  50.86%  UPI00022C9585 related cluster n=1 Tax=unknown RepID=UPI00022C9585
NCBI RefSeq     XP_975618.1             5e-105  50.63%  PREDICTED: similar to conserved hypothetical protein [Tribolium castaneum]
NCBI nr blastp  gi|350408186            8e-104  50.86%  PREDICTED: hypothetical protein LOC100740617 [Bombus impatiens]
NCBI nr blastx  gi|270005259            4e-108  38.10%  hypothetical protein TcasGA2_TC007283 [Tribolium castaneum]
Group
KEGG pathway    dpe:Dper_GL26894  3e-17
 K03083 (GSK3B) maps-> Axon guidance
    Prostate cancer
    Alzheimer's disease
    B cell receptor signaling pathway
    Hedgehog signaling pathway
    Pathways in cancer
    Chemokine signaling pathway
    Endometrial cancer
    Insulin signaling pathway
    Neurotrophin signaling pathway
    T cell receptor signaling pathway
    Melanogenesis
    Focal adhesion
    ErbB signaling pathway
    Basal cell carcinoma
    Colorectal cancer
    Wnt signaling pathway
    Circadian rhythm - fly
    Cell cycle
InterPro domain  [684-724]  IPR022364  1.1e-08  F-box domain, Skp2-like
                 [445-513]  IPR011011  3.8e-07  Zinc finger, FYVE/PHD-type
Orthology group  MCL17437  Insect specific

Nucleotide sequence:

>DPOGS216173-TA
ATGTCGGAAGAAATGGCGGTGGGTGATAACGCATCAAATGAGGAGCATAAAGTCGCCAGCGAAGATGTGGGCTCGGAGGGAATGGATTCTGCTGACGCTACAGAAGTTCCAGCTGTATTAAAAACTGCTGAGACAGAAATTAAACCAGAAATAAATGAAAATGCCGAAGACTCGGAATTGCAAGCAGAAATGGAGAGTGAAGAAAATGAAATTGAGGAATATAAAGAAATTGAAGAAAAAATGCAAGAAATAAACCAAGAAACTCAAGACGCAAGTGATGAAGCTGCTAGAGGTGTGAAGCGTAGACTGTCGACAGCATTCAGCGACGATGGAGAGGAGTTTAAAGGCTTTGATCAGTGTGAACCGAGCAGCCTGGACGACTACAGCAAAACTGTTGAGGATGGGAGACGAGTTAAACCGAGAACCTCAAGTGATAGCGATGGTTTCAAAGGTTTTGACCTTATAGAGCATAACAATCTGACAGGCTACTGTCAGGTTTTAGAACGGCTGGAAGCTGAAGTGCTAGCAGCTGCAAAGACATATACTCCTGTTCGACGTGTGATGGCATCACTTCAAAGAAGGATCGATGCAAGCGGGTGTACCAAGTGGTCAGAACATCTTGAGGCTGAGGTTTCTGAAGTTGTGAGGAGCTTCTCCACCGACGATGACTCCAAGCTGAGAAAGCGGACCGAAGGATCGAGACCCTCGTCGGCTTTGTCATCAAGATCTGACGGTGATGGCGGTGTGTCAACTGATAAACCTCAGCCTCAAGATGTTCTGACCACTTGGTACACCCGCTTGCATCGATCCTTCTTTGAAGTGATGCCATCACACGTCGGACAGGAGGATGCGAAAGTCCAGCGTCGTGTTGAGTCTCCAGTGCCAGCAGTCACACCGACCTTAGAGTCCAAGAGAACTCCAAAGCCCAAACCTCCGAAGGGAGCCAGCCCCGAGCCTGGGATGAAGTCACCCCCTGGGAAGTTAAAGGTGTCGCAGGTGAAGTCAATGGGCTCCAGGTTGAGTAGCAACGGTGCACCAGCTGCGAAACAGGTGAAGAAGACCCCACAACAAGCGGTCACACATGACAACAACAACATGGCCGCCTGGAAGAAACCTAGGGGCCCGCCGACGGTGCAGCCGCGGCCGGTGGCAGGGCAGCCGGGGACGACGACACCGCCGGCGCAGCTCGCACAGCACACACAGCCCGCGAGCCTCGCACAAGCCGCTCATCACGTGCAGCCTGTGCAACCAGCCCAGCACATACAGCCTATCCACCCCGCCCCCGCACCCGCCCACGCCCCGCATCAGATATCCGTTCCCCGGCCGCAGCTCACACCGGACAGAGATGTACCGCCTCCCTTACACCATCAGAACCGTCAGATCCAACAGCCGTGTTCCATGACGTGCGGGACGGGTGTGCCGTCGTTGGCCTGCGAGGCCTGCCTCTGTTTGTACCACCCGGCCTGTGTGGGGCTCCGTCTGCCGCAGGACACGTTCCTGTGTAAGAACTGTCGTAAGACTTCATCTCCGCCGGTGGAGCCCCCGCCCCTCACCCACAAGTCGGGCGTGACGTCACTACCGGCCGGAGCTGGCCCCGGGGCTCCGTGCTCGTCCAGCGCCAGGCGCCTTCCGGTCCCAGTCCCAGTACCAGTACCGAAATCGAGAAATGACAAGCGAGTGCTCTTGAGAATGAAGGTTGCGGGCGGCGGTCCTGATGGCGAGCGTGTGTGGTCCGTGGCCAAGCCGGGCGCTCCGGCCCCGGCTCCGCCCCCCGCTCCGCCCCACTCCGCGCCCACACCCCCTCCGTCCAATCCCCCCACCACCACCTGCCGGCCTTCATTACCACAGTCGCTGGTGGTGCTAAACGGCAGACGGTTCATAGTAGTAGCTAGAGCTGTGCATCATGATATTAAAGTACGCCGCGGAGTGTCCAACGGCGCTTCGCCCCCGCCGGCTGCGCCCTCGCCCGCTCTGAGACGAAGAGTCAAAAAAGACGACAC
CGATTACTTCACACCCTTCATAGAAAAGGCTAAGGCGAACAACTATAATGTAGCTGTACAGATCTTCCAGTACTTAGGCATGCGTGATGTAGCGCGCGCCGCCCGCACGTGCTCGCTGTGGGCGGAACTAGCCGCTACACCCGCACTATGGAGGCACGTACGGATGAAGAATTCGCACATCTTCCAGTACTTAGGCATGCGTGATGTGGCGCGCGCCGCCCGCACGTGCTCGCTGTGGGCGGAACTAGCCGCTACACCCGCACTATGGAGGCACGTACGGATGAAGAACTCGCACGTGAGTGACTGGGCGGGCCTGTGCGCCGCTCTCCGCCGCCACGGGACCCGCTGGCTCGACCTCCGCAAGATGTTGCTGCCACCAAATGACACGTTATTCTGGGATCAGTTCGCGGAACACATCGGCACCGTCGACACGCTCGAGAGACTAGAGCTATGTCGCTGTCCCGCCCGCGCGGTGGAGGCGTCCTGTGAGCGTGTCCCGGGTCTGCGCGCGTTGTCCGCGCCCGCCATACGGGACGCCAGACTCGACCCCGCGCCACTCGCCAGACTCACTAGACTGGAGCTGCTCAGACTCAAGAGCCTCACAGGTCTATCGTTGACGCGAGACCTCCGCCCCCTGGCCGGCCTCTCTAGACTCCAACACCTGTCGCTGACGTCCATCAAGGAGCTGGGCTGGTGCGCCTGTGAGGTGGTCGGACAGTTGGAACAGCTGGAGTCGCTGGAGCTGGGAGAGTGCTCCTTCGGCGGTTCCTTCGCCACGGCTCTCGGAAAACTGGTCAAGTTGCGGAAGTTAAGGCTGGAGCGAGGGGTGGCACATTGCGCCGCGCCGGCATTGCTAAGAGCACTGGCAGCCCTGCCCAAACTGACACGGCTGGAGTTAGTTAATTTCGATGTTAAGGTCGGCTTCGACGATGCTCTGGCGGAGTGTAAAAACATACAGAGACTGCTCATCATACCGACGTACGTGTCGCAGTCGGCCACCACCAACAAACAGGTTCTGAGCGGTGTGCTGCGATTGAAAGAGACCCTGACGCATCTCATGTGGGGTGTGACCATCGAGCTGCTGAGGGTCACGGAGCTGTTCATAGACCAGTGTGAGGCGGGCGACGGAGACACCAAGCGGCGGGACGTAGGGGAGTGCATACCCGTCCTCAAGCCGGTCCCCGGGTGTCGTCTGCCCGACGACCACCGCACCGTGGCCGGACCTCCGCAGGTTGAAATTCTACCCATCCCGACCCTCCAGCGGTTGCTGGCGGCTCAGCTGCCGCGGACCAAGCTCAAGCTGCTCCGGATCCCCTTCCACGCCACCTGGAGACAGTCGCTGGCTGATTTCCAATAG

Protein sequence:

>DPOGS216173-PA
MSEEMAVGDNASNEEHKVASEDVGSEGMDSADATEVPAVLKTAETEIKPEINENAEDSELQAEMESEENEIEEYKEIEEKMQEINQETQDASDEAARGVKRRLSTAFSDDGEEFKGFDQCEPSSLDDYSKTVEDGRRVKPRTSSDSDGFKGFDLIEHNNLTGYCQVLERLEAEVLAAAKTYTPVRRVMASLQRRIDASGCTKWSEHLEAEVSEVVRSFSTDDDSKLRKRTEGSRPSSALSSRSDGDGGVSTDKPQPQDVLTTWYTRLHRSFFEVMPSHVGQEDAKVQRRVESPVPAVTPTLESKRTPKPKPPKGASPEPGMKSPPGKLKVSQVKSMGSRLSSNGAPAAKQVKKTPQQAVTHDNNNMAAWKKPRGPPTVQPRPVAGQPGTTTPPAQLAQHTQPASLAQAAHHVQPVQPAQHIQPIHPAPAPAHAPHQISVPRPQLTPDRDVPPPLHHQNRQIQQPCSMTCGTGVPSLACEACLCLYHPACVGLRLPQDTFLCKNCRKTSSPPVEPPPLTHKSGVTSLPAGAGPGAPCSSSARRLPVPVPVPVPKSRNDKRVLLRMKVAGGGPDGERVWSVAKPGAPAPAPPPAPPHSAPTPPPSNPPTTTCRPSLPQSLVVLNGRRFIVVARAVHHDIKVRRGVSNGASPPPAAPSPALRRRVKKDDTDYFTPFIEKAKANNYNVAVQIFQYLGMRDVARAARTCSLWAELAATPALWRHVRMKNSHIFQYLGMRDVARAARTCSLWAELAATPALWRHVRMKNSHVSDWAGLCAALRRHGTRWLDLRKMLLPPNDTLFWDQFAEHIGTVDTLERLELCRCPARAVEASCERVPGLRALSAPAIRDARLDPAPLARLTRLELLRLKSLTGLSLTRDLRPLAGLSRLQHLSLTSIKELGWCACEVVGQLEQLESLELGECSFGGSFATALGKLVKLRKLRLERGVAHCAAPALLRALAALPKLTRLELVNFDVKVGFDDALAECKNIQRLLIIPTYVSQSATTNKQVLSGVLRLKETLTHLMWGVTIELLRVTELFIDQCEAGDGDTKRRDVGECIPVLKPVPGCRLPDDHRTVAGPPQVEILPIPTLQRLLAAQLPRTKLKLLRIPFHATWRQSLADFQ-
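
The lengths reported in this record can be cross-checked against each other. The sketch below is a minimal consistency check, not part of the original record. It assumes that the transcript is coding sequence only (the nucleotide sequence starts with ATG and ends with a TAG stop codon), that the reported protein length excludes the stop, and that the genomic coordinates are 1-based and inclusive. The function names are hypothetical helpers introduced for illustration.

```python
# Values copied from the record above.
TRANSCRIPT_BP = 3357          # DPOGS216173-TA
PROTEIN_AA = 1118             # DPOGS216173-PA
GENOMIC_START = 284753        # on scaffold DPSCF300155, + strand
GENOMIC_END = 295879


def expected_cds_length(protein_aa: int, has_stop: bool = True) -> int:
    """Nucleotides needed to encode protein_aa residues, plus an optional stop codon."""
    codons = protein_aa + (1 if has_stop else 0)
    return 3 * codons


def genomic_span(start: int, end: int) -> int:
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


cds_bp = expected_cds_length(PROTEIN_AA)
span_bp = genomic_span(GENOMIC_START, GENOMIC_END)

# The transcript length should equal 3 * (protein length + stop codon).
print("CDS length matches transcript:", cds_bp == TRANSCRIPT_BP)

# Bases in the genomic span that are not in the transcript
# (presumably intronic, under the assumptions stated above).
print("Genomic span (bp):", span_bp)
print("Span minus transcript (bp):", span_bp - TRANSCRIPT_BP)
```

If the first check fails for another record, the transcript probably includes untranslated regions or the protein length is counted differently, so the assumptions above would need adjusting.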