Monarch geneset OGS2.0

DPOGS200642
Transcript: DPOGS200642-TA    3336 bp
Protein: DPOGS200642-PA    1111 aa
Genomic position: DPSCF300076    +    633990-644893
RNAseq coverage: 83x (Rank: top 64%)
Annotation
Heliconius: HMEL001044    0.0    77.83%
Bombyx: BGIBMGA011319-TA    0.0    70.96%
Drosophila: defl-PA    4e-90    51.69%
EBI UniRef50: UniRef50_C3PPH1    0.0    75.04%    DNA sequence from clone AEHM-21P16 (Fragment) n=1 Tax=Heliconius melpomene RepID=C3PPH1_9NEOP
NCBI RefSeq: XP_396796.2    3e-127    42.90%    PREDICTED: similar to integrator complex subunit 7 isoform 1 [Apis mellifera]
NCBI nr blastp: gi|229487377    0.0    75.04%    unnamed protein product [Heliconius melpomene]
NCBI nr blastx: gi|229487377    0.0    75.04%    unnamed protein product [Heliconius melpomene]
Group
Gene Ontology: GO:0005488    6.6e-17    binding
KEGG pathway:
InterPro domain: [20-472] IPR016024    6.6e-17    Armadillo-type fold
                 [432-468] IPR011989    2.1e-09    Armadillo-like helical
Orthology group: MCL11879    Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200642-TA
ATGATCGGAGTAAGATTAAACTCATTCAGCGATAATTCAGGGGAACCTGAACAGGACGCTAATTCTGCTTTAACAGAGTTGGATAAAGGTCTAAGGTCTGGTAAAGTCGGAGAACAATGTGAAGCTATAGTCCGTTTTCCTCGTTTATTTGAAAAGTACCCGTTTCCTATATTAATTAATTCATCATTTTTAAAATTGGCGGACGTGTTTCGTATGGGTAACAACTTTTTACGGCTCTGGGTTTTACGCGTGTGTCAACAGAGCGAGAAGCATTTGGATAAGATATTAAATGTGGATGAGTTTTTGAGGCGGGTTTACAGCGTGTTGCATTCAAATGACCCTGTGGCGCGAGCGTTGGCGCTGAGAACTTTGGGTGCCGTGGCAGGAATAATTCCCGAGCGTCAGAACGTCCATCATGCAATCCGCAGAGGTCTAGAGAGCCACGATAATGTAGAAGTCGATGCCGCTATTTATGCTACTACTAGATTTGCTGCACATTCAAATTCATTCGCAGTGGCCATGTGCAATAAGCTGTCGGACATGGTCGAGTGCGAGAGTACGGGGGTTGAGCGGAGAGCAAAACTCGTCAGGGCCTTACGGACAGTACATGGCGGAGCGGTTCGTGCTCAAGGTGTCCTGAAGCTGCTAAGGTCCCTGCTGGAGAGATTCCCTTCATCCAGCTCAGTCCGAGCTGCAATCACAGCTCTCACTGCCATCGCTGCGGATACAGTAGTGCATGTACCCGATCAGGTGGAGCTTCTGCTTAAACTGGCGGTTAACGACGCTCGGTCAGCTGTTCGCCGCGCTGCGCTAGTGGGTCTTCGTAAACTGGCTGAGCATGCTGCGCTATGGCCCACTGACTGTATCCAAGACCTGGTGCACGCGGCTTCAGAGATGCAGGACGATGAGCATAGCATGCTCTGCCTACAGGTCATGCAGATCCTGGTCCGTTGCCCGGCGGTGTGCGCGGCGGCCGGGCCCGAGGCGTCTCTCCGGCGCCACTGCAGCGCTGCAGCGCTCAGCGTCAACATGAAGCTAGCTGCAATCGCTGCAGACGTACTCACAAGAATCGTAGCCCACTGCTACGAGGAGAATCTCCCGGTCGAAGGTTCGGAGCTGATGCTGGCCCTGGAGTCGCTGGTGATCGCCACCGGCATGGACAACGGACAGAATAATATACGACCCCTGCGGATAGCGCTCAGATGTCTGGTACAATTGAGTTCCTCCGCCCCTCACCTGTACGCGAGTCGAACCGCTGGTGTTTTGGGGTCCGCCGCCCAGAGCTCCGTGGGTCCGCGACAGGCTGCACTACTCGAAGGCCTGGCTGCGCTCGGGGCCCTAGGGGCCTCCGCCGCACCCTATCTCATACCAGCCCTAGAGAGAGCCAAGGAGGACTGCAAGGACCCGACCTACGACGGCACCACTCTAGTGCTGATCTGCACGGTGCTGCTGCAGGAGAGGGCGGTGGCAGCGCTGCGGTACAGGATCCGGGGCTCGTGGGAGGAGAGGATCAAGGATGCCGTGAGGGGAGCCGACGGCTGGACCAGATACAGGGTCGCCAGGACCGCGCTCAGCGCCGTCTGTCGTCCGCGTCCCCGTGCTAGCGACACGGTCTGTACGCGACTGTTGTATCTGAAACATACCATTGCAGAGTTTTATCAAAATCCGTTCAGTGCACAAACCTCCGCGGGTACAAGGTACGGCCACCACCGCCTGGCGGGGGAGCTCCTGGAACAGCTGGCTACTCAAGCGCCGTCCGAGGCGGCGCAGCGCTGGCTCACAGCACTACACCGAGCGGCCGCAGCTGATAGGCTGCTGGAAGATGAAGTTATTCCATGTCGTTTTTATTACGACACCGATCCCCTCGGCTCCTCGAGCTCGCCGTCCTGCAAGCCACATCATGTTTTAAAGTCCGCGCCGCTTTCCTTTGACGTTAATTTGTGCATAACACTCGGCCACCGACAGGTGTCTCGGCGCTGGAGGAGGCGAGCAGAACAGAT
CAAACCCTGGGCACCTCCCTATAGATACTCTCGCACGAACTCCCAAATACTTAACTTGAGCCAAATGTTGCAGCAAATAGGTGCGCTACTCGCCTGTATATCGTTACTATACTTAGTCAAAGATTTTATTATTTCTATGGGTCATGCTAGCGACACCTTCCGTGCACCGTTTGTCACCTTCCACCACAGCAATATCACACGGAAGCAAGGATTGAGGGCAAAGAAGACGAAGGAGGAGTCGGATAAGGGTCAGAGAGAGTCCGGAAACGAGAAGGAGGGCGAGGGTCAGGAGTTACCCCTCTACTGTCAGAGCGTCTACACCTACCCCCACACGGACTATATCAATTTCCAGCCGATGCCTGTTGAAGTAAAAGCACTTCCCCTGATGGTGGACTGCCGGCCGGCGGGCGCTCCGCCCCACGACATGCCCCACAATAACGGGACCGGCGATAACAGCCATCAGCAGGCTCCCAGTGGCGAGCTGGTGTCGCACGCCCGCGCCTCCACTCCCGCCGCGCCCACCTACCAGCACGCCTACAACGCACGGGTGGTGTCCTCCGAGATAGCTATATTCAATCAGTCAGTACCCGGAGGCGGTCCGTGTCCACACGCGGAGGCCGTGGTGGCCGGGGTCCGAGCTCTGTGTCGGGGGATCGTGTGGCCGCGGGCGGTCTGTGCGGGCGGGGCGGGCGGGGCGCCCTGCCGCGTGTCGCTGTCCCCTGCGCCGCGAGCGCCCCCCGCCGACCACGCCGCCGCCCTCCCCCTGGCGCACCGCCTGGCCGTTAAGCTGGAGGGCGTTCTGCTGCCGCCGCCGGGCAAGATGAAGAACAAGCGGCAAGTTAAAGGAGTCCAGATCACTGTGACCGCGACTCCACATCCGCGGACCAACGAGAAGACGGTGGAGCTGACGAACGTACAGCCGACGCTGACGGCCGTGCAGACGGTGACGCCCGTGAGGGACTTCTTCTCCGCCCAGCAGCTGGTGAGCGTCCCGGCTCCCGGACTGTACACGGTCGCCGTGGAGGCGGCCTTCGTGGACGAGAAGGGCCAGCTGTGGCACACCGGGCCCAGGAGCTGCATCGTCATCAAGGCGCACGAGGACCCCGGCACCAAGGGGAACTCGCAGACCTCGAGGAGCAGCGGCTTAAACTCGCCAAACGTTACGTTATACAGGGTGTCCCGAACTCAACGACCGGCACAATGTGACATGTTAATTTATTGTTTAAATTCTCTTCATCTAAACGCGCTTCATCCCGTTCTCAACGCTCCCGTGAACGACGACTATTACTTTCTGTCGCTGCTCGGCAAGCACCCTCTGACATGCTCGCAATTCTAG

Protein sequence:

>DPOGS200642-PA
MIGVRLNSFSDNSGEPEQDANSALTELDKGLRSGKVGEQCEAIVRFPRLFEKYPFPILINSSFLKLADVFRMGNNFLRLWVLRVCQQSEKHLDKILNVDEFLRRVYSVLHSNDPVARALALRTLGAVAGIIPERQNVHHAIRRGLESHDNVEVDAAIYATTRFAAHSNSFAVAMCNKLSDMVECESTGVERRAKLVRALRTVHGGAVRAQGVLKLLRSLLERFPSSSSVRAAITALTAIAADTVVHVPDQVELLLKLAVNDARSAVRRAALVGLRKLAEHAALWPTDCIQDLVHAASEMQDDEHSMLCLQVMQILVRCPAVCAAAGPEASLRRHCSAAALSVNMKLAAIAADVLTRIVAHCYEENLPVEGSELMLALESLVIATGMDNGQNNIRPLRIALRCLVQLSSSAPHLYASRTAGVLGSAAQSSVGPRQAALLEGLAALGALGASAAPYLIPALERAKEDCKDPTYDGTTLVLICTVLLQERAVAALRYRIRGSWEERIKDAVRGADGWTRYRVARTALSAVCRPRPRASDTVCTRLLYLKHTIAEFYQNPFSAQTSAGTRYGHHRLAGELLEQLATQAPSEAAQRWLTALHRAAAADRLLEDEVIPCRFYYDTDPLGSSSSPSCKPHHVLKSAPLSFDVNLCITLGHRQVSRRWRRRAEQIKPWAPPYRYSRTNSQILNLSQMLQQIGALLACISLLYLVKDFIISMGHASDTFRAPFVTFHHSNITRKQGLRAKKTKEESDKGQRESGNEKEGEGQELPLYCQSVYTYPHTDYINFQPMPVEVKALPLMVDCRPAGAPPHDMPHNNGTGDNSHQQAPSGELVSHARASTPAAPTYQHAYNARVVSSEIAIFNQSVPGGGPCPHAEAVVAGVRALCRGIVWPRAVCAGGAGGAPCRVSLSPAPRAPPADHAAALPLAHRLAVKLEGVLLPPPGKMKNKRQVKGVQITVTATPHPRTNEKTVELTNVQPTLTAVQTVTPVRDFFSAQQLVSVPAPGLYTVAVEAAFVDEKGQLWHTGPRSCIVIKAHEDPGTKGNSQTSRSSGLNSPNVTLYRVSRTQRPAQCDMLIYCLNSLHLNALHPVLNAPVNDDYYFLSLLGKHPLTCSQF-
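
Consistency check (illustrative):

The transcript length (3336 bp) and protein length (1111 aa) in this record agree if the transcript is a complete coding sequence: 3336 / 3 = 1112 codons, of which the last is a stop codon. The following is a minimal sketch of how that relationship could be checked. It assumes the standard genetic code and assumes that the trailing "-" in the protein sequence marks the stop codon (the transcript ends in TAG). The function names `translate` and `expected_protein_length` are hypothetical helpers written for this example, not part of any MonarchBase tool.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence, writing stop codons as stop_symbol.

    Whitespace (including line breaks inside a wrapped FASTA record) is
    removed first. Ambiguous bases such as N are not handled and raise KeyError.
    """
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def expected_protein_length(cds_length_bp):
    """Residue count for a complete CDS that ends in a single stop codon."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


# First five codons and last three codons of DPOGS200642-TA, copied from the record.
first_residues = translate("ATGATCGGAGTAAGA")
last_residues = translate("CAATTCTAG")
protein_length = expected_protein_length(3336)
```

Applied to the full DPOGS200642-TA sequence, with the two wrapped sequence lines passed as one string, `translate` can be compared against DPOGS200642-PA residue by residue.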