Monarch geneset OGS2.0

DPOGS200631
Transcript: DPOGS200631-TA, 3276 bp
Protein: DPOGS200631-PA, 1091 aa
Genomic position: DPSCF300076 + 312653-325839
RNAseq coverage: 506x (Rank: top 25%)
Annotation
Heliconius: HMEL003294, E-value 0.0, identity 46.74%
Bombyx: BGIBMGA011311-TA, E-value 1e-164, identity 69.55%
Drosophila: Haspin-PA, E-value 9e-62, identity 36.57%
EBI UniRef50: UniRef50_UPI0000D57283, E-value 2e-98, identity 41.25%, UPI0000D57283 related cluster n=1 Tax=unknown RepID=UPI0000D57283
NCBI RefSeq: XP_971131.1, E-value 4e-99, identity 41.25%, PREDICTED: similar to Haspin CG40080-PA [Tribolium castaneum]
NCBI nr blastp: gi|91091654, E-value 7e-98, identity 41.25%, PREDICTED: similar to Haspin CG40080-PA [Tribolium castaneum]
NCBI nr blastx: gi|91091654, E-value 4e-92, identity 37.27%, PREDICTED: similar to Haspin CG40080-PA [Tribolium castaneum]
Group
KEGG pathway 
Orthology group: MCL15965, Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200631-TA
ATGTCTCATACTATAACATTTATTAGTGATATCAATGGTAGTATGAGGTGGTTAGACCACTGTATAGTGACACAGGAAGCGTGGAATACTATTTTAAATGTTAGAATTGATAGTGATGTGCACGGGTCTGACCACCTCCCGCTTCTGTTCAAGATGAGACGAACATATAGATCTAAAAAGGAAACCGCGGGTGCCATAGATCCCAGAGATGTCTTGTTTACGATCGACGCAAAAAGTTCTGTTTTCGATGCCTTTTATATTCAAAATATCAAACAAACTAAAAGAGATATGTCTGTCGGAAGTACAATGCAGTTTGCTGCTTCAAATAAAGTACGTAAAAGGAAATACGTTAAACGACTCAAGGAACAAGAAACATTGTGTCGGACAAGGGAATCAACTTCGAGCTCAGCGTTTTTAACCCCGGATAAGTCTTTCCGAGTCAATAAACCACCAGATTTATTCGATCAGTTACTTAATTCTTCTAATAACTCACCAGAACCTGTACCGTTATATAATAAACCAATATTCACACCGTTTCATGATAAATATAGGGGTGCATCCATATATAGATTTTCACCTATATCAATAAACTTAGATGACAGCCCTAAAGAAAAAACTGCCATCAAGAATAATGAAGAGCAAAATGATTTTTCATACAAAGAGGAAAATGTAGACAATATTGAACAAGATATAGAGGCTGTAGAAATTGAAAATCCAGAATCTCGCAGTTGTCAAGATAGTATCAATGAAAGAGATGATTCTAAAACAAATAAATTAAATGAATCCCAATATAAAAGTATATGTCCGACGGATAATCATTCACCAATATTGTCCAACCCGACCATAACTAAGAGAAATCGAATAAAAACTATAAAAAATTCATTAAGTTTCCAAACAACCGAAATGGAATCATTCCACGGTTTCGGTGACAGCGACATACATTGTAATGATTTAGATGCTATCCGTGAAAAATATAAGGAAATAGAAAATATATTAACGGAAGATAATTCTCATAACGAAACATCTGAAACATTACAAGAAGATGAAAACAGTCATGCAAATGGTTCCACAAATTCAGATTCCGATGCAAGTTTTATTTCCGGATCGGAATCCAATTATGATACTTGCAATAGTGAAGACGATTCCGATGAGTTCAAACGACTCGGTCAACCTGTAGTTGTTGTCGAAAGACTGAACGATTCCATATTTAACAAATATTATGAATTAATGCCGAAATCCGAAAGCCTTAATTCAGATTATAGTACAGATTTTAACGATAGTTACAATAATTCTAACATTACAGGCTCTTTAAGTGTATCTGACAATTGTGATGATATAGATTTAGTTGAAAATGTACCTGATATGACTAGCATTAATTTATCAGATTGTTCGAATACATGTAACGATAATAAGATGGACGAAGAGGTTTGTGTTAGTTTTGTGACAACCAGAAGAAGGATCCTACCAAATGATTCAATTATTTTAGATGTAGATAGTTCTGTTGCTGATTCAAGTGGCAGTGATGCTGATAAGACTGTACTTAGAAAGAGTGTAGATGATAGTAGTAATTTACAATTAAAATCTGAACATGCAGAGAATATATTAGATAATAATAAAGAAATAGAACGAGATGGTGCCATTAGTTTAAAATCAGAATTACATACTGAAGATGTTGATCAATTAAAAACGTTGCCGGAACCTCCAAGAATGGTTACCAGGAAAAGTGCTCGGATGATTCTTAAAACAGATACAACCTCCATTAACTGCGCAAGAAATATTGAGAGAAATTCGAGGATTGATGGAAATATAAAGGATATGAGCGAGACTGAGACAAACATACTAAACATGTCCAAAGAAAAACCGTCCATAGTCCTACAGCCGGGCAAGAGGTGGGAGCGGTCATTAAGCATATACAGGAGAATGACAACAATGGAAAACTTCGACAAAACGATCTTAGACGAAGAACAGTTGCAGAATAAAGGCAGGAAATACAGGCAGAG
CGTCATAGCCACCATGGAATTGCAGGAAAAGGGTTCACTACACAATGATTCGATTAAAAGTCGCAGGAGTACGTTCGTTTCAAAACCAAGCCGGTCAACCATTAAAATTGTAAGAGAATCTGATCTTTCCCGGGATAGTTTGTGTTCGACCATAGTATGCGAAGATTTACAAGGATTTTTGGGCGAAGACTGTGATGATACGATTGTTGAGTTATCAAAACTGTCGATTGCCGATTCGGAACACGAGGTCACTCTCATAGAGAAGTTTCATGATACTTCTAACCGTATAGCGACCGCTCGCGATTACGTCCTGCGACGGTGCAACCAGACAGATGTGTTACTCTTCGACGAATGCTATCCCGATCCGCTTTTGAAGAACTGCCGCAAAATCGGTGAAGGTGTTTATGGGGAAGTGTTTCTGTGGCGAGCTCGTGACGGAAGGGCTCGTGTCTTAAAGGTTATACCAATCGCTGGGGACATCAAAGTCAATGGGGAAGAACAGAAGGGCTTCCATGAAATTCTCTCGGAGATTGTGATTGCTATGGAATTGAGCGCACTACGCGCTCCAATAGCAGACATAACGAATCATTTAAATGAGGGCAAGAGCTTGGAAACATTGGATTTACATACTGTAGAAAATGCTACGGATGTTTTTAATGAGGTGTTATCAGTACGCTGCGTGACTGGGGGCTATCCGTCCAGACTTCTGGACCTCTGGGACCTGTATGACGAGAGCAAGGGCTCGGAGAACGACAACCCAGCTGTTCTGCCGCCCGACCAGCAGTTCATTGTGCTGGAACTGGCCAACGCTGGACAGGATTTGGAAAGCTATCAGTTTGTGAACGCCGAACAGTCGTATGCACTGTTCAAACAGAGTTCGTGTTTCGTGGTTCGCGGTCGCGCCTTCAACCTGCCGAGCTGTGGAGTGAAGGCTTCCATCATCGACTACTCGTTGTCCCGGGCCTCGGTGAGCCGTGGAGTTCTGTACTCTGATCTGGCCCAAGACGAAGCCCTGTTCGAAGCCCTGGGGGACTATCAGTTCACGGTGTATAGACTCATGAGGGATAAGCTTGGTAATGATTGGAAGAATTTCGAACCATACACTAATATACTGTGGCTGCATTACACTTTGGATAAGATGATAACGGCCCTCCGTTACACAAGGACCAACACTAAAATACACAAGCACTACATAGCGAAGCTGAAGGAGGTGAAGAACAGGATCCTGGACTACGGCAGCGCCGTTCAGTTTGTGCTCACAGACAACGAAATATAA

Protein sequence:

>DPOGS200631-PA
MSHTITFISDINGSMRWLDHCIVTQEAWNTILNVRIDSDVHGSDHLPLLFKMRRTYRSKKETAGAIDPRDVLFTIDAKSSVFDAFYIQNIKQTKRDMSVGSTMQFAASNKVRKRKYVKRLKEQETLCRTRESTSSSAFLTPDKSFRVNKPPDLFDQLLNSSNNSPEPVPLYNKPIFTPFHDKYRGASIYRFSPISINLDDSPKEKTAIKNNEEQNDFSYKEENVDNIEQDIEAVEIENPESRSCQDSINERDDSKTNKLNESQYKSICPTDNHSPILSNPTITKRNRIKTIKNSLSFQTTEMESFHGFGDSDIHCNDLDAIREKYKEIENILTEDNSHNETSETLQEDENSHANGSTNSDSDASFISGSESNYDTCNSEDDSDEFKRLGQPVVVVERLNDSIFNKYYELMPKSESLNSDYSTDFNDSYNNSNITGSLSVSDNCDDIDLVENVPDMTSINLSDCSNTCNDNKMDEEVCVSFVTTRRRILPNDSIILDVDSSVADSSGSDADKTVLRKSVDDSSNLQLKSEHAENILDNNKEIERDGAISLKSELHTEDVDQLKTLPEPPRMVTRKSARMILKTDTTSINCARNIERNSRIDGNIKDMSETETNILNMSKEKPSIVLQPGKRWERSLSIYRRMTTMENFDKTILDEEQLQNKGRKYRQSVIATMELQEKGSLHNDSIKSRRSTFVSKPSRSTIKIVRESDLSRDSLCSTIVCEDLQGFLGEDCDDTIVELSKLSIADSEHEVTLIEKFHDTSNRIATARDYVLRRCNQTDVLLFDECYPDPLLKNCRKIGEGVYGEVFLWRARDGRARVLKVIPIAGDIKVNGEEQKGFHEILSEIVIAMELSALRAPIADITNHLNEGKSLETLDLHTVENATDVFNEVLSVRCVTGGYPSRLLDLWDLYDESKGSENDNPAVLPPDQQFIVLELANAGQDLESYQFVNAEQSYALFKQSSCFVVRGRAFNLPSCGVKASIIDYSLSRASVSRGVLYSDLAQDEALFEALGDYQFTVYRLMRDKLGNDWKNFEPYTNILWLHYTLDKMITALRYTRTNTKIHKHYIAKLKEVKNRILDYGSAVQFVLTDNEI-
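
Consistency check (illustrative): the transcript and protein lengths listed for this record agree if the 3276 bp transcript is read as 1091 codons plus one stop codon. The sketch below is a minimal way to check this in Python. It assumes the standard genetic code and assumes that the trailing "-" in the protein sequence marks the stop codon. The names `translate`, `CODON_TABLE` and `stop_symbol` are hypothetical helpers introduced only for this example and are not part of the database.

```python
# Standard genetic code, codons enumerated in TCAG order
# (first base, then second, then third). Assumption: the
# record uses the standard code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence in frame 0.

    Stop codons are written as stop_symbol (assumed here to be "-",
    matching the last character of the protein record above).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for pos in range(0, len(cds), 3):
        # Codons with ambiguous bases are reported as "X".
        aa = CODON_TABLE.get(cds[pos:pos + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 15 codons and last 5 codons of DPOGS200631-TA, copied from the record.
print(translate("ATGTCTCATACTATAACATTTATTAGTGATATCAATGGTAGTATG"))
print(translate("GACAACGAAATATAA"))

# Length check: 1091 residues plus one stop codon, three bases each.
print(3 * (1091 + 1))
```

The two translated fragments can be compared with the start (MSHTITFISDINGSM) and end (DNEI-) of DPOGS200631-PA, and the last value with the listed transcript length of 3276 bp. To check the whole record, the two sequence lines of DPOGS200631-TA would need to be joined into one string before calling `translate`.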