Monarch geneset OGS2.0

DPOGS203916
Transcript: DPOGS203916-TA, 3315 bp
Protein: DPOGS203916-PA, 1104 aa
Genomic position: DPSCF300005 - 689783-696447
RNAseq coverage: 167x (Rank: top 51%)
Annotation
Heliconius: HMEL010370 | 0.0 | 52.44%
Bombyx: BGIBMGA000494-TA | 0.0 | 42.11%
Drosophila: (none listed)
EBI UniRef50: UniRef50_F1R2R4 | 7e-10 | 35.71% | Si:dkey-39n1.3 n=5 Tax=Danio rerio RepID=F1R2R4_DANRE
NCBI RefSeq: XP_002742202.1 | 2e-10 | 38.71% | PREDICTED: restin-like [Saccoglossus kowalevskii]
NCBI nr blastp: gi|293341395 | 2e-09 | 34.69% | PREDICTED: centrosomal protein 350kDa [Rattus norvegicus]
NCBI nr blastx: gi|270003200 | 1e-18 | 29.78% | hypothetical protein TcasGA2_TC002404 [Tribolium castaneum]
Group
KEGG pathway: (none listed)
Orthology group: MCL26557 | Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203916-TA
ATGGATGCATCGGCATCCAACAAAACGAATCCCGAAAATAATTTATATGATCTTTATTTGAAAATTGAAAATCCCATCTACTCGGCTCGAGTTCCATATAATAAACCTGTAGTGAAATATACGCGCCTACCAGACTTCGATCTCAACACTAAAGAATCAACAAAAAGCGCTAAAAGTATAAAAAGTGCTGGGTCATCACCTAAATTTAGCTTAAGTAAAAAGTTATTAAACCCAAAGGAAAATGAAAAGGAGTGTACTCCCAAGAAAAACAAATTAAAAGAATACATCAACATTAAGGCACCACCAAAAACTGTTTTGGATAAGACAAAAAGAACTAAATCTCCACCTAATTTTAATCGTAACAAAATGTCTCTTATCAAAGATATTAAAGACAAGGTTAGACCGGTTTCTGTAATACATGTGCCACTTGATGATCATTTTCGCAAACGTGATATTGGAACTGAGGCTTCAGAGGTCATCATGAATCAAGCTGTACATTCTATTAATAATGTCACTACAGCGGTTCAGGCAGTTCCAGATGTGACATTTAAGGATGTAGAAATTTCAACAAATTTTACACCAGAGAAAGTGAATCAAATGACATCTGTTGATGAAGTGATAAGCAAATTAGGTAGTGAATGTAAGGATATCAAAGATACTTACGATAGTCTTGAAATACCAAATGTAAGCTTAATAAATAGTAAAACTGACTCGTCCAAAGACCTAGACATTGAAAAAAAAATTAATAATATACCGAACAGTTATGTAAAAGAAGAAAAAGAACATGATACTGATATTGTTAGAACGCCTAATATTGATAATGCTTCTTCGAATGTTAGCGATAAAGATAAAGACACTGTATACCATAAAAGTACATCATACATTATTGGTCGAGCTACCTTAACTTATACAACAAGACAGAAAATAAATTTTCACCTAGTAGAAAATAACGAAGTTTTGCACTCTAGGATCTCACCACGTTCATTAAATTATCCTTTGAACGTTGTTTCCGTACTGAAAAAGGAAATATCGAATCAGAAATACAATAACGATGACAAACTAAACTACAATAGCCAAGATAGTAAATCTGACGATGAAACAGAAACTCGTCTTACTCCTATTAACCGTTCTTATGCTCTCAAGAGTGAAAAATTAGTAAAACCATCAGATATAATTAGTACAATAAAACTTAATAACAATTTGTTACATAGAGACCTCTGTGGACAGTTTCAACGAGAACTTAATTTCATTGATTCATTTTTCGAGTCTCTTCAGTATTTAGATAGCTGTTCCTTATCTGATAGAAGTATAACAGAAAAAAAGGTTGAAAACTGGATTAGCGGCGGTGCCAATGAAGTGAAGAACTTTGAATTTGGATCATTTTTGTCACAATTCGAAAATGAATTTAATATTGACAACCCTAAAACAATGGCGTCCGAGAGTCTTTGTCTGCTCAATTTCCTCATTCAAAATGAGCAAATTAAAGCAGAATATCTATTGGATGCTTTGAAAATGCGTGAAGACGCTTTGAAACATTTTACGAAGTCGCAAATTTTGTGGTTGGAAGATAAGAAAAAACATGATCACACCGACATACCGACACTGAAAAAGAAACAAAGAGGTGCAATCATAAAACTTCAACACGAATGTGGTGAAATGCAGCGCATGCGAAAGGCTCTGTTGGCGTTGTCGGAACAACGTAAACTGGCATTGAAGAAAACAAAGAAGAACATAGAACTTAAATTAAGGAATAGCTCTGATGTGGAACAAATTATATTGGGAAAAAAGAAATTAAGACGTAATGTTTCTACAGATCGTAACAATGCACCTCTTAAATGTTTTGAATTGTCTAGCAGCGGCTGCGATGATAGCACTACATCGAGGCCCAAGTCAAACGCATCCATTGTACTGAACGACTTAAAGGCAGTGACGAGCGCCGAAAAGTGCGTTCAAACAGGTGAAAGCCTCCCCCTTACATCAGACCAATCAACGAACACCTT
CGACGAGAATTTTGTAGTTGTTGATGGTAGCTACTTGAACATCGTTTTCCAAAATCTATCACCTCAGATCTTCAGCGCCGGTAAACAGTACGAAGTGAACAAGGACGCCCTGAAGAATATAGTCGATAAAAGTAATATGCACAATATTAATCAAAATAACGAAGTAGCGCTTGAAGAGTTTATGGATCATATAAAGAATCACGAGTTGGAATCGAGTTCACCGTCCACTGCGAGAAGTTTAGTGGATGAATTCGACCAGATTTACAAGACCTATTCCGATGATGATATCTCATACGAAGTCGGCCGTGCTCTTGTGGACGACATCAAAGAGGTTCAAGTTTCACCTGAAGCTGGAAACATCAAAAATGATGTTAAGACATTGGCTGTTCCGAGTGGGGGAGTTGATCGTTCAGTGTCCGTTGACGAATGCTGTAGTTGTGAGCCATTAGTAGCTAGTGTTGGGATTCAGGTGTCCAAAGGCAAGGAAGTGGCTCAAACTAGTGTGACGGGACCATTACCTATACCAGCTGGTGCTGCTTCCACTGATGATGTATCATCTGAAGTAGCCACTTGGTTGACGCAGAGATCATCGGTGAAGTCCACCGGTGCGAGCAGTTCGAGTAGTCAATCTCATAATCCTAGCATTTCGTCATTATCATCACCAGTTCAGTACGAGGCTGAAGAATTACGTCGTCAACAACTAGCCATTGAACGAGAGATTAAAGCATTAGAACAACAACAGTGTCAATTGTTGGCGTTGCGTGAGATACCGGACAAACCCCCACCACCGTACACGCCACCAACAGAACCACGACCGTTAAAATCTCTAAACAAGTTCATAGCTGATGACATCAATGAACAGAAAATACACAAGTTGCTTTTCCAACCTGGCAAGCAACTCGGCGAAACAGATGTATTTGAAGTATTTGTCAAAGACTTCTGCCAAGAGTCCATTGAAAGACAGAAGTTGGATAGAAGCGACAAATATTGGGACACATGCAATATGATACCGGTTAAACCCAAGCCGGACAAAGAGAAATTAGTAAAGAAAGCTGCTGCTGACCTAAAAGAAGTCTTGTCGGATGTACCACCTACCGTTGTTTCAGGTGTAGGAGCGAGGAGGTCAGACCACATAGATGATATATTGTTTGCCGAGTGGCGACGTTGTGAACCAGAGTGGACTTCGTTGCACGCAGACGAAGTGATTGTTAAAAATCAGGTGTTTGAAAGCATTTTTCAGAAGATACTATCAGAAACTGTCGACGAATATAAAAGAACTGTGCTCAGTAAACCAAGTGATGGATCTGTGCCATGA

Protein sequence:

>DPOGS203916-PA
MDASASNKTNPENNLYDLYLKIENPIYSARVPYNKPVVKYTRLPDFDLNTKESTKSAKSIKSAGSSPKFSLSKKLLNPKENEKECTPKKNKLKEYINIKAPPKTVLDKTKRTKSPPNFNRNKMSLIKDIKDKVRPVSVIHVPLDDHFRKRDIGTEASEVIMNQAVHSINNVTTAVQAVPDVTFKDVEISTNFTPEKVNQMTSVDEVISKLGSECKDIKDTYDSLEIPNVSLINSKTDSSKDLDIEKKINNIPNSYVKEEKEHDTDIVRTPNIDNASSNVSDKDKDTVYHKSTSYIIGRATLTYTTRQKINFHLVENNEVLHSRISPRSLNYPLNVVSVLKKEISNQKYNNDDKLNYNSQDSKSDDETETRLTPINRSYALKSEKLVKPSDIISTIKLNNNLLHRDLCGQFQRELNFIDSFFESLQYLDSCSLSDRSITEKKVENWISGGANEVKNFEFGSFLSQFENEFNIDNPKTMASESLCLLNFLIQNEQIKAEYLLDALKMREDALKHFTKSQILWLEDKKKHDHTDIPTLKKKQRGAIIKLQHECGEMQRMRKALLALSEQRKLALKKTKKNIELKLRNSSDVEQIILGKKKLRRNVSTDRNNAPLKCFELSSSGCDDSTTSRPKSNASIVLNDLKAVTSAEKCVQTGESLPLTSDQSTNTFDENFVVVDGSYLNIVFQNLSPQIFSAGKQYEVNKDALKNIVDKSNMHNINQNNEVALEEFMDHIKNHELESSSPSTARSLVDEFDQIYKTYSDDDISYEVGRALVDDIKEVQVSPEAGNIKNDVKTLAVPSGGVDRSVSVDECCSCEPLVASVGIQVSKGKEVAQTSVTGPLPIPAGAASTDDVSSEVATWLTQRSSVKSTGASSSSSQSHNPSISSLSSPVQYEAEELRRQQLAIEREIKALEQQQCQLLALREIPDKPPPPYTPPTEPRPLKSLNKFIADDINEQKIHKLLFQPGKQLGETDVFEVFVKDFCQESIERQKLDRSDKYWDTCNMIPVKPKPDKEKLVKKAAADLKEVLSDVPPTVVSGVGARRSDHIDDILFAEWRRCEPEWTSLHADEVIVKNQVFESIFQKILSETVDEYKRTVLSKPSDGSVP-
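
Length check (illustrative sketch, not part of the original record): the listed transcript length (3315 bp) and protein length (1104 aa) are consistent with a coding sequence that ends in a single stop codon, since 3315 / 3 = 1105 codons, one of which is the stop. The Python below shows that arithmetic under the assumption of the standard nuclear genetic code; the function names `expected_protein_length` and `looks_like_complete_cds` are hypothetical helpers introduced here for illustration only.

```python
# Stop codons of the standard genetic code (an assumption of this sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def expected_protein_length(cds_bp, has_stop=True):
    """Return the number of amino acids encoded by a CDS of cds_bp bases.

    If has_stop is True, the final codon is treated as a stop codon
    and is not counted as an amino acid.
    """
    if cds_bp <= 0 or cds_bp % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if has_stop else codons


def looks_like_complete_cds(seq):
    """Rough check: starts with ATG, ends with a stop codon, length divisible by 3.

    Whitespace and line breaks are removed first, so a sequence wrapped
    over several lines can be passed as-is. Internal stop codons are not checked.
    """
    seq = "".join(seq.split()).upper()
    return (
        len(seq) >= 6
        and len(seq) % 3 == 0
        and seq.startswith("ATG")
        and seq[-3:] in STOP_CODONS
    )


# Lengths taken from the record above.
print(expected_protein_length(3315))  # 1104
```

Passing the full nucleotide sequence of DPOGS203916-TA (both wrapped lines joined) to `looks_like_complete_cds` would test the same property on the sequence itself rather than on the listed lengths.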