Monarch geneset OGS2.0

DPOGS209014
Transcript: DPOGS209014-TA; 3348 bp
Protein: DPOGS209014-PA; 1115 aa
Genomic position: DPSCF300209 + 73738-78947
RNAseq coverage: 58x (Rank: top 69%)
Annotation
Heliconius: HMEL002540; 0.0; 59.77%
Bombyx: BGIBMGA012553-TA; 0.0; 49.22%
Drosophila: CG10979-PA; 2e-54; 28.31%
EBI UniRef50: UniRef50_UPI00021A840E; 1e-60; 37.21%; UPI00021A840E related cluster n=3 Tax=unknown RepID=UPI00021A840E
NCBI RefSeq: XP_966362.1; 3e-60; 31.03%; PREDICTED: similar to CG10979 CG10979-PA, partial [Tribolium castaneum]
NCBI nr blastp: gi|383858024; 5e-68; 29.40%; PREDICTED: zinc finger protein 800-like [Megachile rotundata]
NCBI nr blastx: gi|383858024; 5e-78; 29.59%; PREDICTED: zinc finger protein 800-like [Megachile rotundata]
Group
KEGG pathway:
Orthology group: MCL19882; Insect specific

Nucleotide sequence:

>DPOGS209014-TA
ATGGCTGGTAACAAAATAAACACTAAGAAAAAGGAAGAAAAGGGTAAATCAAACAGAATTGCTGGTCAATCTATTGAGGAAACAGAGGACCTTGACTTCTCCTTGCTGAGAAAACCAATACATACTAGTGTTACAGGCTTTGCTCAAGCAAGAAAAGTTTTCGACTTAGCCACTGAGGAGCTCAAAGGTTTACTCAGCAATGAATGTGACTTATTATATGAATGTAAAGTATGCAGAAATATATTCAGAAGTTTAGCCAATTTTATATCACATAAGCGAGTCTACTGTAAAGAAAAGTTTAATCCCTCTGAACATGGACATTTCATTAAAAATACCTCCTCTCTAAATGAAATTTTGAAAATACGAAAACTCGAAGAGAGCTATCAGGAAATATTAAGAAAAGAAAATGACTCCAATGAAATGGATACAGAGGAGACGGAAGAAAGGATCCCACTCACAAAGGATCTTACAGATATAATAGAAAGGATATCTAAAACTAAAGGAGTACAAAAGAAGCAATTAAAGGAACAAAATTTAGTGTTTCAAAAAATACCAAAAAGTAATGTTGCTGTTTTCCAAAATATTGAAAGTGATGTGAACAAAACTGATACAATGAAGGCTGAGGTTAGTGAATTGGACAAGATGTTGTCTCAAGAGAATGCAGTGTTACAAAGCGATGGCACATTCAAAGTACAAACGAACGACGTACAAATGAATACAGAAAATGTTATACAAATCAGTGACGATGAAGATAATGAAGGTGCACCATACTCGGTCAAACATGGTGTACTGAAATGTGAAATTTGTGATTTACAATTTTCAACTCAGAAAACCTTAAAGTTCCATATGAAATATAAACATTTAGAGAGCCGTTTGGTCTATCCCTGTCCCGATTGCTTGGATATCTTCTCAACATCCTGGAGTGTTTATAGACATCTGTTCAAAGTACACAGAAAAACAGCTGCTCAAATCCGTCGACTCCGAGAGTCTATACAAGCTAAGGCATTTAAAATGAACAACCCGCCAGCATTTTACGAGAAACGGAAGTCCGTTTTGAAAAATTTTCCAGCTCAAAAAATAACAGAGGAAGAGAGAATCTATCAAGAGAATCAGTCTTGGGAGTTGGAGGTGGAAGGCGAAGGTCGTCGTTGCGGTGGTTGTGGGCGGTCGTTCGAGCGTCGAGCCGCGCTCGCAGCACACGCACACACGTGCGCTAGGAGACACACACGAAGAATACAGATACAGATTAGGAAGGACTATCACAAGGAACAGAGCGCGCCGTACCTCGTGATGAACAGAAACAATGAAAATAAACCATCCGAGGAAAAGTCTGAAAAGCCGCCAGAAAAAGAAATCAAAGAGAAAGAACCGAAACCGCTCGAGGAAGCTGTGATGAGTACAGTTGATGTGAAACAACAAGAGACCGAGGATTACAGAGATGACGACACTCAAGACGTGCCAGCTGGGAACACTCTACAGTATTACACGAATACGCTTATAAATAAATTACCATTCGCCCAACAAGCGGAGAAGAGCAATCTGAACGCGTTCAAAAAGAGATTACAATCAGATGTCGAAATAGATCAACTTTTATGTAAAAAATGTAATAGTAAATTTGAACAAATAGGTGAATTATTAGAACACGTCGCTGGACATTATAAATGGTTGCGCTACGCCTGTAAACTTTGCAACTTCAAGCATTTCAACTTTGATAAACTCCCGGAACACGTTAAAGTTGTCCACAAACTCAAAGGCGATACTGATTTCTACTATAGTACCGTAAAAGCCATAGACGGTTCGGAAGCCAGCGAACTATCTTCCCCCGTGGAAGAATTAACCGAATCTAATGAAACTAGTCCAGATTCACGACGTCCAAGCAGATGTTCTAGTGACTCCAGCAGATTATCTGACGATAGCTCCTCCAGCAGTACACGAGTCGAAACCGGTTCGAGAAAACGCAAAGCACGACTGGTCAAAAACATCGGAAAGAAGAAAAAGGA
TACTGTTGTTATAGATGACAACGAAGAAAGTAAAGAGGTTATGCATAAAGGAGTTTTGTTAGGAGAAAATGATTCGTCCTCCAATTCAAAAATATTCGAAGAAAATTCATCAGATTTGGATGAAGTTGATGAGAAAATAGCAAAGCGCGAAAACATGACATCCGTAGCATGCCGTAGACCAGTTCGTAAGAAAACTAAACGCAAGAACGAAGATTTCGAATACGATCTGTCGAATTTGTTAAAAATGGAAGCGCAGGGCTATCGCGATTCACAAGTCACACCAAAAACTGCTCCTTCTAAGAAGAAAGTACAACAAGATGTTAACCCTCAGTACGAGCTCATCAACAAAGAGTGTTGTGGTGCACTAGTGACGATGTCGAGGTCCTCGGTAGAAAAAGCTCAAGCCCATATGAAGACTGCAACCTTTGCTGTGTTTAACACTTCAAAAGAACCTCGTGTATCAAATATTTTTGTGAGGCCTCTGGTGCCTAAAATTAATAGAGTAGATAAAATATCGCCTAAGAAGGCTGAAAATGAAGAAACAAAAGAAATCTCCCACCCTAGTCCCACTAAAATAATAGACGCCTCCACTCTATCAAATCTCTGTAAGGAATTGGTGATAACTAAAGTTGTAAATAAAAAATGTGAGGAAAAAGAAGCAAATGTATCCGCTAATGAAACTCCAAAAGAAACCGAGCCGATACCTCAAGTAGATAATAAAACGGTCAGCGACGACAGTAAAGAGAAAAAGGAAGAAAAGAATAAGGTATCTGAAATAGAAGCAAAATCTGACGAGAGTGCCTCATCCGAACAAACTAAAACTAATGTGAATGTACCAACAATACTTCCTATAAAATTCCGAAGACAAAGTTTGGAGGTTATACAAAATCCCTTAATAAAGAAAAATATCACAGACTTCACAAAAGCCGGTATGAAAACTAAAATTTTGGTAATCAAACCCATCAATAGGAGCACCGATGGAACAAAAACACTGAAATTTCAAACAATAAAATTGAAAGATCCGAACAAGACCACCACGAAAAATGATGAAATGAAAACCGAACAGGTCGTCGTTGTGAAAGTTCCCAAAGTGGATTGTTCTATAAGCAGATCAATACCAGCCAGCGACGCCCCTGTGGCACTCGACGAGAAATGTGATGAGAATGAAAACGAAAAAGTTAAAACGAATGCTGCAAATCCATCAAATCCTACCGGTGAAAACAGTGTGGAAGAACCTAAAAAAGACATTAAAATAGAAAATGACATAACTGACTTGGTGGAAGACAAACCCGAATCAAAATTAATAGAATGTATAGAATTGGAAGAGGCCGTGATGCAATCTGGTTGA

Protein sequence:

>DPOGS209014-PA
MAGNKINTKKKEEKGKSNRIAGQSIEETEDLDFSLLRKPIHTSVTGFAQARKVFDLATEELKGLLSNECDLLYECKVCRNIFRSLANFISHKRVYCKEKFNPSEHGHFIKNTSSLNEILKIRKLEESYQEILRKENDSNEMDTEETEERIPLTKDLTDIIERISKTKGVQKKQLKEQNLVFQKIPKSNVAVFQNIESDVNKTDTMKAEVSELDKMLSQENAVLQSDGTFKVQTNDVQMNTENVIQISDDEDNEGAPYSVKHGVLKCEICDLQFSTQKTLKFHMKYKHLESRLVYPCPDCLDIFSTSWSVYRHLFKVHRKTAAQIRRLRESIQAKAFKMNNPPAFYEKRKSVLKNFPAQKITEEERIYQENQSWELEVEGEGRRCGGCGRSFERRAALAAHAHTCARRHTRRIQIQIRKDYHKEQSAPYLVMNRNNENKPSEEKSEKPPEKEIKEKEPKPLEEAVMSTVDVKQQETEDYRDDDTQDVPAGNTLQYYTNTLINKLPFAQQAEKSNLNAFKKRLQSDVEIDQLLCKKCNSKFEQIGELLEHVAGHYKWLRYACKLCNFKHFNFDKLPEHVKVVHKLKGDTDFYYSTVKAIDGSEASELSSPVEELTESNETSPDSRRPSRCSSDSSRLSDDSSSSSTRVETGSRKRKARLVKNIGKKKKDTVVIDDNEESKEVMHKGVLLGENDSSSNSKIFEENSSDLDEVDEKIAKRENMTSVACRRPVRKKTKRKNEDFEYDLSNLLKMEAQGYRDSQVTPKTAPSKKKVQQDVNPQYELINKECCGALVTMSRSSVEKAQAHMKTATFAVFNTSKEPRVSNIFVRPLVPKINRVDKISPKKAENEETKEISHPSPTKIIDASTLSNLCKELVITKVVNKKCEEKEANVSANETPKETEPIPQVDNKTVSDDSKEKKEEKNKVSEIEAKSDESASSEQTKTNVNVPTILPIKFRRQSLEVIQNPLIKKNITDFTKAGMKTKILVIKPINRSTDGTKTLKFQTIKLKDPNKTTTKNDEMKTEQVVVVKVPKVDCSISRSIPASDAPVALDEKCDENENEKVKTNAANPSNPTGENSVEEPKKDIKIENDITDLVEDKPESKLIECIELEEAVMQSG-
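
Consistency check:

The transcript and protein lengths listed above can be cross-checked against each other and against the two FASTA records. The following is a minimal sketch, not part of the original record: the function name `translate` and the constant names are hypothetical, the standard genetic code is assumed, and the stop codon is written as "-" to match the last character of the protein record. Only the first and last few codons of DPOGS209014-TA are used as input here.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (an assumption; the record does not state which code table was used).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon; stops become stop_symbol."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    residues = []
    for i in range(0, usable, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# Values copied from the record above.
TRANSCRIPT_BP = 3348
PROTEIN_AA = 1115
GENOMIC_START, GENOMIC_END = 73738, 78947

# 3348 bp is 1116 codons: 1115 residues plus one stop codon.
codons = TRANSCRIPT_BP // 3
length_consistent = (TRANSCRIPT_BP % 3 == 0) and (codons - 1 == PROTEIN_AA)

# Span of the gene on scaffold DPSCF300209, inclusive coordinates.
genomic_span = GENOMIC_END - GENOMIC_START + 1

first_residues = translate("ATGGCTGGTAACAAAATAAAC")  # first 7 codons of the transcript
last_residues = translate("CAATCTGGTTGA")            # last 4 codons of the transcript

print(length_consistent, genomic_span, first_residues, last_residues)
# prints: True 5210 MAGNKIN QSG-
```

The translated ends match the start (MAGNKIN) and end (QSG-) of DPOGS209014-PA. Passing the full nucleotide record to `translate` after joining its lines would allow the same comparison over the whole protein.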