Monarch geneset OGS2.0

DPOGS205917
Transcript: DPOGS205917-TA, 1998 bp
Protein: DPOGS205917-PA, 665 aa
Genomic position: DPSCF300089 + 381087-388705
RNAseq coverage: 335x (Rank: top 34%)
Annotation
Heliconius: HMEL005511, 5e-112, 77.66%
Bombyx: BGIBMGA007018-TA, 1e-156, 51.93%
Drosophila: Sap130-PC, 5e-24, 37.16%
EBI UniRef50: UniRef50_D6WNN5, 1e-35, 34.75%, Putative uncharacterized protein n=1 Tax=Tribolium castaneum RepID=D6WNN5_TRICA
NCBI RefSeq: XP_397260.3, 1e-38, 37.37%, PREDICTED: similar to CG11006-PA, isoform A [Apis mellifera]
NCBI nr blastp: gi|91082015, 4e-35, 34.75%, PREDICTED: similar to sin3a-associated protein sap130 [Tribolium castaneum]
NCBI nr blastx: gi|345495532, 7e-42, 26.53%, PREDICTED: hypothetical protein LOC100678030 isoform 2 [Nasonia vitripennis]
Group
KEGG pathway:
InterPro domain: [532-663] IPR024137, 8.3e-25, Histone deacetylase complex subunit SAP130
Orthology group: MCL25304, Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS205917-TA
ATGAGTGGTCATTTAGAAAATGATAGGTCTGTTACAACTGGTAAAATGTATCCTATTGACCTAGCACCTCAAAAAATAACTATAGTAAAGAGTATGTCTAATGCTGAAGTGAAAATGGCTCATTTAATCCCAGTCCAAACAAAAACGACCAGCTCCAATCAAATAATCTCTCACGGTAACACTGGTTCAATGGGATTAATGCGTACAGGTACCCAAATCATTTCCCCAGGCGTCACTTCTCAGCCCCAAATGATAGTTAGCGGGTCACCAATACTACAAAGCACACAGATAGTCAGCCAAGGATCCCAACTTCTTGGACCAAATGCACAAATGTCGCAGCCTCAATTGATTCCCGGTGGTCAAATTTTAAGTCCCGGTACCCAAATAATAAGTCAGGGTACACAGCTGTCGGCTCAAGTATCAAGTAATAATAATGCAGTTTCTAGTACTGTACAGTCTAGTAATCCCTCCACTGCAAATAGTGCACAGTTGTTAAATGTTGGTGGCTTAGTTAGTGGATCTAGCAATTTAGTGGTTAGCTCCTCTGTGCGTACTCTGCCTTCCAGTGTAAGGGTGTTACCACCTATGCAACACACTAGTAATAGGCCGGTTCTCTCCAGTGTTAATGTTAGCAGTGCAAGTGGAGTTCTTGTGAGTAAGGGGGTGACAAGTCATGTGCCCCGTGGTTTAGCAGCGGGAGCTTCCCTCGCCGTGCGTCCCGTAACAAACACACAGCCAGCTAACACTCAAGGTTGGTCGAGCAGCCGCGGCCGCGGGCGTGCGCTTGTATATGGCTGTAGGGCTCGCTCGCCCGCTCCCCGCGCCCCCGCACCCAGTTGTACTACCATACCCACAACCACTGTACTAACATCTACCGGTGTTATATCAAGCACTGTCCGTCAAGCGCCCCGCGCGCCCGTCCCCACTACACCGCCGGCTACGTCCGCTCGACCATTGCCTTTATTGCAAAGAAACTATCAACCTACTAAAGTGGTAGGTGTAGCCAGTGTTGGTATGCGTGGCGTGGCTAACAGCGCCCCCTCTCAGTTGTATTACGAAGTGCCACGGCCACAACAGTCTCAACCACTGCAAGTGCCACTACAGCCGCAACTGACAAGATCATTGACACCATATGCACATGCACAAGGATTAGTGAGCATAGTGAATGCCTCACAGTCTGATGTACGTCAGCTATCTTCTTCCATTCAGAACTCCGCACCGCGACCTTCGATATTAAGAAAGAGGGACATTGATGGGTCGCCAACGAAGTCGTCAATATATTCTGAGGGTACGGGATGGGAGGATGTTCCCAGCGGTTCAGGGTCAGGGTCGACAACGATATCAGCGGCGTCTTCTCCACGGGATCTGGACCTAGAGCCGGGACCGGAGCCCGAGCCGGTACCGGAGCCCGACCACGACCTGTCCCCGAGGAAGAAACCCCGCAAACAAATATTGAGCAATGAAGTTAGACAATGTGAGTTCCCAGCGGAAGACACGCCGCCCTCACCGCCGCCAGCAGCCCCGGCACCGCCACTACCCAAACGTCCGTCACTGAGCTCGAGCTATGTGTGCGGCTGGCGGAGTACAGCGTTACACTTCACCCGGCCGTCGGATGTTCGCCGCAGAGAGCCGAGAGCCCGTGACATAGTTAGCATCGCGGCCCAGAGACACGTGCTCACCAGCGCTGAGGGCTGGAAGGTACATCATTTGACAGCACAAATGGACGACCTGGTGTCTCTAGAAGCGGATGTGGGTGAACAGTTGGCGGGAGTTTTGGGCGCAGTGGCGGCGAGGGACCGTGGACCGCTTCATGCACTGCAGCACACACTACTGGAACTTGTCAAGGGTAATATTCAAAGAAGTAAAATCGTGTGCGAGGGTATACAAGAAGCCCGGGAGGATATCCTGCGAGTGTTCAAACATCGCAACTTTGTTTCCGACATTTTGACTCGACAGGCCGACAAGCGATGTTTCAGGAAGCATAGATCGCAATCATAG

Protein sequence:

>DPOGS205917-PA
MSGHLENDRSVTTGKMYPIDLAPQKITIVKSMSNAEVKMAHLIPVQTKTTSSNQIISHGNTGSMGLMRTGTQIISPGVTSQPQMIVSGSPILQSTQIVSQGSQLLGPNAQMSQPQLIPGGQILSPGTQIISQGTQLSAQVSSNNNAVSSTVQSSNPSTANSAQLLNVGGLVSGSSNLVVSSSVRTLPSSVRVLPPMQHTSNRPVLSSVNVSSASGVLVSKGVTSHVPRGLAAGASLAVRPVTNTQPANTQGWSSSRGRGRALVYGCRARSPAPRAPAPSCTTIPTTTVLTSTGVISSTVRQAPRAPVPTTPPATSARPLPLLQRNYQPTKVVGVASVGMRGVANSAPSQLYYEVPRPQQSQPLQVPLQPQLTRSLTPYAHAQGLVSIVNASQSDVRQLSSSIQNSAPRPSILRKRDIDGSPTKSSIYSEGTGWEDVPSGSGSGSTTISAASSPRDLDLEPGPEPEPVPEPDHDLSPRKKPRKQILSNEVRQCEFPAEDTPPSPPPAAPAPPLPKRPSLSSSYVCGWRSTALHFTRPSDVRRREPRARDIVSIAAQRHVLTSAEGWKVHHLTAQMDDLVSLEADVGEQLAGVLGAVAARDRGPLHALQHTLLELVKGNIQRSKIVCEGIQEAREDILRVFKHRNFVSDILTRQADKRCFRKHRSQS-
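
The transcript and protein records above can be checked against each other: 1998 bp corresponds to 665 codons plus one stop codon (3 x 666 = 1998). The following is a minimal sketch of that check, assuming the standard genetic code applies to this nuclear transcript. The function name `translate` and the use of `-` as the stop symbol (to match the trailing `-` in the protein record) are illustrative choices, not part of the database.

```python
from itertools import product

# Standard genetic code, laid out in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence in frame 0 (hypothetical helper).

    Stop codons are written as `stop_symbol`; trailing bases that do not
    fill a complete codon are ignored.
    """
    seq = nucleotides.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


# First nine codons and last five codons of DPOGS205917-TA, copied from the record.
start_fragment = "ATGAGTGGTCATTTAGAAAATGATAGG"
end_fragment = "AGATCGCAATCATAG"

print(translate(start_fragment))
print(translate(end_fragment))
print(3 * (665 + 1))  # expected transcript length in bp
```

Passing the full DPOGS205917-TA sequence to `translate` and comparing the result with DPOGS205917-PA would extend the same check to the whole record.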