Monarch geneset OGS2.0

DPOGS210538
Transcript         DPOGS210538-TA   3354 bp
Protein            DPOGS210538-PA   1117 aa
Genomic position   DPSCF300304 - 171960-179005
RNAseq coverage    194x (Rank: top 48%)
Annotation
Heliconius       HMEL009552         0.0      69.62%
Bombyx           BGIBMGA009857-TA   9e-113   30.82%
Drosophila       Myo95E-PE          4e-166   36.49%
EBI UniRef50     UniRef50_D6WBX6    0.0      41.47%   Putative uncharacterized protein n=2 Tax=Tribolium castaneum RepID=D6WBX6_TRICA
NCBI RefSeq      XP_971077.2        0.0      41.12%   PREDICTED: similar to unconventional myosin 95e [Tribolium castaneum]
NCBI nr blastp   gi|270002310       0.0      41.47%   hypothetical protein TcasGA2_TC001321 [Tribolium castaneum]
NCBI nr blastx   gi|270002310       0.0      41.24%   hypothetical protein TcasGA2_TC001321 [Tribolium castaneum]
Group
Gene Ontology     GO:0005524   2e-140   ATP binding
                  GO:0016459   2e-140   myosin complex
                  GO:0003774   2e-140   motor activity
KEGG pathway
InterPro domain   [13-737]     IPR001609   2e-140    Myosin head, motor domain
                  [894-1094]   IPR010926   1.9e-24   Myosin tail 2
Orthology group   MCL10069     Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210538-TA
ATGGCGGAGCCCGAGCCTGCGCCCGCCGGCCTCGCCGACGCCGTGCTACTGGCGCCACTAACCGAGGATTCTTTCCTGCACAACTTACATGTTCGATATAAAAGAGATATCATATATACGTATGTGGGCAACGCCCTGGTGTCGGTGAACCCTTGCCGTCCACTGCCGCTGTATTCGGCCGAGCTTGTTCGGACCTACCTGGCCCGGCCTCCCTACCTGCTACCACCACATCTATACGCGATAACGGCTACGGCTTACCGCTGGGTCCGAGACAGGAACGAAAATCAGTGCATCGTTATAACGGGAGAGAGCGGGTCGGGCAAGACGGAGGCGGCGCGCGTGTGTCTGCAGTGTGCCGTTGTGGCGGGCGGCGGTGGCGGGGGAGGAGCGCTGGCGGCCGCGGGGACTCTGCTCGAGGCCTTCGGGAACGCGGCCACCGCTCGCAACCACAATGCCAGCCGCTTCGGCAAATTGTTAGACATAGAGTTCGACTTCAAGGGCGAGCCGGTCGGAGGACACATAACACACTACCTTCTGGAAAAGGAGCGTGTGTGTTGCGCGGCGGAGGGCGAACGGAACTTCCACGTCCTGTACCAGCTGCTGGCCGGCGCCGACGCTCACCTGCTCAAGCGTTTGAAGCTGCAGCGGTCGTGGGAGCAATACCGTGTGCTGAGGGGCGGCACGTGCTCGGGCGCGTCCCCTCGGCGGGCCGTACTGCCGCCGCCGGGGCCGCGAACGCCTCATGCAGACAGGGACCACTTCGCCTTCACCAAGGCGGCGATGGGCGCTCTTGGCTTCAGCGCGGGTCAGTGCGGCGCGGTGCTGCGAGTGTTGGCCTTCCTCCTCAAGCTGGGGAACGTGCAGTTCGAGCCCCAGCACAACATCGACGGCTCCATCGGCACTCGCCTCCAGCACGAGTACGAACTCCGCGAGGCGTGTGCGCTGGTCGGGGTGGACGCGGACGAGTTGGAGCTGGCCCTGGGCGCCGCGCCGCCCTCGGTGCCTCGCGATGACCCTCACGGTATCGAGTGCGGCTCGGAAGCGGGGTCCGAAACGGAAGGCGCCGGTGCCGAGTGGGCCCGGGCGCTTCGTGACCGCCTGCTGTGCGCTCTCTACTCTCGGCTCTTCAACTGGCTCGTGAGCGCCGTCAACGAAGCGCTGAAGCCGGCGTCGGTGGGAGCGCGCCGCTCGCTGGGCATCTTGGACGTGTACGGACTGGAGGCGCTCGCTCACAACGGACTCGAGCGCCTCCTCATCAACTACGCCGCCGAGCGCGTGCAGGCGGCCGTGACGGCGGCTACGCTGCGCCGCGAACAGGAAGAGTACGCGCGCGAGGGACTGGCCTGGGCGCCGCTGGTGTACGCCGACCACGAGATGTACGCGGACCTCTTGGACGCGGGTCCCGAGAGCGTGCTGGGAATCTTGCGTGACTGCACGGCACGAGGAACCGGGGACGCCGTCTTCTTGCAGCGTTTGCAGCGACGCCGTAACCCTCGACTCGTCGTGCTGCCTCCTGATCGTTTTCAGGTGGTTCACTTCGGTGGCGCGGTGATGTACAGCGCGCGGGGGATCGTGGCTAAGAATCGCGACGCAGTTTGCCGTCGGTGTGCCGGAGCCCTGGGCGCCGCACGGGAACCTCTGTTAGCGGCGCTTTTCGCCCCCGGCGCCTCCGCAGCGCCCCCCGCTGGGTACGACCCGGCCGAGACTGCTTCTCATCCCGTTGTGAAGTATGGTAATTATATATCTATACTCATATCTCCGGGTTCCAGCACGGGTTCTCCTCGGCGACCGTGTGCGTTGTCGTGTCGTCAGCGTGCGTTGGTGGGCGCGTTGGTTCGCCGCCTCCCGGCCGCCCCGCGCCTCGTCCGTTGCCTGCGTGCCGACGCCGCGCTGCGTCCGCACCGTTTCGACGCCGCTCTGCTGCGTCACCAGATCCGTACTCAGGGGATCATGGACATGGCGATGTTGCGGCGCGCCGGCTGGTGCGAGTCGATGTGCGC
GCGGGCATTGCTAGCCCGCTACGGATTACTCCGGGGCCGCGACGCCAAGGCGTCCGGTCCGGGGGCGGGCCCGGGGTCGGCCCCCTCGTCTCCGGGCGGCGCGGTGTCGTCGGGGCGGGAGGACGCCGTGCGCGCCGCCCGCGGCTTGCTGCGCTCGCTGCCCATACCCAGCGCGGAGTTCGCGTACGGCCGGACGAAAGTCTTCATCCGCAGTCCGAGAACCGTGTGGGAGTTGGAGGCGCTGCGAGCGGCGCGTGTGTCGTCGCTGGTGTGCGTGGCGCAGCGCGCGTGGCGGCGTCACCGCGCGAGGACTCGTGCGCGGGCCGCAGCCATCATAGCGAGAGCCTGGACGCGCCACCGAACCCGAGAGAAACGTCGCGCCGTCGAGGAGCGGCGAGTGTCCGTGGCGCGAGGCGTGGAGGGCGGCGGCGGAACGGCTCGAGCCGGGCTGCTCGTGTGGCGCTGGTGGTGCTCCCAAGCGCGGCGGGCGTACCTGAGTTCTTTGTGGTCGCGGCTGCCGGCCCGCCACCGCTCGCCCGCGTGCGCCGCGTGGCCGCCGTGTCCGCAGCCCCGGCTGTTGGCGCGCGCGGACGCCCTCCTGCGCCGCCTGCACCACCGCTGGCGCTGTCACCTCTACCGCCGCGCCTTCGACCAGACGGCGAGGAACCGCATGCGCGAGAAGGTCACCGCGAGCGTGCTGTTCAAGGATCGCAAGCTGAACTACGCCCGCAGCGTGGCGCATCCGTTCGTGGGTGACTACGTCCGCCTTCGCGCGTCGGCGGCATGGCGGCGCGGGCCGGGGGCGGGTGCGGCGGACCGCTACGTGGTGTTCGCGGACGTGGTGGGCAAGGTTGCGCGCTCCAGCGGGCGTGTGTCCCGCTGCCTGGCCGTGGTGTCCACGGGCGCGCTGCTGCTGCTGGAGGCGCGCTCGTTGCGTCTCAAGCGCCGCGTGCCCGCTCACTGCGTGTACCGCCTGTCTCTGTCGCCGTTCGCCGACGATCTCCTGGCCGTTCACGTGCGCGCGTGCGGAGGGCTGGAGAGTTCGGTCGAGGAGTTGTCTCAGTGTTCGGTGCGTGAGGCGGCGGACGCTCCCGGCTGCCTGTTCGGGGGCGAGGGCTCGTGGCGGCGGCGCGGGGACGTGCTGGTCCGCACCTGCCACGTGCTCGAGCTCGCCACCAAGCTCTTCCTCGTGGTGCAGAACGCGGTCGGCTCGCCGCCACACGTCAATATCGCCACAGAGTTTGAGGCCAACTTCGGCCAGCAAATGGTGACGGTGGCTTTCCACGCCCTGAGCGGCGGAGAGGCGGGTGCGCGCGTGTTGCGCCGCGGCAGTCGTATGGACGTGCTGCTGTAG

Protein sequence:

>DPOGS210538-PA
MAEPEPAPAGLADAVLLAPLTEDSFLHNLHVRYKRDIIYTYVGNALVSVNPCRPLPLYSAELVRTYLARPPYLLPPHLYAITATAYRWVRDRNENQCIVITGESGSGKTEAARVCLQCAVVAGGGGGGGALAAAGTLLEAFGNAATARNHNASRFGKLLDIEFDFKGEPVGGHITHYLLEKERVCCAAEGERNFHVLYQLLAGADAHLLKRLKLQRSWEQYRVLRGGTCSGASPRRAVLPPPGPRTPHADRDHFAFTKAAMGALGFSAGQCGAVLRVLAFLLKLGNVQFEPQHNIDGSIGTRLQHEYELREACALVGVDADELELALGAAPPSVPRDDPHGIECGSEAGSETEGAGAEWARALRDRLLCALYSRLFNWLVSAVNEALKPASVGARRSLGILDVYGLEALAHNGLERLLINYAAERVQAAVTAATLRREQEEYAREGLAWAPLVYADHEMYADLLDAGPESVLGILRDCTARGTGDAVFLQRLQRRRNPRLVVLPPDRFQVVHFGGAVMYSARGIVAKNRDAVCRRCAGALGAAREPLLAALFAPGASAAPPAGYDPAETASHPVVKYGNYISILISPGSSTGSPRRPCALSCRQRALVGALVRRLPAAPRLVRCLRADAALRPHRFDAALLRHQIRTQGIMDMAMLRRAGWCESMCARALLARYGLLRGRDAKASGPGAGPGSAPSSPGGAVSSGREDAVRAARGLLRSLPIPSAEFAYGRTKVFIRSPRTVWELEALRAARVSSLVCVAQRAWRRHRARTRARAAAIIARAWTRHRTREKRRAVEERRVSVARGVEGGGGTARAGLLVWRWWCSQARRAYLSSLWSRLPARHRSPACAAWPPCPQPRLLARADALLRRLHHRWRCHLYRRAFDQTARNRMREKVTASVLFKDRKLNYARSVAHPFVGDYVRLRASAAWRRGPGAGAADRYVVFADVVGKVARSSGRVSRCLAVVSTGALLLLEARSLRLKRRVPAHCVYRLSLSPFADDLLAVHVRACGGLESSVEELSQCSVREAADAPGCLFGGEGSWRRRGDVLVRTCHVLELATKLFLVVQNAVGSPPHVNIATEFEANFGQQMVTVAFHALSGGEAGARVLRRGSRMDVLL-