Monarch geneset OGS2.0

DPOGS205861
Transcript: DPOGS205861-TA (4215 bp)
Protein: DPOGS205861-PA (1404 aa)
Genomic position: DPSCF300081 + 559689-566851
RNAseq coverage: 660x (Rank: top 19%)
Annotation
Heliconius: HMEL009971; 4e-112; 50.74%
Bombyx: BGIBMGA009384-TA; 1e-48; 46.80%
Drosophila: CG8370-PA; 7e-34; 24.02%
EBI UniRef50: UniRef50_C3XX73; 4e-81; 26.80%; Putative uncharacterized protein n=1 Tax=Branchiostoma floridae RepID=C3XX73_BRAFL
NCBI RefSeq: XP_970963.2; 7e-88; 24.69%; PREDICTED: similar to RW1 protein [Tribolium castaneum]
NCBI nr blastp: gi|189236393; 1e-86; 24.69%; PREDICTED: similar to RW1 protein [Tribolium castaneum]
NCBI nr blastx: gi|339899195; 5e-60; 31.87%; putative proteophosphoglycan ppg3 [Leishmania infantum JPCM5]
Group
KEGG pathway 
InterPro domain: [13-85] IPR022113; 3.9e-12; Protein of unknown function DUF3651, TMEM131
Orthology group: MCL13150; Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS205861-TA
ATGCAGACGCTGCCGCCGCAAGCCAACACGACCTTCAGCGTGGTGTACCTCGGCCGGCGCGAGGGTCCCGTCTCTGCACACCTCTATATACACACGTCGCTCGGTGTTCATAAACTCCCCGTGTCGGCCGTGGGCGTGGCGAGCGAGTACGACGTGTGGCCGCTGGTCGGACTGCGCGTACCGATGAACGCCAGCGTGGAAGCCGAGCTCAGGTTCCACAACCCTACGGACGCGACGGTGCAAGTGAGCGAGGTGTACTCGACCGGCTCCTGGCTGGGTCTCCGCCTGCCGGGCGGCGGGGAGCGCGCGCCGCGAGACCTGTGGACCGTGCCGCCTCGAGCCACCCGCGCACTGGTGACTCTCCGCCTGGCGCACGCCGCACACCGCGCCCTGGCCGACCACAACCGACCGCTCACGGCGTACATCCGGGTGCGCGCGGACATCCCCGGGGGTGGGCTGGTGGTGGCGGTGGAGGCGCGTGGTGCGCCGGCAGGGGAGCACGCGCGGCCGCTCCACGTGCGTCTGCGGCCGAGAGGATCCGGAGATGCCGACTACACGGTGGAGCTGGAGGTGGCTAATAGTGACGGGGCTGGAGTGTCGCTGGAGGGAGCCGTCAGCGCGCGCTGCTCACGGACTGCACCGCCGCCCTGCGTCCCGCCCCCGCGAGCCCTCCACGACACTACTAAAGCCAATGGCGGCACGGCGCCGGGCGCCGTCCTCACCATGCTGCGGACGCATCTCGAAGCCCACCAGGAGTTCGTTCGCGTGGCGCGACTCAAGCTGGACTACGCCGAGCTCTGGTCGCAGGTGTCTCAGGCCGGCGATGCGGACTCGCCGGAGGAGGCTTGGTGCTCCGGCTGCGTCCGTCTGGGCCGAGCGCTCGTGCCCTTCAGCGTGCAGCTGCTGCCCGGCAGCCTGAGCTTCCAGCCCGAGCATATAGACGTCATCACCGCCGAAGAGGACATGATGCGCGAGCGAGAGGTGCGCGCCCACAACACCTTCAGACGCGAGCTGCGCGTGCTGGCCGTCACCGTCACCGAGGACGTCAAGCGATTCTTCCATGTGGTGGACCACACTCCGCTCCGCCTGTCTGCGGGCGCCTCCGGGAGCCTCCTGCGACTGGCGGCCCGGGACCCCGCCCCGTCCTCGGCCACCGGCCTCGCCGCCGACCTCACCATCCACACGGATCTCGCCGACTACACGCTGCCGCTTTTCATGTATAGCGGAAGACTGCTGCTCGAGTGGGAGTGGGCGAACGTGACGTCCTCGCACGTGGAGCTCGGCACGGTGGGCGCTTCGTGGACGCGGCGGGTGGGCGTGCGCGTGCGGAATCCGGCGCCCGCCTCGTCGCTGTGCGTGAGTCGGCTGTCGGCCGTTTTGACCGGGGCCGCCGCCTCGCTGGCGCTGGCCGAGCCGCGGCGGCGCTGCGTGCCGGCGGGGGGCTGGACGCGCGCCTGGCTCACCGTGGTAGCGCCGGCCGCAGAGGGCGTGCTGCGCGGGGAGGCCACGCTCGACACCCCACACGCCACCACGCGGGCGACCCTGTCGCTGCGCGTGCGCGCCGGCTCGCTGCGCGTGTCTCCCCCGGCCGTGCCCCCCGCGGCGCCCTACGCGAGTTCCTGGTCAACTATAGAGGTGGAGAGCAGCATGTCGGTGGTCATGAGGGTGACGTCGGTGGACCAGCCCGCACCGCCGGACCCCGCCCTTTCGTTCACATGGTCGGGACCGGGCCTGGGCGAGGTGTCCCCGGGTCGTCAGTCCGTGGCCCGCCTGCGTGTCTCTCCAGAGGAACACTGCCGCCCCGACTGCTACACGGGCCTGTCGCTCGACACCCCAGAGGGCGCGGCGTGGGCGGCCCTCGGGGCGCGATACGACGAGAGCGCTCAGCGTGAAGACATGGCATTGCTTCAACGACGATTGGCTCTGTTCCGGGAACAGGCCGCCGGCGGGAACTTCACGCTGCACGTCCACACGGACGAGGTGCTGCAGCTGGCGGCCGA
CGGAGTGTGGACCGCGCACTGGCCGCGACTGGCCGAGCGCGGGGCTGGCGGCGCACTCAGAGCCGGCGTGGGCCGCGCCGCCACGCTGGTGTTGACCGTGCGCTCGCCCGCCACCGTCCCTATGCTGCTGCACGCCCTGCTGCCGCCGCACACCGCGCCGCCGCATCCTCCTCTGCCGCCCCTGGTCGGCGGGGAGGGCGACCACTGCACGAGTGACGAGTGCACGTGGTCCTCCAAGGCGTGGCGTCTGTCCGGGTGGAGGGTGACGCGCGGGGCCGTGAGGCTATGGAACGAGTCGGACTCCGAGCTGCACCGCCGCGGCGTGCGGGACGGCGGCCTGCTGCTGGCGCCGCGGACCGAGCTCGAGCTGAGCCTCTGGTTCACGCCGGAGGACGCCGGCGCCTTCACCGCCTACCTCTACCTCAGGAATAACCTGACCGTGCTGGAGGGCGTCCGGCTATTCGGCGAGGGCGAGTACCCGAGCTTCGAGCTGGCGGGGCGGCGGCCGGGCGCGGCGGCTCCTTTCTCTTTCGAGGTGAAGGAATGTTCGTGGGGCGGCGGCGGGGCGGTGGTGCGGCGGGTGATCGCACGGAACGCGGGCCGCGTCACGGCCGCCCTCGGGGGCTGGAGAGTCGCCCGAACGGGCTGTGTGGCCCGCGGCTTCCGAGTTTCGCCCTGCGCCCCCTTGGCTTTGGCGCCCAACGCTTCGGCTCCTGTACATTTAGCCTTCTCGCCGGACTACTCGCTGGCCCGGGTGTCGGCGGCGCTCCAACTGGACACCGACCTCGGTCCTGTGGAGTTCACGTTGCTGGCCACCGTCCCGGCGAAGGTGCTCGCTAAATGTGCCGCGAAGACCCCTCGTCCGCCCTGGGACGGCATGATGAGAGGTGTCTGTTTGTTGGTCTCCATCGCGGCCTTCGCTCTCGTCCTGGGCGCCGGGGCGCTGGACGCCGAGCGCCTGTTGCGCAGAGCACGCGCTTCTCGCGCACCCGCTCCGCCGCCGCGTTCCGCTCCCCTGGACCTGCGACGCCTGGCCGCCGAGCCAGCGCCCGCCCCCGCCGTCCCTCGCGCTCCTCCCCGCCGGCGCCGCTCCACTCGCCGCCCGCTGCCTTCGCTAGACCCTCGCGCTGAGCGCCGCGCCTTCGAGCGCTGGCGAGCCGGCGTCTTGCGCGGAGACGACGACTCCTCGCGTTCTAGCGAAGACGTCGACCTCGACGAAGTCTCCCCCGCGTCTCCGTCAGAGCGCTCCTCCGCCGACCGCCCGGCCGAACCCGGACCGGAACCTTCCTCGAAGCGGGCGGAGGGTGAGCCGCCCAAGGAGGAGGAGTCGCCCGCAGCAGACGAAGAACCCAACTCGACGGGGTCAGACGCCTCGACGCCGGTCGAGGACCGGGACGACGAGCCCTACGGCGGAGACGTGGAGCCCGACCCGAGGGATGACTCGCCCACGGACGGGCCAGAGGAGACGCCCGTCGCGACGCCCATTGTCCGACCCATACCGCGCTCGCCCCCGGTAGAGACCGCGACACGGGACGCTCGTCCGCGGCGGGAGGCCTCGGAGAGCTCGCGGCGGATAGAACGATCGAGAACTGCGGACGGCGGCGACCGGCGGCCCGCGCCCGCTCGCCACACGCGTAAAGAGCGCGCCTCCAAGCGCCGCGGGGAGCGTCGACAGCCGAGTCCGCCGCCGCAGTCACCGTCGGGCGGGCCCGGAGCCCCGGGGGCGCCGGCGGAGGCGGGCGAGGCGCGCGGCCTTCGGTGGGGCGCCTCGTGGAGCTCGGTGGTGGCTTCCCGCGGGGCCCCCCTAGCACCTATCGGGTCCGACGTTCGGCGCCGGGAACCCGAGCGCCCGGCCGACAACTCCCTCTTCTACTTCAACGGCACGTCGGAGTCTCCGCCCCGGCCGGAGCCCGAGTTCTCCTGGCGGCCGCCGGTGTCCGCCGACCGCCCCGCTTTCGTCCCCGCCGCTAGAGACTTCATCGAAGAAGCTCCATCTTTGG
TGTGGGGCGGTGCCTGGGGCGCCTGGGGCTCGGGCGGCCTACGCCCCCCGCCCGGCTTCTCCCCCCGGAGCCCCCCCCGCAGCGCTCGTACGATCCCTTCCGTTCCCTGGCGTCCATCTGGGCGCCGGCTCCTCACGACTGGCGCGCCGCTGACCCTCGGCGGGAGGACGCCGACCGGCCGTAGCCCGACGAGTGAACCCGCGCGGCGGCGCTAG

Protein sequence:

>DPOGS205861-PA
MQTLPPQANTTFSVVYLGRREGPVSAHLYIHTSLGVHKLPVSAVGVASEYDVWPLVGLRVPMNASVEAELRFHNPTDATVQVSEVYSTGSWLGLRLPGGGERAPRDLWTVPPRATRALVTLRLAHAAHRALADHNRPLTAYIRVRADIPGGGLVVAVEARGAPAGEHARPLHVRLRPRGSGDADYTVELEVANSDGAGVSLEGAVSARCSRTAPPPCVPPPRALHDTTKANGGTAPGAVLTMLRTHLEAHQEFVRVARLKLDYAELWSQVSQAGDADSPEEAWCSGCVRLGRALVPFSVQLLPGSLSFQPEHIDVITAEEDMMREREVRAHNTFRRELRVLAVTVTEDVKRFFHVVDHTPLRLSAGASGSLLRLAARDPAPSSATGLAADLTIHTDLADYTLPLFMYSGRLLLEWEWANVTSSHVELGTVGASWTRRVGVRVRNPAPASSLCVSRLSAVLTGAAASLALAEPRRRCVPAGGWTRAWLTVVAPAAEGVLRGEATLDTPHATTRATLSLRVRAGSLRVSPPAVPPAAPYASSWSTIEVESSMSVVMRVTSVDQPAPPDPALSFTWSGPGLGEVSPGRQSVARLRVSPEEHCRPDCYTGLSLDTPEGAAWAALGARYDESAQREDMALLQRRLALFREQAAGGNFTLHVHTDEVLQLAADGVWTAHWPRLAERGAGGALRAGVGRAATLVLTVRSPATVPMLLHALLPPHTAPPHPPLPPLVGGEGDHCTSDECTWSSKAWRLSGWRVTRGAVRLWNESDSELHRRGVRDGGLLLAPRTELELSLWFTPEDAGAFTAYLYLRNNLTVLEGVRLFGEGEYPSFELAGRRPGAAAPFSFEVKECSWGGGGAVVRRVIARNAGRVTAALGGWRVARTGCVARGFRVSPCAPLALAPNASAPVHLAFSPDYSLARVSAALQLDTDLGPVEFTLLATVPAKVLAKCAAKTPRPPWDGMMRGVCLLVSIAAFALVLGAGALDAERLLRRARASRAPAPPPRSAPLDLRRLAAEPAPAPAVPRAPPRRRRSTRRPLPSLDPRAERRAFERWRAGVLRGDDDSSRSSEDVDLDEVSPASPSERSSADRPAEPGPEPSSKRAEGEPPKEEESPAADEEPNSTGSDASTPVEDRDDEPYGGDVEPDPRDDSPTDGPEETPVATPIVRPIPRSPPVETATRDARPRREASESSRRIERSRTADGGDRRPAPARHTRKERASKRRGERRQPSPPPQSPSGGPGAPGAPAEAGEARGLRWGASWSSVVASRGAPLAPIGSDVRRREPERPADNSLFYFNGTSESPPRPEPEFSWRPPVSADRPAFVPAARDFIEEAPSLVWGGAWGAWGSGGLRPPPGFSPRSPPRSARTIPSVPWRPSGRRLLTTGAPLTLGGRTPTGRSPTSEPARRR-
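
The transcript and protein records above can be cross-checked against each other. The sketch below is not part of the database record: it assumes the standard genetic code (NCBI translation table 1), and the names `CODON_TABLE` and `translate` are hypothetical helpers introduced for illustration. It translates the first and last codons of DPOGS205861-TA and confirms that a 4215 bp transcript corresponds to 1404 residues plus one stop codon. The trailing `-` in the protein record appears to mark that stop, which the sketch prints as `*`.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
# Assumption: the record uses the standard nuclear code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(nucleotides: str) -> str:
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    seq = nucleotides.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First 30 nt and last 24 nt of DPOGS205861-TA, copied from the record above.
first_codons = "ATGCAGACGCTGCCGCCGCAAGCCAACACG"
last_codons = "AGTGAACCCGCGCGGCGGCGCTAG"

print(translate(first_codons))  # MQTLPPQANT
print(translate(last_codons))   # SEPARRR*

# Length check: 4215 bp = 1405 codons = 1404 residues + 1 stop codon.
transcript_bp, protein_aa = 4215, 1404
print(transcript_bp % 3 == 0 and transcript_bp // 3 == protein_aa + 1)  # True
```

To check the full record, concatenate the sequence lines under `>DPOGS205861-TA` into one string, pass it to `translate`, and compare the result with the `>DPOGS205861-PA` entry after replacing the final `-` with `*`.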