Monarch geneset OGS2.0

DPOGS215901
Transcript: DPOGS215901-TA (1482 bp)
Protein: DPOGS215901-PA (493 aa)
Genomic position: DPSCF300029 + 378298-380652
RNAseq coverage: 181x (Rank: top 49%)
Annotation
Heliconius: HMEL021936, 2e-71, 61.89%
Bombyx: BGIBMGA000429-TA, 1e-90, 58.58%
Drosophila: CG34461-PA, 9e-25, 53.39%
EBI UniRef50: UniRef50_Q9BPR7, 2e-43, 54.55%, Cuticle protein n=18 Tax=Obtectomera RepID=Q9BPR7_BOMMO
NCBI RefSeq: NP_001036862.1, 3e-44, 54.55%, cuticular protein RR-2 motif 99 [Bombyx mori]
NCBI nr blastp: gi|223671300, 6e-43, 54.55%, TPA: putative cuticle protein [Bombyx mori]
NCBI nr blastx: gi|189234381, 1e-62, 47.32%, PREDICTED: similar to Larval cuticle protein A3A (TM-A3A) (TM-LCP A3A) [Tribolium castaneum]
Group:
Gene Ontology: GO:0042302, 2.8e-15, structural constituent of cuticle
KEGG pathway:
InterPro domain: [239-291] IPR000618, 2.8e-15, Insect cuticle protein
Orthology group: MCL18037, Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS215901-TA
ATGTTTACAAAAGTACTGTCATTAAGCGCTGTTTTGGCGGTTGCGGCAGCAGGCCTTATCGCTGAACCTCATTACTCCTCTGCTGCTGCAGTTTCTTCCCAAAGTATCGTTCGCCACGATCAATCTCATGCTGTAGCTGCCGCTCCAGTTGCAATCCACTCTGCTCCAGTTGCCATTCACTCTGCTCCTGTTGCCTACCATGCAGCACCAGTACACTATTCATCTGCCGGAGCTGTATCTTCTCAGTCGATCCAACGTCACGACCAACAGCGTGCTGCTATTGCTGTTGCACCCGTAGCCCACTACGCCGCTGCTCCAGTTGCTGTTCACTCTGCTCCTGTTGCTTACCACGTAGCACCAGTACACTATTCATCTGCCGGAGCTGTATCTTCTCAGTCGATCCAACGTCACGACCAACAGCGTGCTGCTATTGCTGTTGCACCCGTAGCCCACTACGCCACTGCTCCAGTTGCTGTTCACTCTGCTCCTGTTGCCTATCACGCAGCACCAGTACACTATTCATCTGCCGGAGCTGTATCTTCTCAGTCCATCCAACGTCATGACCAACCCCGTGCTGCTATTGCAGTGGCTCCCGTAGCTCACTACTCAGCTGCTCCAGTTGCTCACTATGCAGCTGCCCCAGTAGCTCACTACTCTGCCCCTATCGCCCATGCTGCATATGCTGCCCACGAAGAAATCGACTCTCACCCTCAATACGACTTCTCTTACTCCGTACATGACGGACACACCGGCGACAACAAGTCACAGCACGAGAGCCGCGACGGTGACGCAGTGCATGGCGAGTACTCTCTAGTAGAGGCTGACGGATCTGTACGTACCGTTCAATACAGCGCTGATGATCACTCTGGATTCAACGCCGTCGTCAGCCACTCCGCTCCATCAGCTCACGCTGTTCCAGTGCCAACGCACTCGATCCAACGTCACGACCAACAGCGTGCTGCTATTGCTGTTGCACCCGTAGCCCACTACGCCGCTGCTCCGGTTGCTGTTCACTCTGCTCCTGTTGCCTATCAAGCAGCACCAGTACACTATTCATCTGCTGGAGCTGTATCTTCTCAGTCCATCCAACGTCATGACCAACCCCGTGCTGCTATTGCCGTGGCTCCCGTAGCTCACTACTCAGCTGCTCCAGTCGCTCACTACGCAGCTGCCCCAGTAGCTCACTACTCTGCCCCTATCGCCCATGCTGCATATGCTGCCCACGAAGAAATCGACTCTCACCCTCAATACGACTTCTCTTACTCCGTACATGACGGACACACCGGCGACAACAAGTCACAGCACGAGAGCCGCGACGGTGACGCAGTGCACGGCGAGTACTCCCTGGTAGAGGCTGACGGATCTGTACGTACCGTTCAATACAGCGCTGATGATCACTCTGGTTTCAACGCCGTCGTCAGCCACTCCGCTCCATCAGCTCACGCTGTTCCAGTGCCAACGCACGTACTCGCACATCATTAA

Protein sequence:

>DPOGS215901-PA
MFTKVLSLSAVLAVAAAGLIAEPHYSSAAAVSSQSIVRHDQSHAVAAAPVAIHSAPVAIHSAPVAYHAAPVHYSSAGAVSSQSIQRHDQQRAAIAVAPVAHYAAAPVAVHSAPVAYHVAPVHYSSAGAVSSQSIQRHDQQRAAIAVAPVAHYATAPVAVHSAPVAYHAAPVHYSSAGAVSSQSIQRHDQPRAAIAVAPVAHYSAAPVAHYAAAPVAHYSAPIAHAAYAAHEEIDSHPQYDFSYSVHDGHTGDNKSQHESRDGDAVHGEYSLVEADGSVRTVQYSADDHSGFNAVVSHSAPSAHAVPVPTHSIQRHDQQRAAIAVAPVAHYAAAPVAVHSAPVAYQAAPVHYSSAGAVSSQSIQRHDQPRAAIAVAPVAHYSAAPVAHYAAAPVAHYSAPIAHAAYAAHEEIDSHPQYDFSYSVHDGHTGDNKSQHESRDGDAVHGEYSLVEADGSVRTVQYSADDHSGFNAVVSHSAPSAHAVPVPTHVLAHH-
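
The listed lengths are mutually consistent: 1482 bp is 494 codons, i.e. 493 residues plus one stop codon (the trailing "-" in the protein sequence). The sketch below shows one way to check this kind of consistency for a record like this one. The helper names `parse_fasta` and `cds_matches_protein` are hypothetical, the codon table is deliberately partial (only the five codons used in the demo), and only the first 15 nucleotides of the transcript are included for brevity.

```python
def parse_fasta(text):
    """Return {header: sequence} from FASTA-formatted text."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def cds_matches_protein(cds_len_bp, protein_len_aa, has_stop=True):
    """True if a coding sequence of cds_len_bp encodes protein_len_aa residues."""
    if cds_len_bp % 3 != 0:
        return False
    codons = cds_len_bp // 3
    return codons == protein_len_aa + (1 if has_stop else 0)


# Partial codon table: only the codons needed for the demo below.
PARTIAL_CODONS = {"ATG": "M", "TTT": "F", "ACA": "T", "AAA": "K", "GTA": "V"}


def translate_prefix(nt):
    """Translate a short nucleotide string using the partial table above."""
    return "".join(PARTIAL_CODONS[nt[i:i + 3]] for i in range(0, len(nt) - 2, 3))


# First 15 nt of DPOGS215901-TA, copied from the record.
demo = ">DPOGS215901-TA\nATGTTTACAAAAGTA\n"
seqs = parse_fasta(demo)

print(cds_matches_protein(1482, 493))            # lengths from the record
print(translate_prefix(seqs["DPOGS215901-TA"]))  # start of the protein
```

With the full sequences pasted in place of the truncated demo string, the same length check can be run on `len(sequence)` directly; a complete codon table would be needed to translate beyond the first five codons.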