Monarch geneset OGS2.0

DPOGS209320
Transcript        DPOGS209320-TA   1152 bp
Protein           DPOGS209320-PA   383 aa
Genomic position  DPSCF300234 + 3354-5045
RNAseq coverage   1384x (Rank: top 9%)
Annotation
Heliconius       HMEL018102        3e-149  60.21%
Bombyx           BGIBMGA013756-TA  5e-129  58.91%
Drosophila       Cda9-PA           9e-98   46.43%
EBI UniRef50     UniRef50_B1NLD6   1e-136  60.39%  Chitin binding PM protein n=6 Tax=Noctuidae RepID=B1NLD6_HELAM
NCBI RefSeq      NP_001103904.1    2e-106  48.02%  chitin deacetylase 9 [Tribolium castaneum]
NCBI nr blastp   gi|187884602      1e-136  59.94%  chitin deacetylase 1 [Mamestra configurata]
NCBI nr blastx   gi|283826819      6e-140  58.33%  chitin deacetylase 5a [Helicoverpa armigera]
Group
Gene Ontology    GO:0005975  6.1e-32  carbohydrate metabolic process
                 GO:0003824  6.1e-32  catalytic activity
                 GO:0016810  4e-11    hydrolase activity, acting on carbon-nitrogen (but not peptide) bonds
KEGG pathway
InterPro domain  [26-330]   IPR011330  6.1e-32  Glycoside hydrolase/deacetylase, beta/alpha-barrel
                 [239-325]  IPR002509  4e-11    Polysaccharide deacetylase
Orthology group  MCL18051  Insect specific

Nucleotide sequence:

>DPOGS209320-TA
ATGAGGTGCACTTCAGTTTTATTAGGTTTGGTCCTTTTTGTGTCCGTTCATTGTCAGAACGATAGTCTTCCCGCGGCGGAAAAGTGTGATCCGGAGAAATGCAAGTTGCCAAACTGTAGATGCTCGTCCACTGAAATCCCTGGAAACTTGGAAGCCCGCGATACACCACAGTTCGTTATACTCACATTCGATGACGCAGTGACCACAGTGAACATCGAGACCTATCGCAGTATCCTCTACAATAGAGCCAACTCGAACAGGTGTCCCATCGGAGTGACATTCTTCATCAACCATGAGTACACAGATTACAGCATTGTTAACGAGTTATACAACCGGGGCTTTGAAATTGCTCTTCATTCGATCACTCACAAAACTAATCAAACATACTGGAAAGAAGCCACCGTTGAAGAATCCACCAGAGAATTCGTAGATCAGAGAATTCTCGTGTCTCATTTTGCAAATATTCCCCAAAGATCTATCCAAGGGATTCGCAGTCCTTTCCTTCAGTTGTCCGGCAACAGTACCTATCAAATGATAAAAGAGAACGGTTTGACTTACGACTTGAGTTGGCCGACTGTCAGGTTTACTGATCCCGGTCTCTGGCCCTACACTCTCGACTACGCTTCAATCCAGGATTGCGTCATCGCCCCTTGCCCTACAGCGTCTGTTCCCGGTGTTTGGGTCATTCCCATGATCTCCTGGACTGATCTGGAGGGTTTCCCTTGCTCTTTTGTTGATGCCTGTTTTTCCAATCCTAACTTAAGCGACGAAGATGCTTGGTTCCAATACATCGTCAAAGCATTCGAGAAGCACTACCTCGGCAATCGTTCTCCTTTCGGATTCTATGTCCACGAATGGTTCGTTAGGATCAATCCGGGAGTTAAAGGCGCTCTGGTCCGCTTTATGAATATGGTCCAAAATATGAACGACGCATTTTTGGTGAACGCTAATGAGGTAGTCAACTGGGTAAAGAACCCGGTACCACTGAATGAGTTTGTAAAACAGGATTGTCCCCGCTTCGTCCCTGCTGCCTGTCGTCGGACGACCTGCTCCGCTCTGAAAGAGGAGGAAAGTGGCAATACTTATTACATGACAATCTGCAACAGATGTCCCCGAGTCTATCCTTGGCTTAACAATCCTCGTGGCGTCTAG

Protein sequence:

>DPOGS209320-PA
MRCTSVLLGLVLFVSVHCQNDSLPAAEKCDPEKCKLPNCRCSSTEIPGNLEARDTPQFVILTFDDAVTTVNIETYRSILYNRANSNRCPIGVTFFINHEYTDYSIVNELYNRGFEIALHSITHKTNQTYWKEATVEESTREFVDQRILVSHFANIPQRSIQGIRSPFLQLSGNSTYQMIKENGLTYDLSWPTVRFTDPGLWPYTLDYASIQDCVIAPCPTASVPGVWVIPMISWTDLEGFPCSFVDACFSNPNLSDEDAWFQYIVKAFEKHYLGNRSPFGFYVHEWFVRINPGVKGALVRFMNMVQNMNDAFLVNANEVVNWVKNPVPLNEFVKQDCPRFVPAACRRTTCSALKEEESGNTYYMTICNRCPRVYPWLNNPRGV-
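
The lengths and sequences in this record can be cross-checked against each other. The following is a minimal sketch of such a check, not part of the original record. It assumes the standard genetic code, that the transcript shown is coding sequence only (1152 bp = 3 x (383 + 1) codons, counting the stop), and that the genomic coordinates are 1-based and inclusive. The names `translate` and `genomic_span` are hypothetical helpers introduced here for illustration. The stop codon is written as `*` in the sketch, whereas the record writes it as `-`.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Values copied from the record above.
TRANSCRIPT_BP = 1152
PROTEIN_AA = 383
GENOMIC_START, GENOMIC_END = 3354, 5045


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; stop codons become '*'.

    Raises ValueError if the length is not a multiple of three and
    KeyError if a codon contains a character other than A, C, G, T.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def genomic_span(start: int, end: int) -> int:
    """Length of a region, assuming 1-based inclusive coordinates."""
    return end - start + 1


if __name__ == "__main__":
    # First and last codons of DPOGS209320-TA, copied from the record.
    print(translate("ATGAGGTGCACTTCAGTT"))
    print(translate("CCTCGTGGCGTCTAG"))
    # Transcript length versus protein length plus one stop codon.
    print(TRANSCRIPT_BP == 3 * (PROTEIN_AA + 1))
    # Genomic span versus transcript length.
    print(genomic_span(GENOMIC_START, GENOMIC_END) - TRANSCRIPT_BP)
```

Under these assumptions the first six codons translate to MRCTSV and the last five to PRGV followed by a stop, matching the two ends of DPOGS209320-PA. The genomic span works out to 1692 bp, which is 540 bp longer than the transcript. That difference would be consistent with intronic sequence, although the record itself does not list exon boundaries.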