Monarch geneset OGS2.0

DPOGS214167
Transcript: DPOGS214167-TA, 1617 bp
Protein: DPOGS214167-PA, 538 aa
Genomic position: DPSCF300014 - 259321-265648
RNAseq coverage: 10805x (Rank: top 1%)
Annotation:
  Heliconius      HMEL006811        0.0  85.66%
  Bombyx          BGIBMGA006213-TA  0.0  86.69%
  Drosophila      serp-PB           0.0  79.70%
  EBI UniRef50    UniRef50_Q86P23   0.0  79.70%  RE22242p n=16 Tax=Arthropoda RepID=Q86P23_DROME
  NCBI RefSeq     XP_320597.3       0.0  84.66%  AGAP011936-PA [Anopheles gambiae str. PEST]
  NCBI nr blastp  gi|283826817      0.0  86.74%  chitin deacetylase 1 [Helicoverpa armigera]
  NCBI nr blastx  gi|283826817      0.0  86.74%  chitin deacetylase 1 [Helicoverpa armigera]
Group:
Gene Ontology:
  GO:0005975  1.4e-36  carbohydrate metabolic process
  GO:0003824  1.4e-36  catalytic activity
  GO:0016810  3e-15    hydrolase activity, acting on carbon-nitrogen (but not peptide) bonds
  GO:0005515  7.4e-10  protein binding
  GO:0008061  2.5e-08  chitin binding
  GO:0006030  2.5e-08  chitin metabolic process
  GO:0005576  2.5e-08  extracellular region
KEGG pathway:
InterPro domain:
  [167-476]  IPR011330  1.4e-36  Glycoside hydrolase/deacetylase, beta/alpha-barrel
  [397-471]  IPR002509  3e-15    Polysaccharide deacetylase
  [112-155]  IPR002172  7.4e-10  Low-density lipoprotein (LDL) receptor class A repeat
  [60-101]   IPR002557  2.5e-08  Chitin binding domain
Orthology group: MCL10267  Insect specific
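
The InterPro coordinates listed above can be turned into domain lengths and subsequences with a few lines of Python. This is a minimal sketch, not part of the record: it assumes the bracketed ranges are 1-based and inclusive residue positions on DPOGS214167-PA, and the names `Domain`, `domain_length`, `extract` and `contains` are hypothetical helpers introduced only for illustration.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    start: int        # first residue (assumed 1-based)
    end: int          # last residue (assumed inclusive)
    interpro_id: str
    name: str


PROTEIN_LENGTH = 538  # aa, as stated for DPOGS214167-PA

# Domain hits copied from the InterPro rows of the record.
DOMAINS = [
    Domain(167, 476, "IPR011330", "Glycoside hydrolase/deacetylase, beta/alpha-barrel"),
    Domain(397, 471, "IPR002509", "Polysaccharide deacetylase"),
    Domain(112, 155, "IPR002172", "Low-density lipoprotein (LDL) receptor class A repeat"),
    Domain(60, 101, "IPR002557", "Chitin binding domain"),
]


def domain_length(domain: Domain) -> int:
    """Number of residues covered by a 1-based inclusive range."""
    return domain.end - domain.start + 1


def extract(protein: str, domain: Domain) -> str:
    """Slice the domain out of a protein string (Python slices are 0-based, end-exclusive)."""
    return protein[domain.start - 1:domain.end]


def contains(outer: Domain, inner: Domain) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return outer.start <= inner.start and inner.end <= outer.end


if __name__ == "__main__":
    for d in sorted(DOMAINS):
        print(f"{d.interpro_id}  {d.start}-{d.end}  {domain_length(d)} aa  {d.name}")
```

Under these assumptions, the polysaccharide deacetylase hit (397-471) falls inside the broader glycoside hydrolase/deacetylase hit (167-476), and every range fits within the 538 aa protein.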
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214167-TA
ATGGCGCGCTACGCCCGTGTCGCTCCTCTGCTCGCGTGTTTCTTTTTCGCCTGCGCGGCCGCTAATAACGGTGCCCACCGTTGGCGTCGTCAGGCAGATGCCAAGAGTGAGGACCAGAATAGTGAGATATGCAAGGACAAGGACGCTAGCGAATGGTTCCGGCTGGAGATCGGCGAGGGCGACGCCTGCCGCAGCGTCATCCAGTGCACCGCCTCGGGCATTGAAGGTATTAAGTGCCCACCCGGTTTATATTTCGATATTGAAAAACAAACCTGTGACTGGAAAGATGCCGTAAGAAACTGTAAAGTAAAGAGCAAAGAACGCAAAGTAAAACCTCTTCTTTATACTGAGGAACCACTTTGCCAAGACGGCCTACTTGCCTGTGGGGATGGCATTTGTATAGAACATGGCCTTTTCTGTAATGGCGAATTAGATTGTAACGATGGATCAGACGAAAACTCTTGCGACATCAACAATGACCCCAACAGTGCTCCTCCTTGCGACACATCTCAGTGTACATCACCTGACTGTTTCTGCTCTGAAGACGGAACCGTAATCCCCGGTGATCTGCCCGTAAAGAACGTACCTCAAATGATAACCATTACTTTTGATGACGCTATTAACAACAACAACATTGATTTGTACAAAGAAATTTTCAATGGCAAACGTAAAAATCCTAACGGTTGCGACATTAAGGCGACATACTTTATTTCACACAAATATACTAACTATTCAGCTGTTCAGGAAACTCACAGAAAGGGTCACGAAATCGCCGTACACTCTATCACCCACAATGATGATGAACGCTTCTGGAGCAATGCTAGCGTTGATGATTGGGGTAAGGAAATGGCTGGTATGAGAGTTATTATAGAAAAGTTTGCAAACATAACCGACAACAGCGTAGTTGGAGTTCGTGCACCTTACCTACGAGTTGGAGGCAACCGTCAATTCACCATGATGGAGGAACAGGCCTTCTTATACGACAGCACCATCACCGCTCCTTTATCCAATCCTCCTCTATGGCCTTACACTTTGTACTACCGCATGCCCCATCGCTGCCACGGTAATTTACAAAATTGTCCCACTAGAAGTCACGCTGTTTGGGAAATGGTAATGAATGAGCTCGACCGTCGTGAAGACCCAAGTAATGACGAATACTTACCAGGATGTGCTATGGTTGATTCTTGCTCGAACATTCTTAGTGGTGATCAATTTTACAACTTCCTTAACCATAACTTCGACCGGCATTACGATCAAAACAGAGCTCCATTAGGTCTTTACTTCCATGCTGCTTGGTTGAAAAATAACCCTGAATTCTTGGAAGCATTTTTATACTGGATTGACGAAATCCTTCAGACTCACGATGATGTATACTTTGTAACAATGACTCAAGTAATCCAATGGATTCAAAACCCACGTTCTGTTTCTGAAGCAAAGAACTTCGACCCATGGCTAGAGAAGTGTTCCGTAGAAGGTATCCCAGCATGCTGGGTGCCTCACTCTTGCAAACTTAACTCTAAGGAACTCCAAGGTGAGACCATTAATCTTCAGACATGTCTCAGATGCCCAGCCAATTACCCATGGCTCAATGATCCAACGGGTGAAGGTCATTATTAA

Protein sequence:

>DPOGS214167-PA
MARYARVAPLLACFFFACAAANNGAHRWRRQADAKSEDQNSEICKDKDASEWFRLEIGEGDACRSVIQCTASGIEGIKCPPGLYFDIEKQTCDWKDAVRNCKVKSKERKVKPLLYTEEPLCQDGLLACGDGICIEHGLFCNGELDCNDGSDENSCDINNDPNSAPPCDTSQCTSPDCFCSEDGTVIPGDLPVKNVPQMITITFDDAINNNNIDLYKEIFNGKRKNPNGCDIKATYFISHKYTNYSAVQETHRKGHEIAVHSITHNDDERFWSNASVDDWGKEMAGMRVIIEKFANITDNSVVGVRAPYLRVGGNRQFTMMEEQAFLYDSTITAPLSNPPLWPYTLYYRMPHRCHGNLQNCPTRSHAVWEMVMNELDRREDPSNDEYLPGCAMVDSCSNILSGDQFYNFLNHNFDRHYDQNRAPLGLYFHAAWLKNNPEFLEAFLYWIDEILQTHDDVYFVTMTQVIQWIQNPRSVSEAKNFDPWLEKCSVEGIPACWVPHSCKLNSKELQGETINLQTCLRCPANYPWLNDPTGEGHY-
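
The transcript and protein lengths are consistent with each other: 1617 bp is 539 codons, which is 538 amino acids plus one stop codon. The sketch below shows one way to check this by translating a coding sequence. It assumes the standard genetic code and follows the record's apparent convention of writing the stop codon as `-`. The function name `translate` and its `stop_symbol` parameter are hypothetical, chosen for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence in frame 1; unknown codons become 'X'."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


if __name__ == "__main__":
    # First ten and last four codons of DPOGS214167-TA, copied from the record.
    print(translate("ATGGCGCGCTACGCCCGTGTCGCTCCTCTG"))  # MARYARVAPL
    print(translate("GGTCATTATTAA"))                    # GHY-
    # Length check: 538 aa plus one stop codon.
    print(1617 == 3 * (538 + 1))                        # True
```

Passing the full DPOGS214167-TA sequence to `translate` should, under the same assumptions, give a 539-character string ending in `-`, which can be compared against DPOGS214167-PA.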