Monarch geneset OGS2.0

DPOGS214168
Transcript: DPOGS214168-TA | 1710 bp
Protein: DPOGS214168-PA | 569 aa
Genomic position: DPSCF300014 - 246469-250953
RNAseq coverage: 12539x (Rank: top 1%)
Annotation
Heliconius: HMEL006811 | 0.0 | 87.75%
Bombyx: BGIBMGA006214-TA | 0.0 | 88.89%
Drosophila: verm-PE | 0.0 | 75.18%
EBI UniRef50: UniRef50_Q7PKC7 | 0.0 | 81.53% | AGAP011937-PA n=14 Tax=Endopterygota RepID=Q7PKC7_ANOGA
NCBI RefSeq: XP_320596.4 | 0.0 | 81.53% | AGAP011937-PA [Anopheles gambiae str. PEST]
NCBI nr blastp: gi|183979231 | 0.0 | 83.16% | chitin binding protein [Papilio xuthus]
NCBI nr blastx: gi|158300737 | 0.0 | 81.82% | AGAP011937-PA [Anopheles gambiae str. PEST]
Group
Gene Ontology:
GO:0005975 | 3.7e-31 | carbohydrate metabolic process
GO:0003824 | 3.7e-31 | catalytic activity
GO:0005515 | 2.1e-09 | protein binding
GO:0008061 | 6.8e-09 | chitin binding
GO:0006030 | 6.8e-09 | chitin metabolic process
GO:0005576 | 6.8e-09 | extracellular region
GO:0016810 | 4.4e-08 | hydrolase activity, acting on carbon-nitrogen (but not peptide) bonds
KEGG pathway 
InterPro domain:
[182-503] | IPR011330 | 3.7e-31 | Glycoside hydrolase/deacetylase, beta/alpha-barrel
[145-185] | IPR002172 | 2.1e-09 | Low-density lipoprotein (LDL) receptor class A repeat
[35-96] | IPR002557 | 6.8e-09 | Chitin binding domain
[398-499] | IPR002509 | 4.4e-08 | Polysaccharide deacetylase
Orthology group: MCL10267 | Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214168-TA
ATGGCTCGACACGCGTACCTTCTACTTGGGGTACTACTCCTTGCCTACTCTGTTAAGGCACAAGACGATGACGGTGATGATGAACAGAATCCTGAACAGCTTTGTGACGGACGTCCTGGAGATGAATACTTTAGATTGTCTACCGAAGGCGATTGCCGAGAAGTAGTTCGGTGTGACAAAGGAGGTGAAAATGGCGTCACAAGGCTCGCTTCAGTGCGCTGCCCTGGTGGACTAGCATTCGATATCGACCGCCAGACATGTGATTGGAAAACACATGTTAAGAATTGTGATAAACTAGAAAGTCTTAAACAGATCACATGCCCGTCTGGACTGTCTTTCGATCTTGATAAACAAACCTGCGACTGGAAGGGTAAAGTGACCAACTGTGACAAAATTGAGAAACCAAGAAAAATTCTGCCCATTCTGAAGACTGATGAACCAATTTGCTCCGAAGGCAAGCTTGCTTGCGGAAGTGGTGACTGCATCGAGAAAGAATTATTCTGTAACGGAAAACCAGACTGCAAAGATGAATCTGATGAAAATGCTTGCACCGTCGATTTGGACCCTAATAGAGCACCAGACTGCGATACCAGCCAATGCAAACTTCCTGATTGCTTCTGCTCAGCTGATGGTACTCGTATCCCCGGAGGCTTGGAGCCTAGTCAAGTCCCTCAGATGATCACAATCACCTTCAACGGTGCTGTAAACGTTGACAACATTGACTTGTACGACCAGATCTTCAATGGAAACCACCAAAATCCTAATGGTTGTCAGATCCGTGGTACATTCTTTGTCTCCCACAAATATAGTAACTACGCTGCTATTCAGGAATTACACCGCAGGGGACACGAAATCGCAGTTTTCTCAATCACACATAAAGATGATCCTAACTATTGGACCAGTGGAAGCTATGACGATTGGTTAGCCGAAATGGCTGGAGCGCGTCTTATAATTGAACGTTTTGCGAACATTAGCGATGCTTCCATTATTGGAGTAAGAGCCCCATACCTGAGAGTTGGAGGAAATAAACAATTTGAAATGATGACTGACCAATACTTTGTATATGATGCTTCTATAACCGCACCTCTAGGTCGTGTCCCTATCTGGCCTTACACATTATTCTTCCGCATGCCACATAAGTGTAATGGAAACGCCCATAACTGTCCCTCAAGGAGTCACCCAGTCTGGGAAATGGTTATGAATGAACTTGACAGAAGAGATGACCCAACCTTTGATGAATCTCTTCCTGGTTGTCACGTGGTGGACTCTTGTTCAAACATTCAAACTGGAGAACAATTCGCACGTCTTCTTCGTCACAACTTCAACCGTCACTACACGACCAACCGTGCCCCTCTTGGTTTCCATTTCCATGCTTCTTGGCTCAAGTCAAAGAAAGAATTCAGAGATGAACTTATCAAATTTATCCAAGAAATGAATGAAAAGAACGATGTCTACTTCACTTCTCTCATTCAGGTGATACAATGGATGCAGAACCCCACAGAACTGTCCCAACTCAGAGATTTTGCGGAATGGAAACAAGACAAATGTGACGTAAAAGGTCAACCATTCTGCTCTCTACCAAATGCGTGTCCCTTAACGACCCGGGAACTGCCAGGCGAGACACTGCGTCTTTTCACCTGTATGGAATGCCCTAATAACTACCCCTGGATTTTAGATCCCACGGGAGAGGGCTTCAGCGTTAGGAAGTGA

Protein sequence:

>DPOGS214168-PA
MARHAYLLLGVLLLAYSVKAQDDDGDDEQNPEQLCDGRPGDEYFRLSTEGDCREVVRCDKGGENGVTRLASVRCPGGLAFDIDRQTCDWKTHVKNCDKLESLKQITCPSGLSFDLDKQTCDWKGKVTNCDKIEKPRKILPILKTDEPICSEGKLACGSGDCIEKELFCNGKPDCKDESDENACTVDLDPNRAPDCDTSQCKLPDCFCSADGTRIPGGLEPSQVPQMITITFNGAVNVDNIDLYDQIFNGNHQNPNGCQIRGTFFVSHKYSNYAAIQELHRRGHEIAVFSITHKDDPNYWTSGSYDDWLAEMAGARLIIERFANISDASIIGVRAPYLRVGGNKQFEMMTDQYFVYDASITAPLGRVPIWPYTLFFRMPHKCNGNAHNCPSRSHPVWEMVMNELDRRDDPTFDESLPGCHVVDSCSNIQTGEQFARLLRHNFNRHYTTNRAPLGFHFHASWLKSKKEFRDELIKFIQEMNEKNDVYFTSLIQVIQWMQNPTELSQLRDFAEWKQDKCDVKGQPFCSLPNACPLTTRELPGETLRLFTCMECPNNYPWILDPTGEGFSVRK-
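
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, not part of the original record: it assumes the transcript is a complete coding sequence read in frame from its first base and translated with the standard genetic code, and it writes the stop codon as "-" to match the trailing "-" in the protein record. The names CODON_TABLE and translate are illustrative.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
# Assumption: the record uses the standard nuclear code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence, writing stops as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First six codons of DPOGS214168-TA, copied from the record above.
print(translate("ATGGCTCGACACGCGTAC"))  # MARHAY
```

Applied to the full DPOGS214168-TA sequence, the result would be expected to equal the DPOGS214168-PA record including its final "-". The stated lengths agree with that reading: 1710 bp is 570 codons, which is 569 residues plus one stop codon.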