Monarch geneset OGS2.0

DPOGS200411
Transcript: DPOGS200411-TA, 1422 bp
Protein: DPOGS200411-PA, 473 aa
Genomic position: DPSCF300236 - 448762-452379
RNAseq coverage: 22x (Rank: top 78%)
Annotation
  Heliconius      HMEL011600        0.0  81.40%
  Bombyx          BGIBMGA008988-TA  0.0  76.03%
  Drosophila      ChLD3-PA          0.0  62.16%
  EBI UniRef50    UniRef50_B4MYW4   0.0  62.58%  GK18229 n=15 Tax=Pancrustacea RepID=B4MYW4_DROWI
  NCBI RefSeq     XP_001356540.1    0.0  63.00%  GA14716 [Drosophila pseudoobscura pseudoobscura]
  NCBI nr blastp  gi|125985553      0.0  63.00%  GA14716 [Drosophila pseudoobscura pseudoobscura]
  NCBI nr blastx  gi|157110972      0.0  62.58%  hypothetical protein AaeL_AAEL005685 [Aedes aegypti]
Group
Gene Ontology
  GO:0005975  4.3e-30  carbohydrate metabolic process
  GO:0003824  4.3e-30  catalytic activity
  GO:0016810  1.6e-13  hydrolase activity, acting on carbon-nitrogen (but not peptide) bonds
  GO:0008061  6.2e-11  chitin binding
  GO:0006030  6.2e-11  chitin metabolic process
  GO:0005576  6.2e-11  extracellular region
  GO:0005515  1.3e-07  protein binding
KEGG pathway 
InterPro domain
  [97-412]   IPR011330  4.3e-30  Glycoside hydrolase/deacetylase, beta/alpha-barrel
  [335-408]  IPR002509  1.6e-13  Polysaccharide deacetylase
  [5-57]     IPR002557  6.2e-11  Chitin binding domain
  [61-116]   IPR002172  1.3e-07  Low-density lipoprotein (LDL) receptor class A repeat
Orthology group: MCL15428  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200411-TA
ATGTGGACTAGCAACGAATGTTCTAGATACTTACTCTGCTTGGAGGGGGAGGTATTTGAATTCAAGTGCTCTAAAGGTCTATTGTTTGATGTCAATCGGCAGTTATGTGATATGCCGCAAAATGTTCACAACTGTGACGTAACAACAGAGACGCTTATACCAAAACCGCAGTTAGAAAATGCGAAATGCGCGAACGAAACCCATCTGGGATGCGCTAACGACATGTGCATGCCCGCAGAATATTTCTGCGACGGTGCCTTCGACTGCGAAGATAATTCTGATGAGGGTTGGTGTGACGTAACCTACGACCCCAACGCTGCTCTCCCGTGCGATCCCGGATTATGTCTTTTACCGGAATGTTTTTGCACAAAACACGGCAACGAAACGCCGAACCACATAGTTCCGAGTCAGACTCCCCAAATGATAACATTGACTTTCAACGGTGCGGTAAACCATGAAAACTGGGATATATACACTAGACAGCTGTTCACTTTGGATAGAACTAATCCCAACGGATGTCCTATAAAGGCAACGTTCTTCGTATCACATCCGTACACCAATTATAGGCACGTGCAGAAACTGTGGAACGACGGTCACGAAATCGCTGTTCATTCAATCACCCATCGTGGCCCAGAGGAGTGGTGGTCCAAAAACGCTACAGTCGAAGAATGGTTTGATGAAATGGTTGGACAAGCAAATATTATAAACAGATTTAGCAAAGTTTGGATGGAAGACTTCAGGGGTCTAAGGGTTCCGTATCTGTCTGTGGGTTGGAATAGGCAGTTTCTAATGATGCAAGAATTCGGGTTTGTTTACGACGCTACAGTTGTAGCACCAGCGGTAGACCCACCTTACTGGCCGTATACTCTGGACTACAAAATGCCTCACTCTTGTACTGGAAATAATCAGTACTGTCCAACAAGAAGCTATGCAGGCCTTTGGGAGATGGTCATTAACCCGCTAATTTACGGAAAACATGTTTGTGCCACATTAGAATACTGTCCAACCAACCTCAACGGGGACGACATATATCAGATCCTGATGAATAACTTCAAAAGACATTATTTAAAAAATAGAGCTCCGTTTGGAATACATCTGAACGCGACTTGGCTTAAAAATAATGAATATCTGGCAGCTTTCAGGAAATTCACAGATGAGTTGCTAAAACTTAATGACGTTTACTTTGTGACATATCGCGAAGTCATTGATTGGATAAGGAGACCAACGCCAGTGTTGCAACTAAAGAAATTTCAACCATGGCAGTGTAATAATAAACAATTTCAGGAATCTGATATTGCTTGCGGCAAACCCAAGACTTGCAAACTACCCTCGAAAGTTCTAGAACATGATAAATATATGATAACTTGCATGGATTGTCCAAAGAGTTATCCATGGATAAGAAATGAGTTTGGCTTAGAATAG

Protein sequence:

>DPOGS200411-PA
MWTSNECSRYLLCLEGEVFEFKCSKGLLFDVNRQLCDMPQNVHNCDVTTETLIPKPQLENAKCANETHLGCANDMCMPAEYFCDGAFDCEDNSDEGWCDVTYDPNAALPCDPGLCLLPECFCTKHGNETPNHIVPSQTPQMITLTFNGAVNHENWDIYTRQLFTLDRTNPNGCPIKATFFVSHPYTNYRHVQKLWNDGHEIAVHSITHRGPEEWWSKNATVEEWFDEMVGQANIINRFSKVWMEDFRGLRVPYLSVGWNRQFLMMQEFGFVYDATVVAPAVDPPYWPYTLDYKMPHSCTGNNQYCPTRSYAGLWEMVINPLIYGKHVCATLEYCPTNLNGDDIYQILMNNFKRHYLKNRAPFGIHLNATWLKNNEYLAAFRKFTDELLKLNDVYFVTYREVIDWIRRPTPVLQLKKFQPWQCNNKQFQESDIACGKPKTCKLPSKVLEHDKYMITCMDCPKSYPWIRNEFGLE-
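
Consistency check (sketch):

The transcript and protein records above can be cross-checked against each other. The following is a minimal Python sketch, not part of the original record. It assumes the standard genetic code (the record does not state which table was used) and that the trailing "-" in the protein sequence marks the stop codon. The names `CODON_TABLE` and `translate` are hypothetical helpers introduced here for illustration.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
# Assumption: the record does not say which translation table applies.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop_symbol`; "-" is assumed here
    because the protein record above ends in "-".
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    residues = []
    for pos in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3]]
        residues.append(stop_symbol if amino_acid == "*" else amino_acid)
    return "".join(residues)


# Lengths stated in the record: 473 residues plus one stop codon
# should account for the full 1422 bp transcript.
TRANSCRIPT_LENGTH_BP = 1422
PROTEIN_LENGTH_AA = 473
lengths_agree = 3 * (PROTEIN_LENGTH_AA + 1) == TRANSCRIPT_LENGTH_BP
```

Under these assumptions, `translate("ATGTGGACTAGCAACGAATGT")` reproduces the first seven residues of DPOGS200411-PA (MWTSNEC), and translating the final twelve bases of the transcript gives "GLE-", matching the end of the protein record. Passing the full DPOGS200411-TA sequence would allow a complete comparison.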