Monarch geneset OGS2.0

DPOGS213240
Transcript         DPOGS213240-TA   1038 bp
Protein            DPOGS213240-PA   345 aa
Genomic position   DPSCF300124 - 377596-383947
RNAseq coverage    434x (Rank: top 28%)
Annotation
Heliconius         HMEL012586         8e-154   89.19%
Bombyx             BGIBMGA004000-TA   3e-57    42.75%
Drosophila         CG5731-PA          4e-133   81.85%
EBI UniRef50       UniRef50_E0VBI5    9e-132   84.59%   Alpha-N-acetylgalactosaminidase, putative n=5 Tax=Bilateria RepID=E0VBI5_PEDHC
NCBI RefSeq        XP_974398.1        2e-142   82.61%   PREDICTED: similar to alpha-galactosidase/alpha-n-acetylgalactosaminidase [Tribolium castaneum]
NCBI nr blastp     gi|307213390       6e-147   80.41%   Alpha-N-acetylgalactosaminidase [Harpegnathos saltator]
NCBI nr blastx     gi|332373270       1e-141   80.73%   unknown [Dendroctonus ponderosae]
Group
Gene Ontology      GO:0008152   6.8e-62   metabolic process
                   GO:0003824   6.8e-62   catalytic activity
                   GO:0043169   7.6e-30   cation binding
                   GO:0005975   7.6e-30   carbohydrate metabolic process
                   GO:0004553   1e-28     hydrolase activity, hydrolyzing O-glycosyl compounds
KEGG pathway       aga:AgaP_AGAP009621   1e-139
                   K01204 (NAGA) maps-> Lysosome
                                        Glycosphingolipid biosynthesis - globo series
InterPro domain    [52-222]    IPR013785   6.8e-62   Aldolase-type TIM barrel
                   [53-221]    IPR017853   2.8e-51   Glycoside hydrolase, superfamily
                   [225-318]   IPR013780   7.6e-30   Glycosyl hydrolase, family 13, all-beta
                   [53-70]     IPR002241   1e-28     Glycoside hydrolase, family 27
Orthology group    MCL13446 Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS213240-TA
ATGGCAGGGTATCAGGAAATCGTTAAGCGTTACTTCAGTGACCGGCCACTGGTTTCGAAAACCGTCAGCCGATTCCAATTTCCCACGTTAGTTGCGACAGTGGTTGTAACGTCTAATGGTCTTGGTTGCGACGTTGATGTAAAGGACACTAGACCAGACGCATCAACATTCGCATCTTGGGGAGTTGATTATGTGAAGCTGGATGGATGTTACGCTCTGCCAGTTGACATGGACCACGGATATCCTGAGTTCGGACGACAGCTCAACCTAACCGGTCGCCAAATGGTCTATTCTTGCAGTTGGCCAGTCTACCAAATATACGCTGGCCTTCAGCCAAACTTTTCGTCAATCATCGAGCACTGCAATTTGTGGCGTAACTTTGACGACATCCAGGATTCTTGGGCTTCCGTAGAATCTATCATAGACTATTACGGAAACCACCAAGACGTGATCGTTCCCAACGCTGGACCAGGACATTGGAATGACCCTGACATGTTAATCATTGGAAACTTCGGTCTATCATACGAGCAAAGCAAGACTCAGTTCGCGATATGGGCCATACTCGCCGCTCCGCTGCTGATGAGTGTTGATTTGAGAACCATCAGACCCGAATACAAAGCGATACTACAAAACAGGAAGATCATAGAAGTGGATCAAGATCCGTTAGGGATACAAGGCAGACGGATTTATAAGCATCGCGGCATCGAGATCTGGTCTCGTCCGATCATTCCCATCCACGGACAATACTACTCGTATGCCGTGGCTTTCCTCAACAGACGAACAGACGGCACGCCGTCAGATGTCGCCGTCACACTCAAGGAGTTGGGACTCAACAACCCCGCCGGGTACAGAGTCGAGGATTTGTATGAAGACGTTGACTACGGCGTGCTGTCTCCGGCGACCAAGATCAAAGTGAAAGTCAACCCTTCAGGTGTTGTCATCCTCCGAGCGGACGCTCAACCTCCATACGCATATAACGCCATACCGACACGGACGCCATATTCACCACTCAATGATGTGTTCAGACTTCGCAAGTGA

Protein sequence:

>DPOGS213240-PA
MAGYQEIVKRYFSDRPLVSKTVSRFQFPTLVATVVVTSNGLGCDVDVKDTRPDASTFASWGVDYVKLDGCYALPVDMDHGYPEFGRQLNLTGRQMVYSCSWPVYQIYAGLQPNFSSIIEHCNLWRNFDDIQDSWASVESIIDYYGNHQDVIVPNAGPGHWNDPDMLIIGNFGLSYEQSKTQFAIWAILAAPLLMSVDLRTIRPEYKAILQNRKIIEVDQDPLGIQGRRIYKHRGIEIWSRPIIPIHGQYYSYAVAFLNRRTDGTPSDVAVTLKELGLNNPAGYRVEDLYEDVDYGVLSPATKIKVKVNPSGVVILRADAQPPYAYNAIPTRTPYSPLNDVFRLRK-
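
The lengths listed in the record (a 1038 bp transcript and a 345 aa protein) can be cross-checked with simple codon arithmetic: if the coding sequence spans the whole transcript and ends in a stop codon, then bp = 3 x (aa + 1). The following is a minimal sketch under that assumption. The function name `cds_matches_protein` is hypothetical and is not part of any database tooling.

```python
def cds_matches_protein(cds_bp: int, protein_aa: int, has_stop: bool = True) -> bool:
    """Return True if a coding sequence of cds_bp nucleotides can encode
    a protein of protein_aa residues.

    Assumption: the whole sequence is coding, starting at the first base.
    If has_stop is True, the final codon is a stop codon and is not
    counted as a residue.
    """
    codons, remainder = divmod(cds_bp, 3)
    if remainder != 0:
        # A complete coding sequence must be a whole number of codons.
        return False
    expected_codons = protein_aa + (1 if has_stop else 0)
    return codons == expected_codons


if __name__ == "__main__":
    # Values taken from the record: transcript 1038 bp, protein 345 aa.
    print(cds_matches_protein(1038, 345))
```

With the values from this record, 1038 / 3 gives 346 codons, which is 345 residues plus one stop codon. This agrees with the nucleotide sequence ending in TGA and the protein sequence ending in the `-` stop marker.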