Monarch geneset OGS2.0

DPOGS206424
Transcript        DPOGS206424-TA  2916 bp
Protein           DPOGS206424-PA  971 aa
Genomic position  DPSCF300181 + 140086-144937
RNAseq coverage   11x (Rank: top 84%)
Annotation
Heliconius      HMEL018081        2e-44   58.14%
Bombyx          BGIBMGA013822-TA  2e-121  58.70%
Drosophila      (no hit listed)
EBI UniRef50    UniRef50_E2C0S4   7e-77   44.79%  Hyaluronoglucosaminidase n=9 Tax=Apocrita RepID=E2C0S4_HARSA
NCBI RefSeq     XP_972926.1       4e-77   45.58%  PREDICTED: similar to hyaluronidase [Tribolium castaneum]
NCBI nr blastp  gi|160948555      7e-78   40.53%  hyaluronidase [Anoplius samariensis]
NCBI nr blastx  gi|91084537       7e-81   45.58%  PREDICTED: similar to hyaluronidase [Tribolium castaneum]
Group
Gene Ontology    GO:0008152  2.4e-104  metabolic process
                 GO:0003824  2.4e-104  catalytic activity
                 GO:0005975  1.4e-94   carbohydrate metabolic process
                 GO:0004415  1.4e-94   hyalurononglucosaminidase activity
                 GO:0006952  6.2e-09   defense response
KEGG pathway     tca:661685  1e-76
                 K01197 (hya) maps-> Glycosaminoglycan degradation
InterPro domain  [37-352]  IPR017853  2e-104    Glycoside hydrolase, superfamily
                 [35-351]  IPR013785  2.4e-104  Aldolase-type TIM barrel
                 [35-355]  IPR018155  1.4e-94   Hyaluronidase
                 [78-91]   IPR001329  6.2e-09   Glycoside hydrolase, family 56, allergen Api/Dol m 2
Orthology group  MCL10654  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS206424-TA
ATGTCTTGGATCTCGATTCTGGCTTTCTTGATTTGGAAATGTGCAACTTTAAGCGACAGTTACTTCATCATTGAAGCTCCAGAATTAAATGTAAAGGAATTTAAAAAAGAGTTCCGGGTGTACTGGAACGTTCCGACATTTCAGTGTGCTTCAAAGAAAATACCCTTTGATAAATTGTATGAAAAATTCGGTATCATTCAAAATAACGGAGACAGATTCAACGGTGAAAAAATCACTATTCTGTACGAACTGGGTGACTTTCCGTCCATTTTTAAGAATAAAACCAGCGGAGAATATGAATTTATAAACAACGGTGTACCTCAGGAGGGAAATCTGCAAGAACATTTGGTTGCTTTCAAAGAACAATTACTCGAGAAGATACCAGACCCTTTCTTTGATGGCGTTGGAGTAATTGATTTTGAGATGTGGAGACCAGTTTTCGATCAAAACTTCGGCAATCTCGATGTCTATAGAAAAAGATCCATTGAAATTGAGAAGGAAAGGCATTCGTGGTGGCTAAATATGTGGATTAAAAAAGAGGCGGCAATTCGTTTCGAAACCGCTGCAAGGAAGTTCATGGAATCTACTCTGGTTCTTGCAAAACAAATGCGACCTAACGCCTTGTGGGGCTATTATGGATACCCACACTGTTTCAATATGAGAGACATGGATATGAAAGAAGATTGTGTGAAAAGTATCCCAGGTGCAAATGACAATATTTATTGGCTTTGGGCTGAGAGCACTGCTCTATTTCCTTCTATATACAGCTCAAAAAAACTTACCAATTCACAACTTCCCTTTCATATTAAAGGGAGAATAAAAGAGAGTGCACGCGTTAGGCTCGAAGACACGCCAATTTTGCCGTACTTCTGGTTTAGATATAGAGACGCTGAGTTTTTTAGTCAGGAAGACCTTTCAATAGCTCTCAACACACTGTACCAGTCGAAAGCATCTGGTGTAATAATATGGGGCAGCTCAAACGATGTGAATACTGTTGACAAATGTAAGAAACTTTATAACTACGTGGAGACTATACTCGGACCTAATATAGCAAAATATACGCAACGATCTAAAAAAAGTAGCAACGCAGTCAATAATGAAAATAATCTTTTCAAGTTTAAGTCGCAAGGAAATAGAAAAATTTTAGAAAGTGAAGAAATTCTTTTAAAGGAAAGTTCGACAAATTACGACTTTACCTTTACCGATGAAACAACACAAAGCGCTTATTATGAAGAAATAGATTTTACAGAATATTTAAAAGTCAACTATGATTTTACCTTTGAGCAAAGTCTGACAACTTTACGTGATAAAGAAGATTTTACACCAGAAACATCTACAGAAATTGCTACAAATATGGTATCCCAAAATTTTATTACAACGGAAGAAGATTCATATAATGATGACAATGTTCCTAATGCAATTGAAATTACCGATGATTACCTTATAATTATTGAGAATTCAACAACCAATGACACTTTTAAACAGGATACAACTAATACAAATATCGATGATGACTATTTGATAATTGTTAGAGACAATTATGAAGACAGATTAACGAAATATAATAATTCACTAGAACAGCCACTAACAACAAAATCACCAAGTAATATGATTGTAAAAAGTTCACTAGAAAATGTAACAGAGGCTTACCAAGATGCAGAATACGATTATTACGATGAAAATACAATTGAAATGCCGGCAACGAAAGATAAGCTATCTGATCAAGATGAAAGCTCCGATTATATAATAGTGGTTGATAATAATTACACCGAAAAAGATACTTTTGAAGCCATCGAGTCTAAAAGGGTAACGGATACAGAAAATGATTACTTAATAATTGTCAATGACTTCAATAACTTACGCACATCAAACAAAAGTAATACAGATTATGTTGCAATTGATAACACAAAAAGTAATTCAACAATCCAGTTTATACCATATACAGACGAACGTAGTACTACTGAAGAAAGTTTAGACGATTTTGAAGATGATGACGAATT
TATACCAAATAAATCATATCGTAACGAAGTGTTTAATTCCGGTCAAGATGAATCCAACTACTTGATAGTTATTGATTATAATTTCACAGATGTAAACACTTTCACACCCTCTTATTACACGGATAGTACAGACATTAAAGATGATTATCTGATTATACCTAAACATGGAAATAAAAGGCATAATTTTACTATAAACAGCGATAACAGTAACGCTAATTATGTGCCGAATAAAATGAAGATTTCAACAACACAGTTCCTTGAATCCAATGATGTTACTCTATCAACTGAAAACGAAAACTTTAGCAGTGAGATATTTTCTCCTAGAATAGAACTTAATACTCCAGAAATAACCACCCCAAGCGATTTTTTGGATGAATCAACGCTTACGGAAGCTACGTCTGAAGGTTTAATTACATTAGAAATAATGAATAATTTAACTGCTAATGTTTTATTAAATGAAATGCTTGTTACAGACCGAAGCGTAGACAGTATAAGCAATTTACATTTTATAAAAACTACAAATGATGCTGATGAATTCTCTGCATCACACATTAGTATGGAAGCAATGGAAAACGAACAATATATAGATAAAGCTTCCAACAAGAATGACGCTGTAGAAGAAACGAAATCAACATATATCAGTGAATATAAAGAATTTAAAAATTCACAATCAACTGAAAGTGGTTATAGCAGCAAGCTGATCACGTCCAGTGAAAAATATTCTATACCAACTGAAGGAACATTCTCACAAAATGATTTTACAGATCTTGACCTCGGCACTGACTTAACAAGATATTCTGAAATTAATGAAGAAATCACTCAACGGAAGGAAGAGATCAACACAGAGACGTACAATATTCACACCAAAGACACTGAAAACGTCGATCATATTATATCATATATTTATAAAAAATAA

Protein sequence:

>DPOGS206424-PA
MSWISILAFLIWKCATLSDSYFIIEAPELNVKEFKKEFRVYWNVPTFQCASKKIPFDKLYEKFGIIQNNGDRFNGEKITILYELGDFPSIFKNKTSGEYEFINNGVPQEGNLQEHLVAFKEQLLEKIPDPFFDGVGVIDFEMWRPVFDQNFGNLDVYRKRSIEIEKERHSWWLNMWIKKEAAIRFETAARKFMESTLVLAKQMRPNALWGYYGYPHCFNMRDMDMKEDCVKSIPGANDNIYWLWAESTALFPSIYSSKKLTNSQLPFHIKGRIKESARVRLEDTPILPYFWFRYRDAEFFSQEDLSIALNTLYQSKASGVIIWGSSNDVNTVDKCKKLYNYVETILGPNIAKYTQRSKKSSNAVNNENNLFKFKSQGNRKILESEEILLKESSTNYDFTFTDETTQSAYYEEIDFTEYLKVNYDFTFEQSLTTLRDKEDFTPETSTEIATNMVSQNFITTEEDSYNDDNVPNAIEITDDYLIIIENSTTNDTFKQDTTNTNIDDDYLIIVRDNYEDRLTKYNNSLEQPLTTKSPSNMIVKSSLENVTEAYQDAEYDYYDENTIEMPATKDKLSDQDESSDYIIVVDNNYTEKDTFEAIESKRVTDTENDYLIIVNDFNNLRTSNKSNTDYVAIDNTKSNSTIQFIPYTDERSTTEESLDDFEDDDEFIPNKSYRNEVFNSGQDESNYLIVIDYNFTDVNTFTPSYYTDSTDIKDDYLIIPKHGNKRHNFTINSDNSNANYVPNKMKISTTQFLESNDVTLSTENENFSSEIFSPRIELNTPEITTPSDFLDESTLTEATSEGLITLEIMNNLTANVLLNEMLVTDRSVDSISNLHFIKTTNDADEFSASHISMEAMENEQYIDKASNKNDAVEETKSTYISEYKEFKNSQSTESGYSSKLITSSEKYSIPTEGTFSQNDFTDLDLGTDLTRYSEINEEITQRKEEINTETYNIHTKDTENVDHIISYIYKK-
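
Length consistency check:

The lengths in this record can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the transcript is coding sequence only (start codon through one stop codon) and that the genomic coordinates are 1-based and inclusive. Neither convention is stated in the record. The function names are hypothetical helpers, not part of any database tooling.

```python
# Values copied from the record.
TRANSCRIPT_BP = 2916
PROTEIN_AA = 971
GENOMIC_START, GENOMIC_END = 140086, 144937


def expected_cds_length(protein_aa: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode protein_aa residues, optionally plus one stop codon."""
    return 3 * (protein_aa + (1 if include_stop else 0))


def genomic_span(start: int, end: int) -> int:
    """Length of a closed interval (assumed 1-based, inclusive coordinates)."""
    return end - start + 1


cds_bp = expected_cds_length(PROTEIN_AA)
span_bp = genomic_span(GENOMIC_START, GENOMIC_END)

# 3 * (971 + 1) = 2916, which matches the listed transcript length.
print(cds_bp == TRANSCRIPT_BP)
# Genomic span, and the part of it not covered by the transcript.
print(span_bp, span_bp - TRANSCRIPT_BP)
```

Under these assumptions the transcript length equals three times the protein length plus one stop codon. The genomic interval is longer than the transcript, and the difference would most plausibly be intronic sequence, although the record does not list the exon structure.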