Monarch geneset OGS2.0

DPOGS203972
Transcript: DPOGS203972-TA (1941 bp)
Protein: DPOGS203972-PA (646 aa)
Genomic position: DPSCF300005 + 735108-742147
RNAseq coverage: 193x (Rank: top 48%)
Annotation
Heliconius: HMEL010375; 0.0; 58.18%
Bombyx: BGIBMGA000733-TA; 0.0; 59.66%
Drosophila: Pgant35A-PA; 2e-150; 49.44%
EBI UniRef50: UniRef50_Q16ZW8; 0.0; 50.23%; N-acetylgalactosaminyltransferase n=5 Tax=Endopterygota RepID=Q16ZW8_AEDAE
NCBI RefSeq: XP_001814012.1; 0.0; 52.11%; PREDICTED: similar to N-acetylgalactosaminyltransferase [Tribolium castaneum]
NCBI nr blastp: gi|189237799; 0.0; 52.11%; PREDICTED: similar to N-acetylgalactosaminyltransferase [Tribolium castaneum]
NCBI nr blastx: gi|189237799; 0.0; 51.96%; PREDICTED: similar to N-acetylgalactosaminyltransferase [Tribolium castaneum]
Group
KEGG pathway: tca:661929; 0.0
 K00710 (GALNT) maps-> O-Glycan biosynthesis
InterPro domain:
[479-645] IPR008997; 4.6e-28; Ricin B-related lectin
[182-367] IPR001173; 1.9e-20; Glycosyl transferase, family 2
[523-644] IPR000772; 3e-17; Ricin B lectin
Orthology group: MCL11087; Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203972-TA
ATGGGTCGGCTTAATTATTCTTTTTGGTGGGGTATATTGTTTGCCTCTTTGACATGGAGTATTTCATTATATTTGTATTGGTTGTTAAATATGAGTGGTGACAATAATATAAATAGTACAGAAAACAGAGTTCAGTACCATCTAATTAATAGAGAAGAAAAGGATCGTGTAAAAAACAATGACAAATCTATTGCTGCACTGGAGGGCAAAAATCTTTTATTTGACGGTCCTAGCCCGGGTAATGATAAATATATTGGCAAAATGTGGAAGTATTCAAGAGATAGGCCTGATTATATTAGAAAGATGAGGCTTAGGGAGAAATTTACAAAAGAACTTGATAAAGTTGCACAACCAGATAAAAGTTTGGATTTCGAATTCGGCCTCATACACAATGCTGACGATGTAAGAATTAGGGAAAAAGGATATAATATGCATGCTTTCAACACATTGATCTCTCAAAGAATTGGTAATCACCGAGGATTGCCTGACACAAGAAATAAGTTATGTCGGTCACAAAAGTACCCCGATAAGTTACCTAAGGCATCCATTATAATTTGTTTCTACAATGAACATTTCGAAACTCTCATGAGATCAGTTCACTCCATACTAGATCGTACTGATCTGAAATATCTGAAAGAGATAATTCTGGTTGATGATTATAGTGACATAACTGATTTACATGAAGAGGTACAAAAAGCTGTTAATGAGCTAAACGGAAAAATGTTGATAACATTGACATCTACCAGGGAAGGGCTCATTAGGGCTAGATTGTATGGTGCGGATAATAGTGTTGGAGATGTGCTTGTGTTCCTGGATTCTCATATAGAAGTTAATGTTGATTGGTTACCACCTCTTCTCACAAGATTATCGGAAGGTGTTGATGGCGTCAATGTGAGATTTTCTCCTCGAGCTGTCACTCCTATCATAGATGTTATCAATGCTGATACTTTTGAGTATACCTCAAGCCCTTTGGTTAGGGGCGGATTTAACTGGGGATTACACTTCAAATGGGATAATCTGCCTAAAGGGACTCTGAAAGATGATGAAGACTTCATTAAACCCATACGATCTCCAACTATGGCTGGCGGGCTGTTTGCTATTTACAGAGAATATTTTAATAAAATTGGCAAATATGATTCGGGCATGAACCTGTGGGGAGGTGAAAACTTAGAAATATCTTTCAGGATTTGGATGTGCGGTGGAGTGTTGGAGCTATGTCCCTGCAGTCGAGTGGGCCATGTATTTCGTAAGAGACGACCTTACGGCGCCGGCGAGGATTATATGCTGAGGAACTCTATGAGAATGGCTCGAGTATGGATGGATGAATATGTTAACAAAGTCATAGAGCAGAATCCGTCAGCGGCCCACGTATCCATCGGTGATATATCGGAGAGGGTTGAGTTGAGGAAGAGTTTAAAATGCAAATCATTTAAATGGTACTTGGAGAATGTTTATCCTGAATTGGAGACGGGCGAAGATACGGCAGCGAGGAAGAGAATAGCGGCTCTGAACGACCCTGAGAAGAACAAGTTTCAGCCATGGCATTCCAGGAAAAGAAATTACACCGATTCCTATCAGATACGTTTGAGGAATACTTCATTGTGTATACAAAGCGCTAAAGACATCAAAAGCAAAGGCAGTCCGCTGTTACTAGCTGGTTGTACGAGAACCATAAATCAGATGTGGTTTGAGACTGATCGTGGCGAGCTTGTCCTTGGTCGTACTTTATGCCTAGACGCTAACACCTCTCCCATAATAGCCAAGTGTCATGAACTGGGCGGCACACAGGAGTGGAAGCATAAGGGAACTGCTAATAGTCCCATCTACAATATTGCTATGGGTATGTGTCTGGGAGTTGAACGCGCGTACCGCAGCGAACCGATCATGATGGTCATATGCGACAACCAACCAACAAATCAATGGGATTTTGTGAGAACTTAG

Protein sequence:

>DPOGS203972-PA
MGRLNYSFWWGILFASLTWSISLYLYWLLNMSGDNNINSTENRVQYHLINREEKDRVKNNDKSIAALEGKNLLFDGPSPGNDKYIGKMWKYSRDRPDYIRKMRLREKFTKELDKVAQPDKSLDFEFGLIHNADDVRIREKGYNMHAFNTLISQRIGNHRGLPDTRNKLCRSQKYPDKLPKASIIICFYNEHFETLMRSVHSILDRTDLKYLKEIILVDDYSDITDLHEEVQKAVNELNGKMLITLTSTREGLIRARLYGADNSVGDVLVFLDSHIEVNVDWLPPLLTRLSEGVDGVNVRFSPRAVTPIIDVINADTFEYTSSPLVRGGFNWGLHFKWDNLPKGTLKDDEDFIKPIRSPTMAGGLFAIYREYFNKIGKYDSGMNLWGGENLEISFRIWMCGGVLELCPCSRVGHVFRKRRPYGAGEDYMLRNSMRMARVWMDEYVNKVIEQNPSAAHVSIGDISERVELRKSLKCKSFKWYLENVYPELETGEDTAARKRIAALNDPEKNKFQPWHSRKRNYTDSYQIRLRNTSLCIQSAKDIKSKGSPLLLAGCTRTINQMWFETDRGELVLGRTLCLDANTSPIIAKCHELGGTQEWKHKGTANSPIYNIAMGMCLGVERAYRSEPIMMVICDNQPTNQWDFVRT-
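
Length check:

The lengths stated in this record agree with each other if the transcript is taken to be coding sequence only, with no UTRs. That is an assumption, suggested by the nucleotide sequence starting with ATG and ending with the stop codon TAG, and by the trailing "-" on the protein sequence. A minimal sketch of that arithmetic follows; the function names are hypothetical and not part of any database tooling.

```python
# Sketch only: checks that a CDS length and a protein length are consistent.
# Assumes the transcript is pure coding sequence (no UTRs) and that the
# final codon is a stop codon, which is not counted as a residue.

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def cds_matches_protein(transcript_bp: int, protein_aa: int,
                        has_stop: bool = True) -> bool:
    """Return True if transcript_bp bases encode exactly protein_aa residues."""
    if transcript_bp % 3 != 0:
        return False  # a complete CDS is a whole number of codons
    codons = transcript_bp // 3
    expected_codons = protein_aa + (1 if has_stop else 0)
    return codons == expected_codons


def has_start_and_stop(cds: str) -> bool:
    """Return True if the sequence begins with ATG and ends with a stop codon."""
    cds = cds.upper()
    return len(cds) >= 6 and cds.startswith("ATG") and cds[-3:] in STOP_CODONS


# Values taken from the record: 1941 bp transcript, 646 aa protein.
print(cds_matches_protein(1941, 646))

# First and last codons of DPOGS203972-TA joined as a toy sequence.
print(has_start_and_stop("ATGGGTCGGCTTACTTAG"))
```

Here 1941 / 3 gives 647 codons, which is 646 residues plus one stop codon.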