Monarch geneset OGS2.0

DPOGS205119
Transcript: DPOGS205119-TA (2073 bp)
Protein: DPOGS205119-PA (690 aa)
Genomic position: DPSCF300172 + 306339-321609
RNAseq coverage: 388x (Rank: top 31%)
Annotation
Heliconius: HMEL014576 | 0.0 | 69.71%
Bombyx: BGIBMGA014556-TA | 2e-122 | 78.74%
Drosophila: pgant2-PB | 0.0 | 60.89%
EBI UniRef50: UniRef50_Q6WV19 | 0.0 | 60.89% | Polypeptide N-acetylgalactosaminyltransferase 2 n=8 Tax=Drosophila RepID=GALT2_DROME
NCBI RefSeq: XP_969621.2 | 0.0 | 64.25% | PREDICTED: similar to n-acetylgalactosaminyltransferase [Tribolium castaneum]
NCBI nr blastp: gi|189236651 | 0.0 | 64.25% | PREDICTED: similar to n-acetylgalactosaminyltransferase [Tribolium castaneum]
NCBI nr blastx: gi|194761420 | 0.0 | 61.81% | GF15680 [Drosophila ananassae]
Group
KEGG pathway: tca:658118 | 0.0
  K00710 (GALNT) maps-> O-Glycan biosynthesis
InterPro domain: [561-690] | IPR008997 | 3.2e-26 | Ricin B-related lectin
  [574-691] | IPR000772 | 1.4e-17 | Ricin B lectin
  [227-317] | IPR001173 | 2.7e-15 | Glycosyl transferase, family 2
Orthology group: MCL14250 | Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS205119-TA
ATGTCAAATTTGATCTCGCAGTATGCCTCAGGAGCTCCTGAGGAGAACGTAGCGCTGCGACTGATAAAACAGCATGTGGACGAACCGGAGAGCAGTCCAGCAGCACCCGAGGAGTTTACTGGCAGTACTGGCAGATACTTCGACGAGACTGGCTACGTCACTGGCGGGGTCAGAGATAACGACCCCTACGTCAGGAACAGATTCAACCAAGCGGCCTCCGACAACCTCCCCAGCAACCGTCTCATACCCGACACCAGGAACTCGATGTGCCGTCTCAAGAAGTACGATGAGGATCTTCCACAGACCAGCGTCATCATAACATTCCACAACGAAGCTCGCTCCACACTGCTAAGAACTATCGTGAGTGTACTAAATAGGAGTCCAGAGCACCTTATAAAAGAAATCATTTTAGAGGAGAACGTAGCGCTGCGACTGATAAAACAGCATGTGGACGAACCGGAGAGCAGTCCGGCAGCACCCGAGGAGTTTACTGGCAGTACTGGCAGATACTTCGACGAGACTGGCTACGTCACTGGCGGGGTCAGAGATAACGACCCCTACGTCAGGAACAGATTCAACCAAGCGGCCTCCGACAACCTCCCCAGCAACCGTCTCATACCCGACACCAGGAACTCGATGTGCCGTCTCAAGAAGTACGATGAGGATCTTCCACAGACCAGCGTCATCATAACATTCCACAACGAAGCTCGCTCCACACTGCTAAGAACTATCGTGAGTGTACTAAATAGGAGTCCAGAGCACCTTATAAAAGAAATCATTTTAGTCGATGACTTCACCGAAGATGGTGCAGCGTTGCGTGCTATACACAAGGTTCATGTTATACGTAACACGAAACGCGAGGGTCTGATGCGGTCGAGGGTCAGAGGTGCAGACGCGGCCACGGCACCAGTGCTGACCTTCCTCGACTCCCACGTGGAGTCCGAAGATGGTGCAGCGTTGCGTGCTATACACAAGGTTCATGTTATACGTAACACGAAACGCGAGGGTCTGATGCGGTCGAGGGTCAGAGGTGCAGACGCGGCCACGGCACCAGTACTGACCTTCCTCGACTCCCACGTGGAGTGTAATGTGCACTGGCTGGAGCCGTTGTTGCAGAGAATCAAAGAGGACCCAACTCGGGTGGTTTGCCCCGTCATCGACGTCATCAGTATGGACACTTTCCAATACATCGGGGCCTCGGCGGACCTGCGCGGGGGCTTCGACTGGAATCTAGTGTTCAAGTGGGAGTATCTATCGCAAGCCGAGCGCGGCGCCCGACTGAGTGATCCCACTCAAGTTATAAGAACCCCGATGATAGCCGGGGGCCTGTTTAGTATGGATCGGAAATACTTCAGCAAGCTCGGCAAATACGACATGAAGATGGACGTCTGGGGCGGAGAGAATTTGGAGATTTCATTCAGGGTTTGGCAATGCGGCGGTTCGCTGGAGATCGTGCCGTGTTCCCGGGTGGGACACGTTTTCAGGAAGCGCCACCCCTACTCGTTCCCCGGCGGGTCAGGGGCGGTGTTCGCCAGGAACACGAGGCGAGCGGCGGAAGTATGGATGGATGACTACAAAGAACTCTACTACAGATCTCAGCCGTTGGCGAAACAAGTGGACTTTGGAGATATATCTGAGCGTGTCTCCATCCGTCAGAGACTTCACTGTAAGCCGTTCCGCTGGTATTTGGAGCACGTGTACCCTGAACTCCGCGTCCCCACCTTCGGGAACTCGATTGCCATTAAGCAGGGACCACGCTGTTTAGACACCATGGGGCACCAGGTCGATGGGACAGTCGCGATGTATCCCTGTCATAACACTGGCGGGAATCAGGAGTGGAGTTTCGACAACGGCCTCATACGTCACCAGTCGCTCTGCCTGGGACTGTCTCAGGAGGACAGCGTGACGGTGGTCCTCGCAGTCTGCGACCCCTCGGACCACAACCAGCTGTGGACCAGGCGGAGGAGTTTCATCAAGCACAACACCCTCGGCCTCTGCAT
CGACTCCGAGCAACCGATACTCCACCTCCAGCAGTGCGACAACGAGAGACTCAGCCAGCAATTCGTTTTCTAG

Protein sequence:

>DPOGS205119-PA
MSNLISQYASGAPEENVALRLIKQHVDEPESSPAAPEEFTGSTGRYFDETGYVTGGVRDNDPYVRNRFNQAASDNLPSNRLIPDTRNSMCRLKKYDEDLPQTSVIITFHNEARSTLLRTIVSVLNRSPEHLIKEIILEENVALRLIKQHVDEPESSPAAPEEFTGSTGRYFDETGYVTGGVRDNDPYVRNRFNQAASDNLPSNRLIPDTRNSMCRLKKYDEDLPQTSVIITFHNEARSTLLRTIVSVLNRSPEHLIKEIILVDDFTEDGAALRAIHKVHVIRNTKREGLMRSRVRGADAATAPVLTFLDSHVESEDGAALRAIHKVHVIRNTKREGLMRSRVRGADAATAPVLTFLDSHVECNVHWLEPLLQRIKEDPTRVVCPVIDVISMDTFQYIGASADLRGGFDWNLVFKWEYLSQAERGARLSDPTQVIRTPMIAGGLFSMDRKYFSKLGKYDMKMDVWGGENLEISFRVWQCGGSLEIVPCSRVGHVFRKRHPYSFPGGSGAVFARNTRRAAEVWMDDYKELYYRSQPLAKQVDFGDISERVSIRQRLHCKPFRWYLEHVYPELRVPTFGNSIAIKQGPRCLDTMGHQVDGTVAMYPCHNTGGNQEWSFDNGLIRHQSLCLGLSQEDSVTVVLAVCDPSDHNQLWTRRRSFIKHNTLGLCIDSEQPILHLQQCDNERLSQQFVF-
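
Consistency check (sketch):

The record lists a 2073 bp transcript and a 690 aa protein, which agree if the transcript is a coding sequence with one stop codon: 3 x (690 + 1) = 2073. The Python sketch below shows one way to read the two FASTA records above and test that relationship. The function names are hypothetical and not part of the geneset, and the single-stop-codon rule and the treatment of a trailing "-" as a stop marker are assumptions based on how this record is displayed.

```python
def parse_fasta(text):
    """Return {header: sequence} for FASTA text; wrapped sequence lines are joined."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def expected_cds_length(protein_length):
    """Nucleotide length of a CDS, assuming exactly one stop codon is included."""
    return 3 * (protein_length + 1)


def strip_stop_marker(protein):
    """Drop a trailing stop marker ('-' or '*'); assumed to denote the stop codon."""
    return protein.rstrip("-*")
```

With the sequences above loaded into `parse_fasta`, comparing `len()` of the transcript against `expected_cds_length(len(strip_stop_marker(protein)))` would flag any mismatch between the displayed sequences and the lengths listed in the record.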