Monarch geneset OGS2.0

DPOGS200434
Transcript        DPOGS200434-TA   1803 bp
Protein           DPOGS200434-PA   600 aa
Genomic position  DPSCF300236 + 95785-101181
RNAseq coverage   158x (Rank: top 52%)
Annotation
Heliconius       HMEL002497         0.0      74.41%
Bombyx           BGIBMGA008895-TA   0.0      79.59%
Drosophila       pgant3-PA          2e-124   41.25%
EBI UniRef50     UniRef50_F4W6Q8    2e-142   46.13%   Polypeptide N-acetylgalactosaminyltransferase 3 n=11 Tax=Pancrustacea RepID=F4W6Q8_ACREC
NCBI RefSeq      XP_001946977.1     8e-148   47.24%   PREDICTED: similar to AGAP008229-PA [Acyrthosiphon pisum]
NCBI nr blastp   gi|328723398       1e-147   47.74%   PREDICTED: polypeptide N-acetylgalactosaminyltransferase 3-like isoform 1 [Acyrthosiphon pisum]
NCBI nr blastx   gi|328723398       4e-146   47.74%   PREDICTED: polypeptide N-acetylgalactosaminyltransferase 3-like isoform 1 [Acyrthosiphon pisum]
Group
KEGG pathway     tca:664097   4e-144
                 K00710 (GALNT) maps-> O-Glycan biosynthesis
InterPro domain  [164-349]   IPR001173   3.2e-25   Glycosyl transferase, family 2
                 [460-596]   IPR008997   3e-15     Ricin B-related lectin
Orthology group  MCL15904   Insect specific

Nucleotide sequence:

>DPOGS200434-TA
ATGTACATCACAAATAAGAAAATTCAGTCTTTCAAATGCAGAGGGCGTAAGAAAAAAATTGTTGGTTTAGTTATAGTATTATTATTTTTGATAAATGCAATGTTTTATGTGAGTTTAGAGTTATTAAATACAATCAAAAGGAGAAACAAGATTGCCTGGTATAATAATTTATCATATGACCAAGATTCTTATATGGATAGAAGTGGTATGAGGGTGATTGTTGGTCATTATGTTGGAGGTCATGGAGGGGGTAATTTATCTGAAGATGTCATAAACACAAATCACTATTCTCCGGTACAAGGAGCTGGCGAGGGTGGCCGACCAGTCCAGCTTCTCCCCAAGGAGATCATACCGGCTAGGGAGCTTTACAGCTTACATTCCTATAACATATTTGTTAGTGATAGGATATCTATAAATAGACATCTTCCGGATATGAGGAGTGAAAGTTGCAGAAATGTGAAATACGATATAGAAAATCTCCCTACAGCAAGTGTCATAATAGTCTTCCACAACGAAGCTTGGTCCACTCTCATGAGGACTGTCATGTCTGTTATATTGAGGTCACCAGATATGTTATTGAAGGAGATAATCCTAGTAGATGATGCTAGTGAAAGAAAATATTTAGGTAAGGAGCTGGACGATGCTGTTGCCAATTTAGATAAAGTGGTTATATTGAGAAGTTTGAATAGGACTGGTTTAGTTGGTGCTAGGCTCATGGGCGCCAAAACAGCCACGGGGAACGTATTAGTTTTTTTAGATGCACATTGTGAGGTAACAAAAGGTTGGTTGGAACCGCTTTTGGATAGGGCTGGGAGTGATGACGTTTTTATATGTCCTCATATCGATCTGTTGTCCGATGATACATTGGCTTACACAAAGAGTATTGACGCTCACTGGGGCGCTTTTAGCTGGCGTCTACACTTCAGATGGCTGATGCCAAGTAATGAGATAATGATGAATAAATCCAGGTATCCTTCCAAACCGTTTCCAACACCGGCCATGGCTGGCGGATTATTTGCTGTAAGAAAAAGTTTGTTCTGGCGTCTGGGTGGCTATGACGAGGAAATGTCGATTTGGGGTGGAGAAAATCTGGAACTTTCATGGCGAGCGTGGCAATGTGGTGCCAGAGTAGAGATAACGCATTGCTCAAGAGTTGGTCATATATTTAGACGCCATAGCCCATACAAATACCCTGGGGGAGTATTTAAAGTGCTCAATACAAATTTAGCGCGAGCAGCAACAGTTTGGATGGATGAATGGGCAGATTTCTTCTTCAAATTTAACCCATCTGTGGCCGCCATACGCGATACGCAAAATGTAGCAAATCGTATCGAGCTGCGGAAAAATTTGAAATGTAAGAGCTTTAAGTGGTATCTGGAAAATGTGTGGCCCAAAAATTTCTTTCCCAGTGATGAAAGGTGGTTTGGCAGAATACGGAATGATAAAGAAGGTTGTATAGGCGTCGTTGGTGGAACACCAGGCTTGGGAGGTCCCGCATCCGGTGTACATTGCGGTAGTGATCTTGATCTTGACAGACTGTTAGTCTACACCCCCGATGGTAATATAATGGCGGACGAAGGTCTATGCCTCCAGCAAGGAAATGGGAGATCTGTATGGAAAAGCTGTAGTGAAAATAAGAAACAAATATGGAAGCAGAAAGGTCCAAGGTTGGTAACTCTAGATGGACTATGCTTGACGATGCTTAAAGCGGATGACAAACAGCCCTTTGGTGCTTTAACAGCTAAGAGATGTCTCAACGATAGTCGACAAATATGGCATTTCGAAAGAGTGCCCTGGCGATGA

Protein sequence:

>DPOGS200434-PA
MYITNKKIQSFKCRGRKKKIVGLVIVLLFLINAMFYVSLELLNTIKRRNKIAWYNNLSYDQDSYMDRSGMRVIVGHYVGGHGGGNLSEDVINTNHYSPVQGAGEGGRPVQLLPKEIIPARELYSLHSYNIFVSDRISINRHLPDMRSESCRNVKYDIENLPTASVIIVFHNEAWSTLMRTVMSVILRSPDMLLKEIILVDDASERKYLGKELDDAVANLDKVVILRSLNRTGLVGARLMGAKTATGNVLVFLDAHCEVTKGWLEPLLDRAGSDDVFICPHIDLLSDDTLAYTKSIDAHWGAFSWRLHFRWLMPSNEIMMNKSRYPSKPFPTPAMAGGLFAVRKSLFWRLGGYDEEMSIWGGENLELSWRAWQCGARVEITHCSRVGHIFRRHSPYKYPGGVFKVLNTNLARAATVWMDEWADFFFKFNPSVAAIRDTQNVANRIELRKNLKCKSFKWYLENVWPKNFFPSDERWFGRIRNDKEGCIGVVGGTPGLGGPASGVHCGSDLDLDRLLVYTPDGNIMADEGLCLQQGNGRSVWKSCSENKKQIWKQKGPRLVTLDGLCLTMLKADDKQPFGALTAKRCLNDSRQIWHFERVPWR-
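
The lengths listed above fit together: 1803 bp corresponds to 600 codons plus one stop codon, and the trailing "-" in the protein record appears to mark that stop. Below is a minimal Python sketch for checking this relationship on a coding sequence. It assumes the standard genetic code (NCBI translation table 1), which this record does not state explicitly. The names `CODON_TABLE` and `translate` are illustrative and are not part of the database.

```python
from itertools import product

# Assumption: standard genetic code (NCBI translation table 1).
# The amino-acid string is ordered by codon with bases in T, C, A, G order,
# third position varying fastest (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon.

    Stop codons are written as '-' to mirror the protein record above.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append("-" if aa == "*" else aa)
    return "".join(protein)


# First six codons and last four codons of DPOGS200434-TA.
print(translate("ATGTACATCACAAATAAG"))  # MYITNK
print(translate("CCCTGGCGATGA"))        # PWR-
```

Passing the full DPOGS200434-TA sequence to `translate` and comparing the result with DPOGS200434-PA would be a simple consistency check between the two records.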