Monarch geneset OGS2.0

DPOGS215452
Transcript         DPOGS215452-TA    1272 bp
Protein            DPOGS215452-PA    423 aa
Genomic position   DPSCF300098 - 669366-676092
RNAseq coverage    718x (Rank: top 18%)
Annotation
Heliconius         HMEL008363          1e-167    74.35%
Bombyx             BGIBMGA007485-TA    2e-139    68.48%
Drosophila         beta4GalNAcTA-PA    3e-83     48.46%
EBI UniRef50       UniRef50_Q6J4T9     3e-163    71.67%    Beta 1,4-N-acetylgalactosaminyltransferase n=6 Tax=Arthropoda RepID=Q6J4T9_TRINI
NCBI RefSeq        XP_001662147.1      1e-91     48.65%    beta-1,4-galactosyltransferase [Aedes aegypti]
NCBI nr blastp     gi|47156063         1e-162    71.67%    beta 1,4-N-acetylgalactosaminyltransferase [Trichoplusia ni]
NCBI nr blastx     gi|47156063         5e-167    71.67%    beta 1,4-N-acetylgalactosaminyltransferase [Trichoplusia ni]
Group
Gene Ontology      GO:0016757    2.1e-142    transferase activity, transferring glycosyl groups
                   GO:0005975    2.1e-142    carbohydrate metabolic process
KEGG pathway       cel:Y73E7A.7  4e-81
                   K07968 (B4GALT3) maps-> Glycosphingolipid biosynthesis - lacto and neolacto series
                                           Glycosaminoglycan biosynthesis - keratan sulfate
                                           N-Glycan biosynthesis
InterPro domain    [151-419] IPR003859    2.1e-142    Galactosyltransferase, metazoa
Orthology group    MCL10856    Multiple-copy universal gene

Nucleotide sequence:

>DPOGS215452-TA
ATGATGATAATGGGACCCTTCCATAGAGCTCCCTCCTTAAACAAACTTAAGTCACAGGCTGCGCCATATGCGTTGATGGGCACCGCGGGCGGAGGGCGGGCGGCGCGGGCGCTCCGGCTTCTGCTTCTGCTAGTGCTGGCGCTGGCCGCTGTCGAGTACCTGTTCGGTTCCATCCTGGACGCATCACCTCTCAGGACCTATCTGTATGCACCGATGCACAACGTCACGCCGTCTATCAAAAACGATAAACAGATTATGGCTAAAAAACTTCTCACACAAGGAACAGAATCAGTTACAAACTACACACACACAACAAACAGTTCAAATAAAAATCCAGCCAAGGAAACATTTAACATGACGAAACCCAATCTATCTGACGACACGAGCACGCCGCTTTTAATCACCAAGATTATGGAGAGCATAAAGAATTTGGTAACCACAGAAGAAGACTTCAGAGATGAACCATCCTTACCGCTCTGTGATGAAATGCCGCCAGATCTAGGTCCCATATCAGTGAACAAGACTGAGATTGAACTAGATTGGGTGGAGAAGAGGTACCCGGAGGTCCGGAGTGGAGGAATATACTCTTCCTCTAATTGCACAGCCAGACATAGAGTTGCTATCATAGTACCCTACAGGGACCGTCAACAACACCTAGCGATATTCCTGAACCACATGCATCCATTCTTGATGAAACAGCAGATAGAATACGGAATATATATAATTGAACAAGAAGGTACCAGCGAATTTAATCGCGCGAAGCTGATGAACGTAGGCTTCGTGGAGAGTCAGAGACAGAGGTCGTGGCAGTGCTTCATCTTCCACGACATAGACCTCCTTCCTCTAGACTCACGGAACATGTACTCGTGTCCGAAACAACCGCGTCACATGTCCGCATCTATAGACAAACTCAACTTTAGGTTACCATACGAAGATATATTCGGAGGCGTCTCAGCTATGACACTGGAACAGTTCACGAAGGTGAACGGATTCTCCAACAAGTACTGGGGCTGGGGTGGAGAAGACGACGATATGTTTTATAGATTGAAAAAAATGAATTACCACATAGCGAGGTATAAAATGTCAATTGCAAGATACGCCATGTTAGATCATAAGAAGTCAGCGCCTAATCCTAAGAGATATCAGTTGTTATCACAGACGAGCAAAACATTTCAGAAAGACGGTCTATCGACGCTGGAATACGAAGTAATAAAGGTGACGGCCAACCATCTCTACACGCACATACTAGTGAACATAGACGAGCGCAGCTGA

Protein sequence:

>DPOGS215452-PA
MMIMGPFHRAPSLNKLKSQAAPYALMGTAGGGRAARALRLLLLLVLALAAVEYLFGSILDASPLRTYLYAPMHNVTPSIKNDKQIMAKKLLTQGTESVTNYTHTTNSSNKNPAKETFNMTKPNLSDDTSTPLLITKIMESIKNLVTTEEDFRDEPSLPLCDEMPPDLGPISVNKTEIELDWVEKRYPEVRSGGIYSSSNCTARHRVAIIVPYRDRQQHLAIFLNHMHPFLMKQQIEYGIYIIEQEGTSEFNRAKLMNVGFVESQRQRSWQCFIFHDIDLLPLDSRNMYSCPKQPRHMSASIDKLNFRLPYEDIFGGVSAMTLEQFTKVNGFSNKYWGWGGEDDDMFYRLKKMNYHIARYKMSIARYAMLDHKKSAPNPKRYQLLSQTSKTFQKDGLSTLEYEVIKVTANHLYTHILVNIDERS-
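
The transcript and protein lengths in this record fit together: 1272 bp is 424 codons, that is, 423 amino acids plus a terminal stop codon, which the protein sequence shows as a trailing "-". The sketch below is one way to check that the two sequences agree. It assumes the standard genetic code and an unambiguous A/C/G/T coding sequence; the function name `translate` and the `stop_symbol` parameter are illustrative and not part of the database.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (assumption: the record uses the standard nuclear code).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are rendered as `stop_symbol` to mirror the trailing
    "-" in the protein record. Raises ValueError if the length is not
    a multiple of three, and KeyError on bases other than A/C/G/T.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First six codons and last four codons of DPOGS215452-TA.
print(translate("ATGATGATAATGGGACCC"))
print(translate("GAGCGCAGCTGA"))
```

Passing the full DPOGS215452-TA sequence to `translate` and comparing the result with the DPOGS215452-PA sequence gives a quick consistency check for the record.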