Monarch geneset OGS2.0

DPOGS210837
Transcript: DPOGS210837-TA, 930 bp
Protein: DPOGS210837-PA, 309 aa
Genomic position: DPSCF300027 + 52224-53310
RNAseq coverage: 14x (Rank: top 82%)
Annotation
Heliconius: (no entry)
Bombyx: BGIBMGA003911-TA; E-value 1e-37; identity 41.72%
Drosophila: alpha4GT1-PA; E-value 5e-40; identity 29.51%
EBI UniRef50: UniRef50_D6WW57; E-value 2e-41; identity 30.62%; Putative uncharacterized protein n=1 Tax=Tribolium castaneum RepID=D6WW57_TRICA
NCBI RefSeq: XP_969928.1; E-value 4e-42; identity 30.62%; PREDICTED: similar to GA14400-PA [Tribolium castaneum]
NCBI nr blastp: gi|380019673; E-value 7e-42; identity 34.17%; PREDICTED: uncharacterized protein LOC100863408 [Apis florea]
NCBI nr blastx: gi|380019673; E-value 5e-41; identity 34.17%; PREDICTED: uncharacterized protein LOC100863408 [Apis florea]
Group
Gene Ontology: GO:0005795; 9.9e-17; Golgi stack
 GO:0008378; 9.9e-17; galactosyltransferase activity
KEGG pathway: tca:658448; 1e-41
 K01988 (A4GALT) maps to Glycosphingolipid biosynthesis - globo series
InterPro domain: [188-302] IPR007652; 9.9e-17; Alpha 1,4-glycosyltransferase domain
 [55-167] IPR007577; 4.6e-10; Glycosyltransferase, DXD sugar-binding motif
Orthology group: MCL10812; Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210837-TA
ATGATACATATGGATTTTGGTCAAGAGTTTTCACGGGGTCCGCCGGCCAATGTATCGTGTCACTACGTAAACAGCAACTTTAGTTTGCCGCTTTTACAATCAACTTTACCACAAAGATCAATATTCTTCCACCAAACTTCTTGTCCAGCGAACCTGACCCCCAGGCAGGTCTGTTCCATCGAATCGGCTGCTAGAACTCATCCGGCTTGGCAAATTAACGTTATATTCTCAGGTCCTGTACATTTAGAGACAGTGAATGGCGTAAAATTATTAAAAAGCTTCACTAATATTAAATTCTGGACGATTAATATAAAGGATTTCGCCACGAATACTCCGTTGGAAGGATTGGTTAGCAGTGGTGTGTTAAGTAAATGCAAATGGGGCATGGAGCGCACGAGAGATGTGTTGAAATATTTATCATTATACAGGTTTGGGGGTATATTCTTAGATTTGGACATTATAATTGCCCGCACCTTAGGCTCTTTGGCCAGGAATTGGGCGGCGAGGGAAAACGCAAATAAAGTAGGAGATGGTATATTAGCTATTTCCAAGAATAGTATAGGACATAATATCACTGACGCTGCGATCAGGTATATCGTGTCAATTTACAAGAATAACGATTGGTGTAAGGAAAGTCAGGATGTGGTGATGGGGGTGCTCCAAGAATTATGTTCCACTCCCGATGCGAACTATATGTCCGCAGCAACTTGTAACGGTTTCGAAGTTTATGGCTCACAATTCTTCTATCCAATTGAAAAGCAATCGGCCCGCGAATATTTTGTTCCTGGAGAAGTACAAGACCTCAGCGCCTATATTTATCACCTATGGGGAGATGTTACGAATGGATATAAAATTTCTAAGTCTTCTCCATACTCTAAACTTGCTAGAAGGTTCTGTCCTTTCAATTCATTATTAAATATAAAAAAGTAA

Protein sequence:

>DPOGS210837-PA
MIHMDFGQEFSRGPPANVSCHYVNSNFSLPLLQSTLPQRSIFFHQTSCPANLTPRQVCSIESAARTHPAWQINVIFSGPVHLETVNGVKLLKSFTNIKFWTINIKDFATNTPLEGLVSSGVLSKCKWGMERTRDVLKYLSLYRFGGIFLDLDIIIARTLGSLARNWAARENANKVGDGILAISKNSIGHNITDAAIRYIVSIYKNNDWCKESQDVVMGVLQELCSTPDANYMSAATCNGFEVYGSQFFYPIEKQSAREYFVPGEVQDLSAYIYHLWGDVTNGYKISKSSPYSKLARRFCPFNSLLNIKK-
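
The transcript and protein entries above can be cross-checked against each other. The following is a minimal sketch, not part of the original record. It assumes that the listed transcript is coding sequence only (it starts with ATG and ends with TAA), that the standard genetic code applies, and that the trailing "-" in the protein sequence marks the stop codon. The helper names `translate` and `lengths_consistent` are hypothetical and introduced only for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon (hypothetical helper).

    Stop codons are written as `stop_symbol`, matching the "-" that ends
    the protein sequence in this record (an assumption about its meaning).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def lengths_consistent(transcript_bp, protein_aa):
    """True if the transcript is exactly the protein's codons plus one stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)


# First 30 and last 24 nucleotides of DPOGS210837-TA, copied from the record.
head = "ATGATACATATGGATTTTGGTCAAGAGTTT"
tail = "TCATTATTAAATATAAAAAAGTAA"

print(translate(head))               # MIHMDFGQEF
print(translate(tail))               # SLLNIKK-
print(lengths_consistent(930, 309))  # True
```

The two translated fragments match the start (MIHMDFGQEF) and end (SLLNIKK-) of DPOGS210837-PA, and 930 bp equals 3 x (309 + 1), so the listed lengths agree under the stated assumptions.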