Monarch geneset OGS2.0

DPOGS200445
Transcript         DPOGS200445-TA   2115 bp
Protein            DPOGS200445-PA   704 aa
Genomic position   DPSCF300236 + 537848-553190
RNAseq coverage    727x (Rank: top 18%)
Annotation
Heliconius       HMEL003641         8e-111   41.46%
Bombyx           BGIBMGA008904-TA   0.0      83.45%
Drosophila       pgant5-PA          4e-176   63.96%
EBI UniRef50     UniRef50_Q6WV17    7e-174   63.96%   Polypeptide N-acetylgalactosaminyltransferase 5 n=28 Tax=Coelomata RepID=GALT5_DROME
NCBI RefSeq      XP_002064617.1     0.0      64.11%   GK23729 [Drosophila willistoni]
NCBI nr blastp   gi|195433228       0.0      64.11%   GK23729 [Drosophila willistoni]
NCBI nr blastx   gi|194761562       0.0      64.56%   GF15722 [Drosophila ananassae]
Group
KEGG pathway      dwi:Dwil_GK23729   0.0
                  K00710 (GALNT) maps-> O-Glycan biosynthesis
InterPro domain   [565-693]   IPR008997   3.4e-38   Ricin B-related lectin
                  [183-367]   IPR001173   2e-30     Glycosyl transferase, family 2
                  [570-693]   IPR000772   2.5e-30   Ricin B lectin
Orthology group   MCL11025   Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200445-TA
ATGTTTAGAAGCAAAATAAGGATACACACATGTCGCATAATTTTACTCACATCATTAGTGTGGTTATTAGTTGATGTTGCCTTGTTAGCACTCTATTCAGATTGTTTTGGTGATGGATGGGATTGCAATAAAAATAAAAATCTAAATAATGACTACACAATCAAGACAAATGATGAGTACAAAGGAAAGAAAGCAGCCATAGCAGCAGCTTTACAGAAGGATGAAGATTTTGATGAGACAAATTTAGAAGAGAATGAAGTGGAACATGACATGGGTGATGATGGTTTAATTTTACCACCCTATCCAAAATCACAGCTTAGGAGATGGGCCCCAGCACCATTTGTTAAACCCCAGGAAGAGACTCCTGGTGAAATGGGTAAGGCAGTTAACATACCGATTGAACAAGAAAAAGTGATGTTGGAAAAGTTCCAAGAGAATCAGTTCAATTTACTCGCAAGTGACATGATATCACTGAACAGATCACTCACTGATGTTAGATTTGAAAAATGTAAAGCCAAACGCTATCCGACACTTTTGCCGACGACGAGTGTAGTTATAGTTTTCCATAATGAAGCGTGGACTACACTACTTAGGACAATATGGAGTACAATCAATCGGTCTCCCAGACCGCTGTTGAAGGAGATCATTCTCGTCGACGATGCCAGCGAAAAAGAACATCTAGGTAAGAAATTGGAAGAATATATAAAGACCCTGCCAGTTTCTACCCGGTTGTTCCGTACAGAGAGTCGATCGGGTTTAATAAGAGCCAGATTGCTTGGAGCCAAACACGTTAAAGGGGATGTCATAACGTTTTTGGACGCTCATTGTGAATGTACCGAGGGATGGTTGGAGCCGTTGTTATCACGGATCGTTGAGGACAGGAGTACGGTGGTGTGTCCTATTATAGATGTTATATCGGACACAACCTTCGAATATATACAGGCGTCTGATATGACCTGGGGCGGATTCAACTGGAAACTGAACTTTAGATGGTATCGTGTCCCAGAACGCGAGATGCAGCGCCGTGGTGGTGACCGCACCGCTCCTCTGCGTACACCCACCATGGCTGGCGGCTTGTTCGCCATCGATCGTGAATACTTCTACAAGATAGGATCCTATGATGAGGGCATGGATATATGGGGTGGGGAGAACTTGGAGATGAGCTTCAGGCTTGGTCCATACAAATGTACCAGTGTTACCAACATAAAATCCGCGGCTAAGGTATGGCAGTGCGGCGGCGTGCTGGAGATCGTTCCGTGCTCTCACGTGGGCCACGTGTTCAGGGACAAGTCCCCCTACTCCTTCCCCGGGGGGGTACAGGCCGTGGTGCTGAAGAACGCGGCCAGGGTCGCAGAAGTTTGGATGGACGAATGGGGGGAATTCTATTACGCCATGAACCCAGGTGTGTGGATGTGCGGTGGTACTCTGGAGATAGCCCCGTGCTCGCACGTGGGTCACGTGTTCAGGAAGACCACGCCGTATTCCTTCCCCGGCGGCACAGGCCGCGTCGTGAACCACAACAACGCCCGTCTAGCTGAAGTCTGGCTCGACGACTGGAAACATTTCTACTACAATATTAACCCAGGCGCTCTCAACGTACCCGTGGGCGACGTGAGCGAGCGGAAGGCGCTCCGTGAGCGTCTCAAGTGTAAAAGCTTCAGGTGGTACCTCGAAAACATATATCCAGAAAGTCAAATGCCATTGGATTATTACTATTTGGGAGAGATACGGAACGCGGAAACATCGAACTGTTTGGATACATTGGGTGGGAAGGCCGGGCAGCCGCTGGGTATGGGATACTGTCACGGGATGGGGGGAAACCAGGTGTTCGCGTATACTAAACGCAAGCAGATCATGTCGGATGACAATTGTTTGGACGCAGCTCACCCTCGCGGACCAATCAAGCTGATACGATGTCATGGGATGAGGGGAAATCAAGAGTGGACGTATGATACTAAGAGCCGTACAATAAAGCACACCAACACTGGCATGTGTCTCGA
CAAGCCAGAGTCTACAGACGTTTGGAAGCCGGTGTTGAGGTCCTGCGACAGGTCCAGAGGTCAACAGTGGCTGATGCAGGTCGACTTCAAGTGGCAAGCGAGGCATTCCAGCTAG

Protein sequence:

>DPOGS200445-PA
MFRSKIRIHTCRIILLTSLVWLLVDVALLALYSDCFGDGWDCNKNKNLNNDYTIKTNDEYKGKKAAIAAALQKDEDFDETNLEENEVEHDMGDDGLILPPYPKSQLRRWAPAPFVKPQEETPGEMGKAVNIPIEQEKVMLEKFQENQFNLLASDMISLNRSLTDVRFEKCKAKRYPTLLPTTSVVIVFHNEAWTTLLRTIWSTINRSPRPLLKEIILVDDASEKEHLGKKLEEYIKTLPVSTRLFRTESRSGLIRARLLGAKHVKGDVITFLDAHCECTEGWLEPLLSRIVEDRSTVVCPIIDVISDTTFEYIQASDMTWGGFNWKLNFRWYRVPEREMQRRGGDRTAPLRTPTMAGGLFAIDREYFYKIGSYDEGMDIWGGENLEMSFRLGPYKCTSVTNIKSAAKVWQCGGVLEIVPCSHVGHVFRDKSPYSFPGGVQAVVLKNAARVAEVWMDEWGEFYYAMNPGVWMCGGTLEIAPCSHVGHVFRKTTPYSFPGGTGRVVNHNNARLAEVWLDDWKHFYYNINPGALNVPVGDVSERKALRERLKCKSFRWYLENIYPESQMPLDYYYLGEIRNAETSNCLDTLGGKAGQPLGMGYCHGMGGNQVFAYTKRKQIMSDDNCLDAAHPRGPIKLIRCHGMRGNQEWTYDTKSRTIKHTNTGMCLDKPESTDVWKPVLRSCDRSRGQQWLMQVDFKWQARHSS-