Monarch geneset OGS2.0

DPOGS207662
Transcript: DPOGS207662-TA  1767 bp
Protein: DPOGS207662-PA  588 aa
Genomic position: DPSCF300133 + 99610-103611
RNAseq coverage: 643x (Rank: top 20%)
Annotation
Heliconius: HMEL002668  0.0  74.49%
Bombyx: BGIBMGA010523-TA  0.0  83.83%
Drosophila: GalNAc-T2-PA  0.0  63.18%
EBI UniRef50: UniRef50_Q8MV48  0.0  63.18%  N-acetylgalactosaminyltransferase 7 n=13 Tax=Neoptera RepID=GALT7_DROME
NCBI RefSeq: XP_973938.1  0.0  66.44%  PREDICTED: similar to n-acetylgalactosaminyltransferase [Tribolium castaneum]
NCBI nr blastp: gi|91081797  0.0  66.44%  PREDICTED: similar to n-acetylgalactosaminyltransferase [Tribolium castaneum]
NCBI nr blastx: gi|91081797  0.0  66.44%  PREDICTED: similar to n-acetylgalactosaminyltransferase [Tribolium castaneum]
Group
KEGG pathway: tca:662767  0.0
 K00710 (GALNT) maps-> O-Glycan biosynthesis
InterPro domain: [456-581]  IPR008997  5.3e-29  Ricin B-related lectin
 [142-321]  IPR001173  4.7e-26  Glycosyl transferase, family 2
 [465-579]  IPR000772  1e-17  Ricin B lectin
Orthology group: MCL13968  Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS207662-TA
ATGCGGATAACAAACATAAGACGCGGTAGAATTTTCCGTACAACCGGTTTGTGTGCGATTCTTTTGCTATTTATTTATGCTATACACAGTTACTCTCGTTCAAATGGCAACGTTGATTCTTATGAAAAGCTAATTGTTAATGAACCCCTTAGCAAATACAAATTTCACAAGGATATATACCCTACTTTACAAACTGAGCTAGGCAACTTTGAGCCTCGTTCCAAGAATATTGTAAAAGGGCCGGGTGAAGAGGGCCTACCATACCACATGCCTCAGGACAGAGCCAATGATATAGCTGAGTCGGAAAGCGAATACGGCATGAACATAGCTGCATCTAATGATATTGCTATGAACAGATCAATTCCGGACACTCGTCTGGATGAATGTAAATATTGGCACTATCCTGAAGAACTGCCGAGTACATCAGTAATTATTGTGTTCCACAACGAAGGTTTCTCGGTGCTCATGAGGACCGTGCACACTGTCATAGATCGTTCACCGCCTAACATATTGAAGGAGGTTGTTATGGTTGACGATTTTTCAGATAAAGACGATTTAAAAGAAAACTTAGACAACTATGTTAAACGTTGGAAAGGCAAAGTGAGAATAATAAGAAACAGTGAAAGACAGGGTCTGATACGTACCAGATCGAGAGGGGCTATGGAAGCGACGGGGGAGGTCATAGTATTTTTGGACGCTCACTGCGAGGTCAACGTCAACTGGTTACCGCCACTACTCGCTCCCATATACAGGGACTACAAGATCATGACCGTACCAGTTATAGATGGTATCGACCACAAAACCTTCGAATACAGACCGGTTTACTCGCATGGTATTAATTATAGAGGTATATTCGAATGGGGTATGCTTTACAAAGAAAACGAAGTACCTGACAGGGAAGCCAGTTTGCACAAACATAAATCTGAACCATACAAAAGTCCTACCCACGCTGGTGGTCTTTTCGCTATAAACAGGAATTATTTCCTTGAAATCGGTGCATACGATCCCGGTCTTTTGGTATGGGGTGGAGAGAATTTCGAATTAAGCTTCAAGATTTGGCAATGCGGCGGTAGTATTGAATGGGTGCCATGCTCCAGGGTCGGTCACGTGTATAGAGCCTTCATGCCGTACTCGTTCGGAAATCTAGCTAAAAACCGGAAAGGATCTCTCATCACAATTAATTACAAACGGGTCATTGAAACTTGGTTCGATGAGGAGCATAAGGAATTTTTCTATACAAGGGAACCCATGGCCAGGTTTCTGGATATGGGCGACATCAGTGAACAAGTAGCCCTGAGGGACAAATTGAACTGCAAGAGCTTCAGTTGGTACATGGAGAATGTCGCTTATGACGTATATGATAAATTCCCCAAATTACCCAAAAATGTTCATTGGGGTATGGTGAAGAATAAAGCAATCGGCCTGTGTCTAGATACTATGGGAAAAGCAGCTCCTTCATATATTGGTATACAGTCCTGTCATGGGGCTGGGAACAATCAGCTGTACAGATTGAATGAGGCGGGACAGTTGGGTGTTGGCGAGAGATGTCTGGAAGCCGATACGGACAGCCTCAAACAGACGATCTGCCGGCTAGGGACTGTTGACGGACCTTGGAGGTACGACAAAGAGCGCAGCCATCTGATACACAGGTTGCACAGCTATTGTCTGACCCTGCAGCCCAATTCCAGAACACTTGGTCTGGCTCCTTGCGACCCCAACAATACTTATCAACAGTGGACCATAACGCAGAAGAACCCCAAGTGGTGA

Protein sequence:

>DPOGS207662-PA
MRITNIRRGRIFRTTGLCAILLLFIYAIHSYSRSNGNVDSYEKLIVNEPLSKYKFHKDIYPTLQTELGNFEPRSKNIVKGPGEEGLPYHMPQDRANDIAESESEYGMNIAASNDIAMNRSIPDTRLDECKYWHYPEELPSTSVIIVFHNEGFSVLMRTVHTVIDRSPPNILKEVVMVDDFSDKDDLKENLDNYVKRWKGKVRIIRNSERQGLIRTRSRGAMEATGEVIVFLDAHCEVNVNWLPPLLAPIYRDYKIMTVPVIDGIDHKTFEYRPVYSHGINYRGIFEWGMLYKENEVPDREASLHKHKSEPYKSPTHAGGLFAINRNYFLEIGAYDPGLLVWGGENFELSFKIWQCGGSIEWVPCSRVGHVYRAFMPYSFGNLAKNRKGSLITINYKRVIETWFDEEHKEFFYTREPMARFLDMGDISEQVALRDKLNCKSFSWYMENVAYDVYDKFPKLPKNVHWGMVKNKAIGLCLDTMGKAAPSYIGIQSCHGAGNNQLYRLNEAGQLGVGERCLEADTDSLKQTICRLGTVDGPWRYDKERSHLIHRLHSYCLTLQPNSRTLGLAPCDPNNTYQQWTITQKNPKW-
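
The transcript and protein lengths listed in this record are consistent with each other: 1767 bp is 589 codons, that is, 588 amino acids plus one stop codon (shown as "-" at the end of the protein sequence). A minimal Python sketch of that check follows, assuming the standard genetic code (NCBI translation table 1). The helper name `translate` and the use of "*" for stop codons are illustrative choices and not part of the record; only the first and last few codons of DPOGS207662-TA are used as input.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons ordered
# TTT, TTC, TTA, TTG, TCT, ... (bases in T, C, A, G order).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; '*' marks a stop codon.

    Hypothetical helper for illustration. Trailing bases that do not fill
    a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# Lengths as listed in the record.
transcript_bp = 1767
protein_aa = 588

# 588 amino acids plus one stop codon should account for the whole transcript.
lengths_agree = transcript_bp == 3 * (protein_aa + 1)

# First 8 codons and last 3 codons of DPOGS207662-TA, copied from the record.
first_codons = translate("ATGCGGATAACAAACATAAGACGC")
last_codons = translate("AAGTGGTGA")

print(lengths_agree, first_codons, last_codons)
```

The translated fragments can be compared by eye with the start (MRITNIRR...) and end (...KW-) of DPOGS207662-PA. The genomic span 99610-103611 is longer than the 1767 bp transcript, so a full check against the scaffold would also need the exon coordinates, which are not listed here.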