Monarch geneset OGS2.0

DPOGS202401
Transcript: DPOGS202401-TA; 1788 bp
Protein: DPOGS202401-PA; 595 aa
Genomic position: DPSCF300233 - 188868-193260
RNAseq coverage: 406x (Rank: top 30%)
Annotation
Heliconius: HMEL007418; 4e-56; 83.72%
Bombyx: BGIBMGA003295-TA; 0.0; 69.18%
Drosophila: CG11851-PA; 1e-142; 41.83%
EBI UniRef50: UniRef50_E2B716; 1e-143; 45.83%; Alpha-1,2-mannosyltransferase ALG9 n=2 Tax=Formicidae RepID=E2B716_HARSA
NCBI RefSeq: XP_971096.1; 7e-144; 48.46%; PREDICTED: similar to CG11851 CG11851-PA [Tribolium castaneum]
NCBI nr blastp: gi|332373836; 3e-145; 44.95%; unknown [Dendroctonus ponderosae]
NCBI nr blastx: gi|91084169; 2e-152; 48.55%; PREDICTED: similar to CG11851 CG11851-PA [Tribolium castaneum]
Group
Gene Ontology: GO:0006506; 2.8e-101; GPI anchor biosynthetic process
    GO:0016757; 2.8e-101; transferase activity, transferring glycosyl groups
    GO:0031227; 2.8e-101; intrinsic to endoplasmic reticulum membrane
KEGG pathway: tca:659725; 2e-143
    K03846 (ALG9) maps-> N-Glycan biosynthesis
InterPro domain: [55-575] IPR005599; 2.8e-101; GPI mannosyltransferase
Orthology group: MCL14458; Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202401-TA
ATGCCACTTACTGCTCGACAACGAAGTATTCACAATAAAAATGGAGCTAAACGTACTGCTAGTTTTACTAAAGGCAGAAGATCGCAGAAATCTAACGAAGGAGAATATCTGTTTACAAATGCTCCATCAGACCTTCGAAGTATAGCATATCCTGGAGGAGCAGTGGCCCTAGCACTGCTCCTGTCAGCTCGCCTTGCATCTGCTTACTGGGGACAAATAGCAGACTGTGATGAAACTTACAATTATTGGGAGCCATTGCATTATTTGGTATACGGCAGCGGTCTTCAAACGTGGGAATACAGTGCCCAATATGCCATTCGATCATATATGTCTCTCTGGTTGTTTGCCGTACCGGCTAAAATACTATCCCTCATAATGACTCCTGTGACTATATTTTATACTCTGAAAGCACTACTGGCAGTGTTGATGGCTTGTTCTGAACTGATGTTTTATAAAGCCGTGTGCCATGAGTTTGGGGTCCATGTTGGTCGAGTGTGGTTGTTCCTGAACCTTCCAGCGGCTGGGTGCTTTGCTTCATCAGCTGCTATGTTACCGTCGTCTTGGAGCTCAGCGCTGGTGACGGCGGCCCTCGCTTGTTGGTGGCGTCGTAGATATCCGCCCGCTATCTTCCTTATCGCTGCCACTGTACTACTAAGTTGGCCATTCACAGCACTCCTGGGTGTACCGATAGCGGTGGATATGTTGTTATTCAAAGGACTTTTCAAAGAATTCATTAAATGGTCAATGATATCGCTGGTCATAATTCTTCTCCCGACTGTTGCTGTGGACTCCTGGCACTACGGACGTCTTGTGGTCGCTCCGTGGAACATTGTAGCTTATAATATATTCACCGAGCACGGTCCTGATCTGTATGGCGTTGAGCCGTGGACCTATTACTTTGTGAATGGATTCCTTAATTTTAATATTGTATGGGTCTTAGCTCTGTCCTGTCCCCTACTATTGGTCGCGTGTTCTCTTATATCAACTCGGTCGTCGCGTGCGTCGTTCTGTATCCCCTACTGGCTTAGTCTGATGCCATTGGCCTTATGGCTCGCCGTGTTCATGACGCAGCCGCACAAAGAGGAGAGATTTTTATATCCTGTGTACAGTATGATAATACTCTGTGGGGCAATATCCTTGGACTGTCTCCAGAAGATGACCTTCGCTGTCGGAACTGAACTGCTCCGCTGGAGGAAGGAGAGGGAAAGGCGACATTATCTAGTGTACACCGGGCCACTCGTAGTCATGTGTGTCTTGTTGGCCGGACTGTTGAGTATATCCCGTATTATAGCGTTACACAGTCATTACGGCTCAGTTTCTTCGTTGACCAGCCACGTGTCCCCTACTACAGCCGCTGGCGCCACCAACGACGTCCTAGTGTGTTACGGCAAGGACTGGTACCGCTCGTCTTCGAGTTTCCTAGCTCCTGGGCACGTGAGGTTCATCGCCAGCGAGTTCGACGGACAGCTACCCGCGCCATATTCTGTCGGAGCCAATGCAACCCGCGTGATACACCCGTACTTCAACGACCAGAACAGAGGCGATAACCGCACGTACCTACAACCGTCGGAATGCCATTACCTCGTGGACTCGGACGCGGGTAAACCGACGAGACTTCAACCACACTACCACAAGAGGGACGAATGGGAGATAGTTGCGAGAGTACCGCTACTGGACGCTGACAGATCACACAGGATATTCAGAGCCTTCTATGTGCCCGTGTTGACTAACAAGAACTGCGTTTACGCCAATTTGTACTTATTAAAGAACAGGCTTATAGAGTTTTAG

Protein sequence:

>DPOGS202401-PA
MPLTARQRSIHNKNGAKRTASFTKGRRSQKSNEGEYLFTNAPSDLRSIAYPGGAVALALLLSARLASAYWGQIADCDETYNYWEPLHYLVYGSGLQTWEYSAQYAIRSYMSLWLFAVPAKILSLIMTPVTIFYTLKALLAVLMACSELMFYKAVCHEFGVHVGRVWLFLNLPAAGCFASSAAMLPSSWSSALVTAALACWWRRRYPPAIFLIAATVLLSWPFTALLGVPIAVDMLLFKGLFKEFIKWSMISLVIILLPTVAVDSWHYGRLVVAPWNIVAYNIFTEHGPDLYGVEPWTYYFVNGFLNFNIVWVLALSCPLLLVACSLISTRSSRASFCIPYWLSLMPLALWLAVFMTQPHKEERFLYPVYSMIILCGAISLDCLQKMTFAVGTELLRWRKERERRHYLVYTGPLVVMCVLLAGLLSISRIIALHSHYGSVSSLTSHVSPTTAAGATNDVLVCYGKDWYRSSSSFLAPGHVRFIASEFDGQLPAPYSVGANATRVIHPYFNDQNRGDNRTYLQPSECHYLVDSDAGKPTRLQPHYHKRDEWEIVARVPLLDADRSHRIFRAFYVPVLTNKNCVYANLYLLKNRLIEF-
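
Consistency check (illustrative):

The transcript length (1788 bp) and protein length (595 aa) listed in this record are consistent if the transcript is a complete coding sequence whose last codon is a stop codon: 1788 / 3 = 596 codons, of which 595 encode residues. The transcript above ends in TAG and the protein ends in "-", which agrees with that reading. The following is a minimal sketch of that check in Python. The function name `expected_protein_length` is hypothetical and is not part of any MonarchBase tool, and the sketch assumes the input is a CDS with no untranslated regions.

```python
# Hypothetical helper, not part of any MonarchBase tooling.
# Assumption: the input is a complete CDS (no UTRs) ending in a stop codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def expected_protein_length(cds: str) -> int:
    """Return the number of residues implied by a CDS that ends in a stop codon."""
    cds = cds.strip().upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("CDS does not end in a stop codon")
    # Every codon except the final stop codon contributes one residue.
    return len(cds) // 3 - 1


if __name__ == "__main__":
    # Toy CDS built from the first codon and the last three codons of the
    # transcript above: ATG ... GAG TTT TAG
    print(expected_protein_length("ATGGAGTTTTAG"))
```

Passing the full DPOGS202401-TA sequence to the same function should return the protein length listed in the record, provided the sequence is pasted without line breaks or the FASTA header.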