Monarch geneset OGS2.0

DPOGS209247
Transcript        DPOGS209247-TA   3150 bp
Protein           DPOGS209247-PA   1049 aa
Genomic position  DPSCF300111 - 441299-449739
RNAseq coverage   477x (Rank: top 26%)
Annotation
Heliconius      HMEL016738        0.0     69.56%
Bombyx          BGIBMGA007063-TA  9e-146  51.01%
Drosophila      CG3160-PA         1e-99   29.38%
EBI UniRef50    UniRef50_Q16K53   2e-132  31.02%  Gpi inositol deacylase pgap1 n=1 Tax=Aedes aegypti RepID=Q16K53_AEDAE
NCBI RefSeq     XP_001656358.1    4e-133  31.02%  gpi inositol deacylase pgap1 [Aedes aegypti]
NCBI nr blastp  gi|157134542      7e-132  31.02%  gpi inositol deacylase pgap1 [Aedes aegypti]
NCBI nr blastx  gi|270011727      6e-155  33.94%  hypothetical protein TcasGA2_TC005802 [Tribolium castaneum]
Group
Gene Ontology    GO:0006886           3e-58   intracellular protein transport
                 GO:0031227           3e-58   intrinsic to endoplasmic reticulum membrane
                 GO:0016788           3e-58   hydrolase activity, acting on ester bonds
                 GO:0006505           3e-58   GPI anchor metabolic process
KEGG pathway     aag:AaeL_AAEL013115  1e-132
                 K05294 (PGAP1) maps-> Glycosylphosphatidylinositol (GPI)-anchor biosynthesis
InterPro domain  [44-261] IPR012908   3e-58   GPI inositol-deacylase PGAP1-like
Orthology group  MCL14934             Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209247-TA
ATGACATTTATGTTTGAATATCCACAATTTGTACGAATATCTTTGGAGGAAAATAAAAAATATCCTCAATATGGTCTATATGCCTATAGTGAAGGAAGATTTACTGAAAAGGCTAGAAAGATGTGGTTTGATGGAATACCTGTATTATTTTTGCCAGGCAATTCTGGTAGTCACATGCAAGCTAGGTCTTTAGCTTCAGTTGCACTAAGAAAGGCTTTATCAGAATCATATGAATATCATTTCGACTATTTTACGATTAGCTATAATGAAGAATTGTCAGGCTTATATGGGGGAGTTCTTCAAGGGCAGACTGAATTTGCTTCAGCATGTATAAATAAAATACTTACATTATATAAAAGCAATAAATATACTAAGTCAGTACCAACATCAGTAATTCTTATTGGACATTCAATGGGTGGAATTATTGCAAAGAGATTACTAACATATCCATATACAAAAAATTCAACCAACATTGCAATAACTCTAGTAGCACCTTTGAAAGCACCAGTCATTAATTTTGATATACTTTTAAATGAATACTATATGCAAATGGATATGGAATGGATGGAATATAAACTATCAAATTTAAGACATGATAAGATTCTAATCAGCATTGGGAGTGGCCCTCGAGATATGTTGATACCAGCTGGTTTAACAGCTTCCAACTATAGCCATATTAATACTCTGTCTACAGCTATTCCAGGTGTTTGGGTTAGTCCTGATCATGTTAGTATGGTGTGGTGCAAACAGCTAGTGTTAGTTATCAATAGATTCCTCTTTGACATTGTTGATACATGGACCGAACAGATATCGATCAACTCAGCATATATTGATCAAAAAGCTAGACAATATTTTAAGGCCAATCGTTCAACAACTTTAGATAAGTCTATATTACGTCATAACGTCAGTATGCAAGTCGACGGTTTTTGGTATGAAGATAGCAGGAGAATTTATCAGATATCACGACCGGGGATCGAAAGAACAACACATTTAATGATAAGATTAGTAAGTTTCCCTCAAAATAGATTTGTTGCAATCGAATCTACTAATGTTGATGATAAAGATTGGATATTTGGATGCACCGCTAAGGATGTCCACAATAATTACAGATATTGCAAAGAGGCGACATCACTAAGCGAACTGAGTCGATGGTCGGGAGCAGCAACCGATTACGGAAGGAGAAAGTTAGCGACGATCCATCTTCACCAAATAATGGAAGAAGAACCGCAATGGACGCACGTTATTGTTAAAGTTTCGCCAACTAGGAAACCGATTGTACTTAACGTGGATACAAATGATTACGCTTCCAGACAAATAGAAGTATCAGCGCCATCGGATTATTCATTTGGGAAACGCGTTGTAATACCCGAAACTGTACCGAATAGTCTCTATTATGAGCTGATCCTATCAGAATTTAATTTAATACACCAAGCTTATTTATTATACGTGGAACCTACTGAAACATGTAAAGGACAGTATCATGTGTCAGCTGAAGTTCACGTGCCCTGGGCTCAGAATAACGAATATTACCATTATTTCACCCCTCTCAAACGATCCCCGATGAAACTGCGATTGTTTGAAAGCAATCCTAATATAACATTAGGACAAGACGCCACAGAAAAAGTTAAAATTACTTTATTGTTAGACCCACAATGCACCTTTAGTGTAAGTATATCAATATCCTGGTATCACCGTTTGGCTCAACTGTCTCGTAACTATACTCCGATTCTGGTTCCATATGTGGCAGCTATATTGCTACTGGCAGCTAGAAATAATATACTCAATATACAGAGTACTGGATGTCCTTCCATACATAGTGCCTTAATGAGTGACAGTGTTAGACCATATTTTGTGCTAGTCTTTGCCCGTCTAGCCATAACATCATTTATGTCAGTTCCATTTTTATCGTTTCTTTTCGAGAATGCCAGTTGGAGAAATCTTGAATTACAATACTTTGTACGCTCGCTACTTGTGTTACCAACATACATGACAGCTCTCGG
TATTATCAATATTGTTGCTCTGGCCCTGCTTATTATTATGGTATTTTCATCTCAATTGGCACATCGATTGTTATTCAGGATAGTATGGCGTGGTGGAATGGGTCTGGCTGAAAAAATGGCTGTGGGTTTACAGAAAGTACCCATGCTGGTTAGTGCTGCGCTGATATGTGCTGTACCATTGTCTTGTGGGGCTGCATCATTGGCCGCTGGTGCCACATTCTATATGTTCATTCTGTCAAAAATGTATGAAGAATATTTAGAGGATTATGTTTATAAGCTGATGGCAAAATTGGCAAGTCGTATGTGTTATATGTTCAAGGGTAAGAAACCGAAAGAAGACTCAAAACAATGTACTCAAAAAGAAGACAATTCCAAAGATATTACTAATTCAGAGAATCTCAAAGCCATAGAGTTCAAAGAACATTCCAAAAAGGATGAAGATCAACCTGATAAACAAACGAATGATACTAAAAGTAACAATATACAGAAATGTGATAGCAGTGAGAACCTTATAGATGAGGAGCTCAGTAGTATAAATTTTCATGTCATGATGTTCTTTTTGTGGATGGCAGTAACTGTAGTCAATATTCCCGCTTTGTTGACATGGGCACGAAACTTCAAATATAGTATGGTCTTAAAACCTGACACCTCATACTACACTGGCCTCGTCATGGCAGCATGTTCATCAATTATTTGGCAGATGGACAGCCCAAGGAAAAACTTAAGAAACTATGAAATGGTGTCTTCCGCACTATTCATAATGGCTGTATTGATATGTGCTCTGGGACCATTTTCCCTCTCAATTGTAAATTATGGAGTGACATTTATGTTTGCAATAATAACTCTGCAGCAGTTATATGATGTTGATGATAACATTGATGAAAATTTGCTTACACAAGAGCCCTTACAAGATAAAGAAATTACTAATATAGACCAAGAAAATGAGGCAGACAAAAAAGATACTCAAGAAGAAAATTGTAGCAAAAAGGAAGATTTAAATAAAGATGCGGGTGAAAGTCTTAAAACTGATAATTCTTCTGAGGCTGAGTGTGGCGATAAGTCCAGTGACTGCGATAAATGTGATGAGAGTAGAATTTACAGAGTTTTCAAAAATCTCAGAGAAAAATTTAGTTTTGGTGACAATGAATGA

Protein sequence:

>DPOGS209247-PA
MTFMFEYPQFVRISLEENKKYPQYGLYAYSEGRFTEKARKMWFDGIPVLFLPGNSGSHMQARSLASVALRKALSESYEYHFDYFTISYNEELSGLYGGVLQGQTEFASACINKILTLYKSNKYTKSVPTSVILIGHSMGGIIAKRLLTYPYTKNSTNIAITLVAPLKAPVINFDILLNEYYMQMDMEWMEYKLSNLRHDKILISIGSGPRDMLIPAGLTASNYSHINTLSTAIPGVWVSPDHVSMVWCKQLVLVINRFLFDIVDTWTEQISINSAYIDQKARQYFKANRSTTLDKSILRHNVSMQVDGFWYEDSRRIYQISRPGIERTTHLMIRLVSFPQNRFVAIESTNVDDKDWIFGCTAKDVHNNYRYCKEATSLSELSRWSGAATDYGRRKLATIHLHQIMEEEPQWTHVIVKVSPTRKPIVLNVDTNDYASRQIEVSAPSDYSFGKRVVIPETVPNSLYYELILSEFNLIHQAYLLYVEPTETCKGQYHVSAEVHVPWAQNNEYYHYFTPLKRSPMKLRLFESNPNITLGQDATEKVKITLLLDPQCTFSVSISISWYHRLAQLSRNYTPILVPYVAAILLLAARNNILNIQSTGCPSIHSALMSDSVRPYFVLVFARLAITSFMSVPFLSFLFENASWRNLELQYFVRSLLVLPTYMTALGIINIVALALLIIMVFSSQLAHRLLFRIVWRGGMGLAEKMAVGLQKVPMLVSAALICAVPLSCGAASLAAGATFYMFILSKMYEEYLEDYVYKLMAKLASRMCYMFKGKKPKEDSKQCTQKEDNSKDITNSENLKAIEFKEHSKKDEDQPDKQTNDTKSNNIQKCDSSENLIDEELSSINFHVMMFFLWMAVTVVNIPALLTWARNFKYSMVLKPDTSYYTGLVMAACSSIIWQMDSPRKNLRNYEMVSSALFIMAVLICALGPFSLSIVNYGVTFMFAIITLQQLYDVDDNIDENLLTQEPLQDKEITNIDQENEADKKDTQEENCSKKEDLNKDAGESLKTDNSSEAECGDKSSDCDKCDESRIYRVFKNLREKFSFGDNE-
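
The transcript and protein entries above can be cross-checked against each other. The following is a minimal sketch, assuming the transcript is a complete coding sequence read in frame from its first base and translated with the standard genetic code; the function names `translate` and `expected_protein_length` are illustrative and not part of the database. It shows that a 3150 bp coding sequence ending in a stop codon corresponds to a 1049 aa protein, and that the first and last codons of DPOGS209247-TA match the ends of DPOGS209247-PA (the record writes the stop as "-").

```python
# Standard genetic code (NCBI translation table 1), built in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str, stop: str = "-") -> str:
    """Translate an in-frame coding sequence; stop codons are written as `stop`."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for pos in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[pos:pos + 3]]
        residues.append(stop if aa == "*" else aa)
    return "".join(residues)


def expected_protein_length(cds_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by a CDS of `cds_bp` bases."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if includes_stop else codons


# First 18 and last 15 bases of DPOGS209247-TA, copied from the record above.
print(translate("ATGACATTTATGTTTGAA"))   # MTFMFE
print(translate("GGTGACAATGAATGA"))      # GDNE-
print(expected_protein_length(3150))     # 1049
```

To check the whole record, the full nucleotide sequence can be passed to `translate` after joining its lines and compared with the protein sequence.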