Monarch geneset OGS2.0

DPOGS203831
Transcript         DPOGS203831-TA   3105 bp
Protein            DPOGS203831-PA   1034 aa
Genomic position   DPSCF300010 + 2522278-2528976
RNAseq coverage    94x (Rank: top 62%)
Annotation
Heliconius         HMEL006957         0.0      74.41%
Bombyx             BGIBMGA003738-TA   6e-146   59.75%
Drosophila         CG13089-PA         2e-70    48.78%
EBI UniRef50       UniRef50_E2BG09    2e-86    56.74%   Phosphatidylinositol glycan anchor biosynthesis class U protein n=8 Tax=Formicidae RepID=E2BG09_HARSA
NCBI RefSeq        XP_001648170.1     4e-88    56.06%   hypothetical protein AaeL_AAEL014183 [Aedes aegypti]
NCBI nr blastp     gi|157103886       7e-87    56.06%   hypothetical protein AaeL_AAEL014183 [Aedes aegypti]
NCBI nr blastx     gi|157103881       5e-89    56.06%   hypothetical protein AaeL_AAEL014183 [Aedes aegypti]
Group
Gene Ontology      GO:0006506   3.2e-116   GPI anchor biosynthetic process
                   GO:0016021   3.2e-116   integral to membrane
                   GO:0005789   3.2e-116   endoplasmic reticulum membrane
KEGG pathway       aag:AaeL_AAEL014183   1e-87
                   K05293 (PIGU) maps-> Glycosylphosphatidylinositol (GPI)-anchor biosynthesis
InterPro domain    [15-289]   IPR009600   3.2e-116   GPI transamidase subunit PIG-U
Orthology group    MCL11954   Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203831-TA
ATGCTTCTAAGTGAATCACAATTGAGTGAAGTTCCAGGACATGTCTTAGCGCTGTACTTGTTTAATCCTTATTCAGTCCTCAATTGTGTGGGGATGACGACAACTGTTATTCAGAATTTGACACTGGCTTTATCACTGTGGGGAGCTACAAATGGACAAAGAATCCTGGCTTGTGCTTTCATAGCTCTTGCAACTCATCAAGCCCTTTACCCCATTCTACTCATTGTTCCTATATCAATACTACTTGCTAATGTCAACAAAGGTTGTAACAAATGCTCGTATATTAGAACATTATTAGTATTTGTATTATGTTGGGGTTTCTTAATATTCATATCTGCATTTATTATGGATGGTTCATATAACTATGTATATAATACCTATGGATTCATATTAAGTGTTCCTGATCTGAAGCCAAATATTGGTCTGTTTTGGTACTTCTTTACGGAAATGTTTGAGCACTTTAGATTACTATTTGTGTGTGCATTTCAAATTAATGCACTAGCTCTCTATGTGGTGCCATTGACCCTTAGGTTCCACAAGGAGCCTGTATTACTAGCCACAGTCCTAATAGCTCTCTCAACAATATTCAGATCATACCCATGTGTGGGTGATGTCGGCTTTTACTTGGCTTTATTACCATTGTGGAAGCATCTGTTCTCATTTATGCAACAAAAGTTTATTGTTGGATGCGCCTTTATCATCACATCAGCTCTTGGACCAACTGTATGGCACCTCTGGATCTACTCTGGTTCAGCAAATGCAAACTTCTTCTTTGGAGTTACCCTTTCTTTTGCTACTGCACAAATATTTTTGATCACGGACCTACTATTTGCATATATTAAAAGGGAATTCACGCTCAAACACGGTTCCAATGAAGTGGTATTATCAAGAGTGCCAACACACTTGTTGGACTGCTACCAGGGCGGAGGCCCAATTTTAGGTGCACCGAGACGACTTGACGTTTTCTTATCACTTCTAAGAAAGTTGGAGCTGAACTCTCGTCTCGATATGAGATTGTTAAGTTCGGCTTTACTAAGAAGTCTTCGCCTTGACGGTATAGAGCAATCAGCCAATTCTGTAGAAACTGATCTCTATTTGCCATACGGAGCGTCTGCATTCCAATTTCATAGATATAAGCTGCTCATGGAGATATTTCTACCAAGTCAAGACTTATTGAATGTTAATGAGACTTTGAGCACTGTAGAAAAATGCACTCTTCACAAAATGCTTAGTTCCACAGTGCAACGTTGGGAAAGGGGCGATGAGAATGTAGTATGTCCCTTGTCAGCTGAACGTAGACATATGGAACAGAGTGCCAATAGAATAAACAGCCGATGTCCTATAGAAGACGGTGTAATCAAAACAGACTGGGGTACGATTTCACCTGGCATTCTAGTCGCTGCTCTTGCATCGTCCCTAGAAGCACAGAGAGTTGATATAACAGATATATTAGGAGCAGATATATTTAAGGATGAAGTATCTCAATCCTTAGTAGAATCTGCTAAAGAAGATTGGTACGACGAACTAGAACAATTCGATGTTAAATCTAAAAGTCTTAACACAAACACGGACATAAGCAATGTGTGGGTAGCAACTCTCGCAGGTGACTTAGCAGAGGTTGTTATTAACCAAGGCGCTAGAGTGGGAGCGTCGGCACAGAAGTTGATGGTAGGCAGCAGCAATCGTTGGAACGATACATTTATTCCTAGAACTTATTACTTATTTCCTCAAAACGCTACTCTTCCTGATTGGCATTTTACTGATGCGGAAATTTTAGCTGGCATAGACGGACTTATCATAGCTAATTACTTACCAAAGTGGGTGGAACAACGACGTTCTTTACGACTATCTCAGATAATAGAGATGTATTACTCAAATGAAGGAGTATCTTTCGATACATCTGTACGAGCTTGTAACAGACAAGCCTTGTTTGCGAATATCGTCAACGGCTCCCAGTTGTTCACAGAAACATCTAGATTTGCTCACATGTTATCACTCCA
ACAAATAACTGTTTATATTCCCAAGGAGGAAATGGAGAGAATAACAACAACCGCGGTGGGAGTGTTCATGAATTATGTCCCAAATCTTTTAAGGCGGTCTCACCAAGAATGTAAATGGAGGCCAGTTGTTGCGAACGTTGACTTAATATTAGCAACTGATGGCAGTTGGAAGGGATATGAAGTAGAACAGTTTATGTCATGGATTAGTGAAGCTATAGAAGTAGGAGCTCAGGGTAGCTCTATATCCCTCGTAAATGGAAATACTGGTGAATGGATAGTTCGTCCTACTAATTTAACAGATTTCTTTGTAATGCTTACCAACGAGACAATACAATGGCCAAATCGTCTCAACCTCCCAAATGTTATATCTACAATAATCGAGTACAGTCGTGATCAAACTCTTCAAGAGATATCTGATATGGTGAGTGCTGGTCGTAGCACAGTTGTCCTCATCGTTACTTCAGAGCGACCCTCAAATGACGAGTTGGAGCGCTCCAGGTCCTTAATGCAATCCTTGAGGCAAAGCTTTTATGACGTTTACTTTGCATACGCAGCCACAGACATGACAGAGTATCAGAATATAAACAACCAGTTCATGGATTATTCAGAGCTATTTTTAAAGATAGAATCAAATTCAGTCATTGACGTAATAAGGACAGTGGATATTCATCTAGTAAAGAATATTATCCCATTTAGAATAATCGGTCCTCAGTGTCCAGTCAATGGCACTAATTATTTTCAAACTCCATATGAGAATTATGTTCTACCACACCGTGAACAGTTCTACCGCATTCATCCGTTTTACTTAAGGCAACAGTCGCTTATTAATATACAGTTTCGGAACGATGGGCAGGGGCAGATACTGGTGTGCTTGTGGCGCGGAGCTGAAGTTTCCCGCAGCTGCCAAATGATAAAAGAGAGGGATGTTTATACATTTAATCTAACGGATCCTTGTCCTTCTCGCGAATTTTGTCCTCCAGCTCATCTTTCAGTAAAAGCTATAGCGATTGTCGCCTGCCGCACCAAGTTGGTTATTACTTCCAACATTCTGGCTTTAGATGTTTGCCTCTTTTGGGAGCCGCGGCCTATGTCAAGCCGTTTTTGA

Protein sequence:

>DPOGS203831-PA
MLLSESQLSEVPGHVLALYLFNPYSVLNCVGMTTTVIQNLTLALSLWGATNGQRILACAFIALATHQALYPILLIVPISILLANVNKGCNKCSYIRTLLVFVLCWGFLIFISAFIMDGSYNYVYNTYGFILSVPDLKPNIGLFWYFFTEMFEHFRLLFVCAFQINALALYVVPLTLRFHKEPVLLATVLIALSTIFRSYPCVGDVGFYLALLPLWKHLFSFMQQKFIVGCAFIITSALGPTVWHLWIYSGSANANFFFGVTLSFATAQIFLITDLLFAYIKREFTLKHGSNEVVLSRVPTHLLDCYQGGGPILGAPRRLDVFLSLLRKLELNSRLDMRLLSSALLRSLRLDGIEQSANSVETDLYLPYGASAFQFHRYKLLMEIFLPSQDLLNVNETLSTVEKCTLHKMLSSTVQRWERGDENVVCPLSAERRHMEQSANRINSRCPIEDGVIKTDWGTISPGILVAALASSLEAQRVDITDILGADIFKDEVSQSLVESAKEDWYDELEQFDVKSKSLNTNTDISNVWVATLAGDLAEVVINQGARVGASAQKLMVGSSNRWNDTFIPRTYYLFPQNATLPDWHFTDAEILAGIDGLIIANYLPKWVEQRRSLRLSQIIEMYYSNEGVSFDTSVRACNRQALFANIVNGSQLFTETSRFAHMLSLQQITVYIPKEEMERITTTAVGVFMNYVPNLLRRSHQECKWRPVVANVDLILATDGSWKGYEVEQFMSWISEAIEVGAQGSSISLVNGNTGEWIVRPTNLTDFFVMLTNETIQWPNRLNLPNVISTIIEYSRDQTLQEISDMVSAGRSTVVLIVTSERPSNDELERSRSLMQSLRQSFYDVYFAYAATDMTEYQNINNQFMDYSELFLKIESNSVIDVIRTVDIHLVKNIIPFRIIGPQCPVNGTNYFQTPYENYVLPHREQFYRIHPFYLRQQSLINIQFRNDGQGQILVCLWRGAEVSRSCQMIKERDVYTFNLTDPCPSREFCPPAHLSVKAIAIVACRTKLVITSNILALDVCLFWEPRPMSSRF-
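
Consistency check (illustrative sketch):

The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, assuming the standard genetic code and assuming the listed transcript consists entirely of coding sequence (it begins with ATG and ends with TGA). The names `translate` and `expected_protein_length` are hypothetical helpers introduced here, not part of any database tooling. The sketch prints a stop codon as `*`, whereas the protein record above ends with `-` at the position of the final TGA codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a coding sequence; stop codons become '*', unknown codons 'X'."""
    seq = "".join(cds.split()).upper()  # drop line breaks inside the FASTA body
    usable = len(seq) - len(seq) % 3    # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


def expected_protein_length(cds_bp: int) -> int:
    """Residue count for a CDS of cds_bp bases that includes one stop codon."""
    return cds_bp // 3 - 1


# First six codons and last seven codons of DPOGS203831-TA, copied from above.
print(translate("ATGCTTCTAAGTGAATCA"))
print(translate("CCTATGTCAAGCCGTTTTTGA"))
print(expected_protein_length(3105))
```

Under these assumptions, a 3105 bp transcript corresponds to 1035 codons, that is, 1034 residues plus the stop codon, which agrees with the 1034 aa listed for DPOGS203831-PA.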