Monarch geneset OGS2.0

DPOGS208646
Transcript        DPOGS208646-TA   1149 bp
Protein           DPOGS208646-PA   382 aa
Genomic position  DPSCF300281 - 85465-97228
RNAseq coverage   71x (Rank: top 66%)
Annotation
Heliconius        HMEL011758         9e-39   55.41%
Bombyx            BGIBMGA007780-TA   3e-79   47.32%
Drosophila        CG2794-PA          2e-79   50.45%
EBI UniRef50      UniRef50_F2TVQ5    6e-82   58.65%   Indigoidine synthase A family protein n=1 Tax=Salpingoeca sp. ATCC 50818 RepID=F2TVQ5_SALS5
NCBI RefSeq       XP_310362.6        3e-83   57.19%   AGAP003806-PA [Anopheles gambiae str. PEST]
NCBI nr blastp    gi|224095595       1e-85   57.45%   PREDICTED: hypothetical protein [Taeniopygia guttata]
NCBI nr blastx    gi|301624738       2e-86   59.12%   PREDICTED: pseudouridine-metabolizing bifunctional protein C1861.05-like [Xenopus (Silurana) tropicalis]
Group
Gene Ontology     GO:0016798          3.1e-113   hydrolase activity, acting on glycosyl bonds
KEGG pathway
InterPro domain   [1-268] IPR007342   3.1e-113   Pseudouridine-5'-phosphate glycosidase
                  [1-271] IPR022830   9.9e-96    Indigoidine synthase A-like
Orthology group   MCL15718            Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS208646-TA
ATGCCGTATCCGAAGAACTTGGAAACTGCTTTTCAGGTGGAAGAAGTAATACGAGAAAAAGGTGCAATCCCAGCAACCATCGCAATCCTTAAAGGCCAACTAACTGTTGGCCTTTCAAAGACCCAATTGGAGTATCTTGCTAAGGCCAAAGGAGTGATCAAAGCATCAAGGCGGGATATGGCACCCGTCATGGCGAAAAAGATGGATGGTGCAACTACCGTCGCTGGCACTATCATCGCTTCTGAATTAGCTTGCATACCAGTCTTCGTTACCGGAGGAATTGGCGGGGTTCATCGCGAAGGTCATAATACGATGGATGTGTCGGCCGACCTCACTGAGCTGGGTCGCAGTCGAACCTTGGTCGTGTGCAGTGGAGTCAAATCCATCCTTGACATCGGACGCACCCTTGAGTATCTGGAAACTCAAGGAGTAACTGTTTGTTCATTCGGTGATAGTCAAGACTTCCCTGCATTCTATACGACCAGGTCAGGTTTCAAAGCCCCGTACCAAGCGAGCGACGCCACACAGGCAGCAAAGATTTTATATTCGTCACACAAATTACAACTATCCTCTGGTATTGTTGTCGCTGTTCCCATACCCGTCGAGCATGCTATGGACGAGAACAAGATCGAGGCAGCTATTAAAAGTGCGTTAGTTGACGCTAACAAACTCGGAATCCAGGGTAAAGAAGTGACCCCATTCCTCTTAGCTGCAGTGTCTAAGGCGACGGGCGGGGAGTCACTTGCTGCCAATATTGCTCTCATAAAAAATAACGCAAGAATCGGTGCCGATATAGCGGTTCATTTTCAGAAGCTTAGGAATGCTGGCGATTTTGATAAAGATTTCAATATCGGAAACACAAAGACCAGGAAATGTAAGACGGATACCAAAAGACAATACCATACGTGCGCTGAGACGGGTGGAAACTCAAACGTGGACAATGGAGACGTGCTGGTCGTTGGGGGGTCGAATGTGGACAGGACCTTTAGAATCATTGAAGGAAACGTGCGGCATACTCGTTATCGAAAGTACAATAACATGGAAGGGGAAATGAAAGCTGATTTTGATAAGATTGAGGGAAGATTTAGGCAGTACGTCGGAGAAGTTGCAGGTTTTGAAAAAAAGAAGAAAAGTGGCTACAATAATTAA

Protein sequence:

>DPOGS208646-PA
MPYPKNLETAFQVEEVIREKGAIPATIAILKGQLTVGLSKTQLEYLAKAKGVIKASRRDMAPVMAKKMDGATTVAGTIIASELACIPVFVTGGIGGVHREGHNTMDVSADLTELGRSRTLVVCSGVKSILDIGRTLEYLETQGVTVCSFGDSQDFPAFYTTRSGFKAPYQASDATQAAKILYSSHKLQLSSGIVVAVPIPVEHAMDENKIEAAIKSALVDANKLGIQGKEVTPFLLAAVSKATGGESLAANIALIKNNARIGADIAVHFQKLRNAGDFDKDFNIGNTKTRKCKTDTKRQYHTCAETGGNSNVDNGDVLVVGGSNVDRTFRIIEGNVRHTRYRKYNNMEGEMKADFDKIEGRFRQYVGEVAGFEKKKKSGYNN-