Monarch geneset OGS2.0

DPOGS200463
Transcript: DPOGS200463-TA; 1446 bp
Protein: DPOGS200463-PA; 481 aa
Genomic position: DPSCF300260 + 224360-229156
RNAseq coverage: 260x (Rank: top 41%)
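The genomic position field packs scaffold, strand, and coordinates into one string. A minimal Python sketch of splitting it apart follows. The names `GenomicPosition` and `parse_position` are hypothetical, and the span calculation assumes the coordinates are 1-based and inclusive, which this record does not state.

```python
import re
from typing import NamedTuple


class GenomicPosition(NamedTuple):
    """Hypothetical container for a 'scaffold strand start-end' field."""
    scaffold: str
    strand: str
    start: int
    end: int

    @property
    def span(self) -> int:
        # Assumption: coordinates are 1-based and inclusive.
        return self.end - self.start + 1


# Scaffold name, then '+' or '-', then 'start-end'.
_POSITION_RE = re.compile(r"^(\S+)\s+([+-])\s+(\d+)-(\d+)$")


def parse_position(text: str) -> GenomicPosition:
    """Parse a position string such as 'DPSCF300260 + 224360-229156'."""
    match = _POSITION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised position string: {text!r}")
    scaffold, strand, start, end = match.groups()
    return GenomicPosition(scaffold, strand, int(start), int(end))


pos = parse_position("DPSCF300260 + 224360-229156")
print(pos.scaffold, pos.strand, pos.start, pos.end, pos.span)
```

Under the inclusive-coordinate assumption the locus spans 4797 bp on the scaffold, compared with a 1446 bp transcript.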
Annotation
Heliconius: HMEL013066; 0.0; 71.43%
Bombyx: BGIBMGA011188-TA; 1e-168; 70.66%
Drosophila: NitFhit-PA; 8e-119; 46.12%
EBI UniRef50: UniRef50_Q17CS4; 3e-125; 48.05%; Nitrilase, putative n=14 Tax=Coelomata RepID=Q17CS4_AEDAE
NCBI RefSeq: XP_001863190.1; 9e-130; 49.35%; nitrilase and fragile histidine triad fusion protein NitFhit [Culex quinquefasciatus]
NCBI nr blastp: gi|170054573; 2e-128; 49.35%; nitrilase and fragile histidine triad fusion protein NitFhit [Culex quinquefasciatus]
NCBI nr blastx: gi|170054573; 3e-123; 49.35%; nitrilase and fragile histidine triad fusion protein NitFhit [Culex quinquefasciatus]
Group
Gene Ontology: GO:0006807; 7.2e-87; nitrogen compound metabolic process
GO:0016810; 7.2e-87; hydrolase activity, acting on carbon-nitrogen (but not peptide) bonds
GO:0003824; 3.6e-38; catalytic activity
KEGG pathway 
InterPro domain: [21-300] IPR003010; 7.2e-87; Nitrilase/cyanide hydratase and apolipoprotein N-acyltransferase
[328-475] IPR011151; 3.6e-38; Histidine triad motif
[327-475] IPR011146; 8.7e-34; Histidine triad-like motif
[340-429] IPR001310; 5.8e-19; Histidine triad (HIT) protein
Orthology group: MCL12225 Single-copy universal gene

Nucleotide sequence:

>DPOGS200463-TA
ATGATCCGGAATTTTACGTTATCAATATTGAAGAATCTGCATTTGGAATCCTCTCAGTTCAGTACAATGGCGAGCAGAAAATTAGCTGTGTGTCAAATGACTTCAGTTGCAGATAAATCAGCAAATTTAAATGTTGTCAGTCAGTTAATAAGCGATGCTGCAAAAGATGATGTTAAGATGTTATTTTTCCCTGAATGCTGCGATTATATTTGTGAGAACAAAGACGAAACAATTAGATCGGCTGAAAATCTTTTGACGGGTGAAACTGTTAAGAAATACAGGGAATTGGCCGCTACGCACAATGTGTGGTTGTCAATGGGCGGATTACATGAAAAGGATGAAGCGAGCGTAGATAAGATATTCAATACACATATAATAATTAATGATAAAGGCGACATAGTACAAACATACAGAAAATTACACTTGTTTGATGTTGACATACCGGAGAGAAATATACGTCTGAAGGAGAGCGACTTCTGTAACCCCGGAGGGCATATAGTTGCGCCTGTTGACACACCGATTGGCAAGATTGGCCTTTCAATATGTTATGACCTTCGATTCCCCGAGCTCAGTACATCTCTAAGTATGATGAAAGCTGAAATACTAACCTATCCTTCTGCCTTTACTTATGCTACTGGCTTGGCTCATTGGCATATACTATTAAGAGCAAGGGCAATAGAGAATCAATGCTACGTGGTAGCGGCGGCTCAAACGGGGCAGCACAATGCTAAAAGACGCTCCTTCGGACATGCGCTCGTAGTGGACCCGTGGGGCGAAGTCCTAGCCGACTGCGGAGACTCCGCTCCTTGTTACAAGGTTGTCGAAATTACTGATAGATTGCAAGAAGTGAGGAAAAACATGCCCGTGTTCCAACACAGACGGCCGGATGTGTACTCCCTGTATTCTTTAAGTATCCGCAACAAACCGTTCAATGAGCCTCCGCCCCCGCCGCCCCGGACTCCGCCCCTCGCCACGACCGGGAACGTGTTCGGTCACGTATCCGTTCCGGAAACGTGCGTCTTCCACAAGTCGGAACTGACTTACGCGTTTGTCAACTTACGTTGTGTGACCCCGGGCCATGTATTGGTAGCGCCTATAAGGTTGGCAGAGAGGAATAAAGATTTGACAGACGAAGAAGCAAGTGACTTCTTTAAAACCGTGAGATTAATACAAAACCTAATGGAACGAGTTCACAATACAGAGTCGTGTACCGTCACTATACAGGACGGACCAGACGCGGGGCAAACCGTGAAGCATCTGCACTGCCATATAATGCCAAGGAAGAAAGGAGATTTCATTGAAAATGATTTGATATACTTGGAGCTAGCGAAACATGATCAGATGAGGTCAGGTCACCCAGCGAAGCCAGCCAGGAGTTTGGAAGAAATGGAAGCAGAAGCGAAATACCTCAGAGAAGAGTTGAAGAAGATGACAGAGACCAGCTAG

Protein sequence:

>DPOGS200463-PA
MIRNFTLSILKNLHLESSQFSTMASRKLAVCQMTSVADKSANLNVVSQLISDAAKDDVKMLFFPECCDYICENKDETIRSAENLLTGETVKKYRELAATHNVWLSMGGLHEKDEASVDKIFNTHIIINDKGDIVQTYRKLHLFDVDIPERNIRLKESDFCNPGGHIVAPVDTPIGKIGLSICYDLRFPELSTSLSMMKAEILTYPSAFTYATGLAHWHILLRARAIENQCYVVAAAQTGQHNAKRRSFGHALVVDPWGEVLADCGDSAPCYKVVEITDRLQEVRKNMPVFQHRRPDVYSLYSLSIRNKPFNEPPPPPPRTPPLATTGNVFGHVSVPETCVFHKSELTYAFVNLRCVTPGHVLVAPIRLAERNKDLTDEEASDFFKTVRLIQNLMERVHNTESCTVTIQDGPDAGQTVKHLHCHIMPRKKGDFIENDLIYLELAKHDQMRSGHPAKPARSLEEMEAEAKYLREELKKMTETS-
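The lengths in this record are consistent with one another: 1446 bp is 482 codons, that is, 481 amino acids plus one stop codon, which the protein sequence writes as a trailing `-`. A minimal Python sketch of that check follows, assuming the standard genetic code (the record does not name a translation table). The helper `translate` is hypothetical, and only the first six and last four codons of DPOGS200463-TA are used to keep the example short.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
# (assumption: the record uses the standard table).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence in frame 1, writing stops as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = (CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    return "".join(stop_symbol if aa == "*" else aa for aa in residues)


# First six and last four codons copied from the DPOGS200463-TA sequence.
head = translate("ATGATCCGGAATTTTACG")
tail = translate("GAGACCAGCTAG")
print(head, tail)

# Length check: 481 residues plus one stop codon, three bases each.
assert 3 * (481 + 1) == 1446
```

Passing the full nucleotide sequence to `translate` and comparing the result with the DPOGS200463-PA entry extends the same check to the whole record.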