Monarch geneset OGS2.0

DPOGS205068
Transcript: DPOGS205068-TA (1431 bp)
Protein: DPOGS205068-PA (476 aa)
Genomic position: DPSCF300074 - 89516-95620
RNAseq coverage: 224x (Rank: top 44%)
Annotation
Source           Hit               E-value  Identity  Description
Heliconius       HMEL012124        1e-134   51.69%
Bombyx           BGIBMGA006812-TA  0.0      66.11%
Drosophila       CG6106-PA         1e-77    37.80%
EBI UniRef50     UniRef50_Q8I6V5   6e-116   47.69%    Allantoinase n=2 Tax=Ctenocephalides felis RepID=Q8I6V5_CTEFE
NCBI RefSeq      XP_001605232.1    7e-118   49.79%    PREDICTED: hypothetical protein [Nasonia vitripennis]
NCBI nr blastp   gi|383855192      2e-119   50.64%    PREDICTED: probable allantoinase 1-like [Megachile rotundata]
NCBI nr blastx   gi|383855192      2e-122   50.42%    PREDICTED: probable allantoinase 1-like [Megachile rotundata]
Group
Gene Ontology
  GO:0004038  1.9e-140  allantoinase activity
  GO:0008270  1.9e-140  zinc ion binding
  GO:0000256  1.9e-140  allantoin catabolic process
  GO:0050897  1.9e-140  cobalt ion binding
  GO:0016810  1.2e-19   hydrolase activity, acting on carbon-nitrogen (but not peptide) bonds
  GO:0016787  1.5e-14   hydrolase activity
KEGG pathway
  nvi:100121621  2e-117
  K01466 (E3.5.2.5, allB) maps-> Purine metabolism
InterPro domain
  [10-472]   IPR017593  1.9e-140  Allantoinase
  [404-474]  IPR011059  1.2e-19   Metal-dependent hydrolase, composite domain
  [62-416]   IPR006680  1.5e-14   Amidohydrolase 1
Orthology group: MCL16957 Insect specific

Nucleotide sequence:

>DPOGS205068-TA
ATGACGGGACAGTATAATTTGTTTTTAAGTAAACGCTTAGTAACAGAGGAAAAGATTTTCGATGGTGGGGTTTTGGTTAACGAGTTCGGAAAGATTGAAGACGTGCTTGACCGATACGATGCTGATAATTTGATAGCCGAGAGCAAGGATAAAATTACTGTAGTAGATGGTGGAGACTGTCCTCTCCTGGCAGGTGTGGTTGACTCCCATGTCCACGTCAACGAGCCCGGCCGCACCGCCTGGGAGGGTTTCCGTACAGCGACCAGTGCAGCTGCTGCTGGCGGCATAACTACCATCGTCGATAACGGCGACACTCACACACCATTTCAAATTGTCTTTCAAAACTCCATCCCTCCGACAACGACCTTGGAGAATCTTAAAATAAAAGCCAATTCTGCGAAGGAAAACGTATTCGTAGATGTTGCATTTTGGGGCGGCGTGATTGTTGGGAATCAGGATTCTCTTCGAGATTTAGTAAAAGCTGGAGTGGTGGGTTTTAAATGTTTTCTTTGTCCAAGTGGCGTTGAAGAGTTTCCTAATGTAGGAATCGAAGATTTAAATTTAGCATTTGATGCCCTTGAAGGAACCGGCTCGGTTTTGGCTTTTCACGCTGAATTTGAAGAGGAGACATCTAGCGGGAAATGTATGAAATTAGATCCTGAAGATTATAACACGTACCTAGAGTCGCGACCATCCCAAATGGAATTAAGTGCAGTCTCACTCATAACCAACTTCCTACAAAAAACGGACGTTCGAGTTCATATTGTTCACGTGTCATCAGCAAATGTGGTGCCGCTTCTGGTGAAGGCTCGTGAGGATCGTCTAGCCAAAGGCCATAATGCCTGGCGTGGAGGAGTGACCGCTGAGACATGCCACCACTATCTAACGCTGAGCGCAGATGAAGTGCCACGAGGACACAGCGAGTACAAATGTGCTCCTCCAATACGAGATGCTAATAATAAGGAAAAATTGTGGAAGTTTTTGCTAGATGATAAATTGGATATGGTTGTATCAGATCACTCGCCCTGTACCCCGGAACTCAAGTGTAGCAATAACTTAAAAGCTTGGGGTGGAATATCTTCGGTCCAGTTTGGTTTGCCATTATTTTGGACTCAAGCAAGTGCTCGTGGATTAGATTTAAGATCAATAACCAAATATCTAAGTTCTGGTCCAGCCCATCTTTGTGGGTTGCAAAATCGAAAAGGAGCACTTAAAAAAGGATTGGATGCCGATCTTATTTTCTTCGATTGCGACGCAAATTTCACTGTAACCCAAGAAATCATACGACATAAAAATAAGCTGACGCCCTATATTGGTAAAGAATTGAAAGGCATAGTTAGGAAGACCTATTTAAGAGGACATCTGATATATGACGGGGGCGATTTAATTGGTTCACCTCAAGGAGAACTGTTACTCAACGATATTAAATAA

Protein sequence:

>DPOGS205068-PA
MTGQYNLFLSKRLVTEEKIFDGGVLVNEFGKIEDVLDRYDADNLIAESKDKITVVDGGDCPLLAGVVDSHVHVNEPGRTAWEGFRTATSAAAAGGITTIVDNGDTHTPFQIVFQNSIPPTTTLENLKIKANSAKENVFVDVAFWGGVIVGNQDSLRDLVKAGVVGFKCFLCPSGVEEFPNVGIEDLNLAFDALEGTGSVLAFHAEFEEETSSGKCMKLDPEDYNTYLESRPSQMELSAVSLITNFLQKTDVRVHIVHVSSANVVPLLVKAREDRLAKGHNAWRGGVTAETCHHYLTLSADEVPRGHSEYKCAPPIRDANNKEKLWKFLLDDKLDMVVSDHSPCTPELKCSNNLKAWGGISSVQFGLPLFWTQASARGLDLRSITKYLSSGPAHLCGLQNRKGALKKGLDADLIFFDCDANFTVTQEIIRHKNKLTPYIGKELKGIVRKTYLRGHLIYDGGDLIGSPQGELLLNDIK-
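
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, not part of the original record: it assumes the standard genetic code (NCBI translation table 1) and that the trailing `-` in the protein sequence marks the stop codon. The `translate` helper is a hypothetical name introduced for illustration, and only the first seven and last four codons of DPOGS205068-TA are used to keep the example short.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons enumerated
# in T, C, A, G order at each position. '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate an in-frame coding sequence, writing stops as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 21 and last 12 nucleotides of DPOGS205068-TA, copied from the record.
first_codons = "ATGACGGGACAGTATAATTTG"
last_codons = "GATATTAAATAA"

n_term = translate(first_codons)  # compare with the start of DPOGS205068-PA
c_term = translate(last_codons)   # compare with the end of DPOGS205068-PA

# 476 residues plus one stop codon should account for the 1431 bp transcript.
length_consistent = 1431 == 3 * (476 + 1)

print(n_term, c_term, length_consistent)
```

Passing the full nucleotide sequence to `translate` would give the complete protein for comparison with DPOGS205068-PA, under the same assumptions.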