Monarch geneset OGS2.0

DPOGS205069
Transcript        DPOGS205069-TA  1491 bp
Protein           DPOGS205069-PA  496 aa
Genomic position  DPSCF300074 - 81337-86542
RNAseq coverage   96x (Rank: top 62%)
Annotation
Heliconius      HMEL012124        0.0     67.70%
Bombyx          BGIBMGA006812-TA  6e-153  56.77%
Drosophila      CG6106-PA         3e-81   37.22%
EBI UniRef50    UniRef50_Q8I6V5   6e-123  45.45%  Allantoinase n=2 Tax=Ctenocephalides felis RepID=Q8I6V5_CTEFE
NCBI RefSeq     XP_001605232.1    4e-120  47.57%  PREDICTED: hypothetical protein [Nasonia vitripennis]
NCBI nr blastp  gi|25246706       2e-122  45.45%  allantoinase [Ctenocephalides felis]
NCBI nr blastx  gi|345481840      2e-121  47.57%  PREDICTED: allantoinase-like [Nasonia vitripennis]
Group
Gene Ontology  GO:0004038  8.7e-139  allantoinase activity
               GO:0008270  8.7e-139  zinc ion binding
               GO:0000256  8.7e-139  allantoin catabolic process
               GO:0050897  8.7e-139  cobalt ion binding
               GO:0016810  1.3e-20   hydrolase activity, acting on carbon-nitrogen (but not peptide) bonds
               GO:0016787  6.5e-15   hydrolase activity
KEGG pathway  nvi:100121621  1e-119
              K01466 (E3.5.2.5, allB)  maps-> Purine metabolism
InterPro domain  [33-489]   IPR017593  8.7e-139  Allantoinase
                 [422-492]  IPR011059  1.3e-20   Metal-dependent hydrolase, composite domain
                 [85-434]   IPR006680  6.5e-15   Amidohydrolase 1
Orthology group  MCL16957  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS205069-TA
ATGGCTGCGTTTTATTATTTATTGATAACGTTTATTGGGACAGAAGTACTTGGCAGAGGATTACAGCCGGATAAAAGTGAAAACCAGTTATTTTTAAGTAAACGAGTTGTGACAGATTCTGGCGAAATTGATGGTGGAGTGCTTGTGAACGAAAATGGTATAATTGAAGGTATATTTACGAGGGACACTATAGATAAATTATTCAATGGATATGATAAGAATTTGCAGGTAATAGACGGTGGCGATTGGGCCTTAATGGCTGGTGTCGTGGATTCTCATGTACATGTTAATGAGCCAGGCCGTACATCATGGGAAGGGTTTGTGACTGCCACCAGGGCAGGAGCGGCAGGAGGCATCACCACCATTGTTGATATGCCCTTGAATTCTGTTCCACCGACCACTTCAATTGAAAATTTAAAAACTAAAGTCTCAGCTGCCAAAGAAGAGGTGTACGTAGATGTGGCATTTTGGGGAGGATTGGTTCCTGGCAATGAGGAGGAATTACAGAAGCTAGTTAAAGCGGGTGTTGTTGGTTTCAAAGGTTTTTTAATCGATAGTGGAGTGTCGGAGTTTCCTAATGTTGAAGGTGATGATTTAGATAAAATATTTACAACGCTGAATGGTTCTGACATTGCGGTAGCGTTCCACGCCGAATTACCCATTAGTGACGGCAACAACAGCAGTCTATGTGACAAATGCGAAAATCTAGATCCGGTCCTATATAGCACGTATCTATCATCTCGACCACCTCAAATGGAAATTGATGCTGCAACATTACTTGCGAAATATATTGCTAAATATGACGTCCACGTGCACGTCGTTCACGTGTCAGCTGAAGGTGTAATACCGATTTTAGAAAAAGCTAGAGAGTTTAGGATTCAAAATGGATCGAAACGTTGGAGAGGGGGTGTCACAGCTGAAACCTGTCATCATTATCTTACATTGAGCTCGGAACAAATCCCGCCAGGACGCACGGAGTATAAATGTTCACCTCCCATAAGAGACATCAACAATAAGTTGCGACTGTGGGAATATATAAAGCAGAGAAGAATTGATTTGATAGCGTCCGATCACTCCCCGTCAGTCGCTGGCCTCAAGAGTCCTAATTTCATGACCGCTTGGGGTGGTGTATCGTCCGTGCAATTCGGCTTATCTCTATTTTGGACTGAAGCAAAAGCTCGTGGTTATAGTCTTAGCACTGTCAGCCATTTCTTGTCGTCGGGACCCGCTCGGCTTGCTGGGTTACACGACAAAAAGGGGGCCTTGAAACCAGGCCTTGACGCCGACCTGGTTTTCTTTGATCCTGACGCTTCATTCGTGCTTACACCAGATAAAATATTCTACAAGAACAAGCTAAGTCCGTATATGTACAAAGTCCTAACCGGGAAAGTGATGCAAACTTACGTGAGAGGTCGTCTCGTGTTTAACGATGGTCAAGTGTATGGCAACCCACAAGGAAAGTTATTGATAAACGAAGACGAGTTATATTGA

Protein sequence:

>DPOGS205069-PA
MAAFYYLLITFIGTEVLGRGLQPDKSENQLFLSKRVVTDSGEIDGGVLVNENGIIEGIFTRDTIDKLFNGYDKNLQVIDGGDWALMAGVVDSHVHVNEPGRTSWEGFVTATRAGAAGGITTIVDMPLNSVPPTTSIENLKTKVSAAKEEVYVDVAFWGGLVPGNEEELQKLVKAGVVGFKGFLIDSGVSEFPNVEGDDLDKIFTTLNGSDIAVAFHAELPISDGNNSSLCDKCENLDPVLYSTYLSSRPPQMEIDAATLLAKYIAKYDVHVHVVHVSAEGVIPILEKAREFRIQNGSKRWRGGVTAETCHHYLTLSSEQIPPGRTEYKCSPPIRDINNKLRLWEYIKQRRIDLIASDHSPSVAGLKSPNFMTAWGGVSSVQFGLSLFWTEAKARGYSLSTVSHFLSSGPARLAGLHDKKGALKPGLDADLVFFDPDASFVLTPDKIFYKNKLSPYMYKVLTGKVMQTYVRGRLVFNDGQVYGNPQGKLLINEDELY-
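
The reported lengths are consistent with each other: 496 amino acids plus one stop codon make 497 codons, and 497 x 3 = 1491 bp. The sketch below checks that arithmetic and translates the first and last codons of DPOGS205069-TA. It assumes the standard genetic code; the `translate` helper and the short test fragments are illustrative and are not part of the record. The helper prints a stop codon as `*`, where this record uses `-`.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# Lengths as reported in the record.
transcript_bp = 1491
protein_aa = 496
assert transcript_bp == 3 * (protein_aa + 1)  # coding codons plus one stop

# First 21 and last 15 nucleotides of DPOGS205069-TA.
print(translate("ATGGCTGCGTTTTATTATTTA"))  # MAAFYYL
print(translate("GACGAGTTATATTGA"))        # DELY*
```

Both fragments match the corresponding ends of DPOGS205069-PA (`MAAFYYL...` and `...DELY-`). For the full sequence, the same call can be applied to the complete nucleotide string from the FASTA entry.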