rdk / p2rank

P2Rank: Protein-ligand binding site prediction tool based on machine learning. Stand-alone command line program / Java library for predicting ligand binding pockets from protein structure.

Home Page: https://rdk.github.io/p2rank/

Thread count limitation

skodapetr opened this issue

Using -crossval_threads 1 -rf_threads 4, I would expect P2Rank to use at most 4 threads. That is not the case: on my 8-core machine, CPU utilization by P2Rank sometimes reaches 100%, meaning that it uses all 8 cores, i.e. 8 threads.

Is there a simple way to cap the maximum number of threads at a given value? This is useful when running P2Rank in the background on a workstation, or on a server that is not entirely dedicated to P2Rank.

commented

There is also a parameter -threads: this is the one that controls how many proteins/dataset items are processed in parallel. -rf_threads only affects the training of the Random Forest, and it may need to be set to a lower value because of memory limitations.
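To make the distinction between the options concrete, here is a minimal Python sketch that assembles a P2Rank argument list with each thread option set explicitly. The option names -threads, -rf_threads and -crossval_threads come from this thread. The helper `build_prank_args`, the `prank` launcher name, the `crossval` subcommand and the dataset file name are assumptions for illustration only, so check them against your own installation.

```python
from typing import List, Optional


def build_prank_args(
    base: List[str],
    threads: Optional[int] = None,
    rf_threads: Optional[int] = None,
    crossval_threads: Optional[int] = None,
) -> List[str]:
    """Append the thread-related options discussed in this thread to a base command.

    Hypothetical helper: it only builds the argument list, it does not run anything.
    """
    args = list(base)
    for flag, value in (
        ("-threads", threads),                    # dataset items processed in parallel
        ("-rf_threads", rf_threads),              # Random Forest training only
        ("-crossval_threads", crossval_threads),  # option named in the original question
    ):
        if value is None:
            continue  # leave the option at the program's default
        if value < 1:
            raise ValueError(f"{flag} must be >= 1, got {value}")
        args += [flag, str(value)]
    return args


# Placeholder invocation: launcher, subcommand and dataset file are assumptions.
cmd = build_prank_args(
    ["prank", "crossval", "my_dataset.ds"],
    threads=4,
    rf_threads=4,
    crossval_threads=1,
)
print(" ".join(cmd))
```

Setting all three options explicitly avoids relying on defaults that may scale with the number of cores on the machine.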

commented

In any case, P2Rank will make a best effort to limit parallelism to the given value, but this is not guaranteed (e.g. because some library might use the common global thread pool). For a guaranteed cap on CPU usage, you will have to limit the CPUs available to the P2Rank process at the OS level.
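One way to apply such an OS-level limit on Linux is to restrict the CPU affinity of the launched process. The sketch below is a minimal example under that assumption: `run_with_cpu_limit` is a hypothetical helper, it relies on `os.sched_setaffinity` (available on Linux only), and the child command shown is a stand-in that simply reports the affinity it sees. Replace it with your actual P2Rank command line.

```python
import os
import subprocess
import sys
from typing import Iterable, List


def run_with_cpu_limit(cmd: List[str], cpus: Iterable[int]) -> subprocess.CompletedProcess:
    """Run cmd with its CPU affinity restricted to the given core indices (Linux only)."""
    cpu_set = set(cpus)
    allowed = os.sched_getaffinity(0)  # cores this Python process may use
    if not cpu_set or not cpu_set <= allowed:
        raise ValueError(f"cpus must be a non-empty subset of {sorted(allowed)}")

    def pin() -> None:
        # Runs in the child after fork and before exec; pid 0 means "this process".
        # The affinity mask is inherited by every thread the child later starts.
        os.sched_setaffinity(0, cpu_set)

    return subprocess.run(cmd, preexec_fn=pin, capture_output=True, text=True)


# Stand-in child process: prints how many cores it is allowed to run on.
report = [sys.executable, "-c", "import os; print(len(os.sched_getaffinity(0)))"]
first_cores = sorted(os.sched_getaffinity(0))[:4]  # at most 4 cores
result = run_with_cpu_limit(report, first_cores)
print(result.stdout.strip())
```

Because the limit is enforced by the kernel scheduler, it holds no matter how many threads the process creates internally. The same effect can usually be had from a shell with `taskset`. Whether P2Rank's own thread pools also shrink to match the restricted core count has not been verified here.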

Thanks. It looks like -threads does a good job of limiting CPU usage to roughly the given number of threads. On an 8-core machine, utilization is mostly around 50%, sometimes reaching 70% for brief periods.