jmschrei / apricot

apricot implements submodular optimization for the purpose of selecting subsets of massive data sets to train machine learning models quickly. See the documentation page: https://apricot-select.readthedocs.io/en/latest/index.html

Can I use GPU to accelerate the process?

Adasunnylily opened this issue

I found that the selection process only uses the CPU and is very time-consuming. Is there any way to accelerate it, for example by using the GPU?
Thanks a lot!

Unfortunately, I didn't build in GPU support. There are several built-in algorithms to speed things up, though. Have you tried using a different optimizer, such as the approximate-lazy algorithm?

Thanks a lot! I will try later

To expand on this a little: many of the algorithms are not easily parallelizable. The lazy greedy algorithm can be orders of magnitude faster than the naive greedy algorithm, but each step involves evaluating an item and then applying some logic (either putting it back in the queue with a new value or keeping it). You could evaluate batches of items this way and discard the extra work, but it's unclear to me whether this would yield much benefit, and it would be a lot of work to implement correctly.
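
For anyone wondering why the queue logic is hard to parallelize, here is a minimal pure-Python sketch of the lazy idea next to the naive one. It is not apricot's implementation: the facility-location-style gain, the tiny similarity matrix `SIM`, and the function names are all assumptions made up for illustration. Each pop from the queue depends on the outcome of the previous one, which is the sequential part.

```python
import heapq

# Hypothetical 5x5 similarity matrix, for illustration only.
SIM = [
    [1.0, 0.8, 0.1, 0.0, 0.2],
    [0.8, 1.0, 0.2, 0.1, 0.1],
    [0.1, 0.2, 1.0, 0.7, 0.6],
    [0.0, 0.1, 0.7, 1.0, 0.5],
    [0.2, 0.1, 0.6, 0.5, 1.0],
]

def gain(sim, best, j):
    """Facility-location marginal gain of adding item j, given the best
    similarity each row already has to the selected set."""
    return sum(max(0.0, row[j] - b) for row, b in zip(sim, best))

def naive_greedy(sim, k):
    """Re-evaluate every remaining item in every round."""
    best, selected, evals = [0.0] * len(sim), [], 0
    remaining = list(range(len(sim)))
    for _ in range(k):
        top_gain, top_item = -1.0, None
        for j in remaining:
            g = gain(sim, best, j)
            evals += 1
            if g > top_gain:
                top_gain, top_item = g, j
        selected.append(top_item)
        remaining.remove(top_item)
        best = [max(b, row[top_item]) for row, b in zip(sim, best)]
    return selected, evals

def lazy_greedy(sim, k):
    """Keep stale gains in a max-heap; they are upper bounds because
    marginal gains of a submodular function never increase."""
    best, selected = [0.0] * len(sim), []
    # Entries are (-gain, item, number of selections when gain was computed).
    heap = [(-gain(sim, best, j), j, 0) for j in range(len(sim))]
    heapq.heapify(heap)
    evals = len(sim)
    while heap and len(selected) < k:
        _, j, stamp = heapq.heappop(heap)
        if stamp == len(selected):
            # Gain is up to date and still on top: keep the item.
            selected.append(j)
            best = [max(b, row[j]) for row, b in zip(sim, best)]
        else:
            # Gain is stale: re-evaluate and put the item back in the queue.
            heapq.heappush(heap, (-gain(sim, best, j), j, len(selected)))
            evals += 1
    return selected, evals

if __name__ == "__main__":
    print("naive:", naive_greedy(SIM, 3))
    print("lazy: ", lazy_greedy(SIM, 3))
```

Both functions return the selected indices and the number of gain evaluations. On a matrix this small the lazy version saves very little; the savings grow with the size of the data set, since most stale entries never need to be re-evaluated.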