Minituna-distributed

A distributed optimization POC for Optuna, based on a fork of the awesome Minituna toy hyperparameter optimization framework.

Core idea

Much like in minituna-multiprocess, the idea is to distribute the execution of trials across workers (in this case, different physical machines) as tasks, while keeping the study and its resources (storage, sampler, etc.) local to the client process (see the minituna-multiprocess README for the rationale). The task scheduler, client, cluster deployment and coordination primitives (pubsub) are provided by dask.
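The split between "tasks on workers" and "study in the client" can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from this repository: a standard-library ThreadPoolExecutor stands in for the dask client, parameters are sampled up front instead of being requested by the worker during the trial, and the names objective and optimize are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed


def objective(params):
    # Runs on a worker; it only sees the parameters it was handed.
    x = params["x"]
    return (x - 2.0) ** 2


def optimize(n_trials, n_workers=4, seed=0):
    # The "study" state (sampler RNG and trial records) lives only here,
    # in the client; workers never touch it.
    rng = random.Random(seed)
    trials = []
    # Stand-in for a dask client: each trial is submitted as one task.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = {}
        for trial_id in range(n_trials):
            params = {"x": rng.uniform(-10.0, 10.0)}
            futures[pool.submit(objective, params)] = (trial_id, params)
        # Results are collected and stored on the client side only.
        for future in as_completed(futures):
            trial_id, params = futures[future]
            trials.append(
                {"id": trial_id, "params": params, "value": future.result()}
            )
    return min(trials, key=lambda t: t["value"])


best = optimize(n_trials=20)
print(best["id"], best["params"], best["value"])
```

Keeping the sampler and the trial records in one process means no shared storage has to be reachable from every machine; workers only need the objective function and a way to send results back.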

Implementation

A working example was built on top of minituna_v3.py and can be found in minituna_distributed.py. Changes include:

A new implementation of Study.optimize, which is now responsible for spawning tasks and for initializing and maintaining communication with workers
A DistributedTrial class, which handles communication with the main process on the worker side
A simple communication protocol, implemented as subclasses of the Command base class
A minimalistic optimization process controller
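To make the interplay of these pieces more concrete, here is a rough sketch of how a command-based protocol of this kind could look. It is an assumption-laden illustration rather than the contents of minituna_distributed.py: threads and queue.Queue stand in for dask workers and pubsub, and the SuggestFloat and TrialFinished commands, the execute method, the state dictionary and the suggest_uniform and finish methods are all hypothetical names. Only Command and DistributedTrial are names taken from the list above.

```python
import queue
import random
import threading
from abc import ABC, abstractmethod


class Command(ABC):
    """A message sent from a worker to the client process."""

    @abstractmethod
    def execute(self, state):
        """Run against client-side state and return the reply for the worker."""


class SuggestFloat(Command):  # hypothetical command
    def __init__(self, trial_id, name, low, high):
        self.trial_id, self.name, self.low, self.high = trial_id, name, low, high

    def execute(self, state):
        # Sampling happens in the client, where the sampler lives.
        value = state["rng"].uniform(self.low, self.high)
        state["params"].setdefault(self.trial_id, {})[self.name] = value
        return value


class TrialFinished(Command):  # hypothetical command
    def __init__(self, trial_id, value):
        self.trial_id, self.value = trial_id, value

    def execute(self, state):
        state["values"][self.trial_id] = self.value
        return None


class DistributedTrial:
    """Worker-side handle: forwards every request to the client."""

    def __init__(self, trial_id, to_client):
        self.trial_id = trial_id
        self._to_client = to_client
        self._reply = queue.Queue()  # private channel for this trial's replies

    def _send(self, command):
        self._to_client.put((command, self._reply))
        return self._reply.get()  # block until the client answers

    def suggest_uniform(self, name, low, high):
        return self._send(SuggestFloat(self.trial_id, name, low, high))

    def finish(self, value):
        self._send(TrialFinished(self.trial_id, value))


def run_trial(trial):
    # Worker-side task: ask for a parameter, evaluate, report back.
    x = trial.suggest_uniform("x", -10.0, 10.0)
    trial.finish((x - 2.0) ** 2)


def optimize(n_trials, seed=0):
    state = {"rng": random.Random(seed), "params": {}, "values": {}}
    to_client = queue.Queue()
    workers = [
        threading.Thread(target=run_trial, args=(DistributedTrial(i, to_client),))
        for i in range(n_trials)
    ]
    for worker in workers:
        worker.start()
    # Controller loop: serve commands until every trial has reported a value.
    while len(state["values"]) < n_trials:
        command, reply = to_client.get()
        reply.put(command.execute(state))
    for worker in workers:
        worker.join()
    return state


state = optimize(n_trials=5)
print(state["values"])
```

Because each command carries its own execute logic, the controller loop stays a plain dispatch: adding a new message type would mean adding a new Command subclass, not touching the loop.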

All examples run out of the box thanks to the local cluster provided by dask. However, they are much cooler when running on multiple physical machines. To achieve that, deploy a small dask cluster (Raspberry Pis are fine!), instantiate a dask client and point it at your cluster's scheduler. The environment on the cluster must more or less match your local one, including the minituna-distributed module, so a standard setup.py script is provided to package the required files.
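Switching between the local cluster and a remote one comes down to how the dask client is created. A minimal sketch, assuming dask.distributed is installed on the machine that actually creates the client; the helper names scheduler_url and make_client and the example host are hypothetical, and how the client is then handed to the study is not shown here:

```python
def scheduler_url(host, port=8786):
    # 8786 is the default port of the dask scheduler.
    return f"tcp://{host}:{port}"


def make_client(host=None):
    # Imported lazily so that scheduler_url works without dask installed.
    from dask.distributed import Client

    if host is None:
        # No address given: dask starts a local cluster on this machine.
        return Client()
    # Connect to an already running scheduler on the remote cluster.
    return Client(scheduler_url(host))


print(scheduler_url("192.168.0.10"))
```

Calling make_client() with no argument gives the out-of-the-box local behaviour, while make_client("192.168.0.10") would connect to a scheduler listening at that (made-up) address.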
