dos-group / allocation-assistant

resource allocation for recurring parallel dataflow jobs

Repository from GitHub: https://github.com/dos-group/allocation-assistant

allocation-assistant

Automatic resource allocation for recurring distributed dataflow jobs given a user-defined runtime target.

For example, consider a recurring Spark job that runs SGD (MLlib) to find parameters for a 10 GB training dataset (20,000,000 data points, each with 20 features), using 100 iterations and a step size of 1.0. Using the allocation-assistant (d041bde, Dec 8, 2016) to allocate resources for a target runtime of 800 seconds resulted in the following allocations and runtimes:

[Figure: Example of a recurring Spark job implementing SGD]
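
This README does not describe how the allocation-assistant derives an allocation from previous runs. To make the idea of allocating for a runtime target concrete, here is a minimal Scala sketch under stated assumptions: it assumes that runtime roughly follows a + b / scaleOut, fits that model to earlier runs by least squares, and picks the smallest scale-out whose predicted runtime meets the target. The names ScaleOutSketch, Run, fit and smallestScaleOut are hypothetical, the history values are made up for illustration, and none of this is the tool's actual model or API.

```scala
// Hypothetical sketch: NOT the allocation-assistant's actual model or API.
object ScaleOutSketch {

  /** One previous execution of the recurring job (illustrative type). */
  final case class Run(scaleOut: Int, runtimeSec: Double)

  /** Fits runtime = a + b / scaleOut by ordinary least squares; returns (a, b). */
  def fit(runs: Seq[Run]): (Double, Double) = {
    require(runs.map(_.scaleOut).distinct.size >= 2,
      "need runs at two or more different scale-outs")
    val xs = runs.map(r => 1.0 / r.scaleOut) // regress on 1 / scaleOut
    val ys = runs.map(_.runtimeSec)
    val n = runs.size.toDouble
    val meanX = xs.sum / n
    val meanY = ys.sum / n
    val sxy = xs.zip(ys).map { case (x, y) => (x - meanX) * (y - meanY) }.sum
    val sxx = xs.map(x => (x - meanX) * (x - meanX)).sum
    val b = sxy / sxx
    val a = meanY - b * meanX
    (a, b)
  }

  /** Smallest scale-out in 1..maxScaleOut whose predicted runtime meets the target, if any. */
  def smallestScaleOut(runs: Seq[Run], targetSec: Double, maxScaleOut: Int): Option[Int] = {
    val (a, b) = fit(runs)
    (1 to maxScaleOut).find(n => a + b / n <= targetSec)
  }

  def main(args: Array[String]): Unit = {
    // Made-up history, for illustration only (not measurements from this project).
    val history = Seq(Run(4, 1200.0), Run(8, 700.0), Run(16, 450.0))
    println(smallestScaleOut(history, targetSec = 800.0, maxScaleOut = 32)) // prints Some(7)
  }
}
```

With this made-up history the fitted model is about 200 + 4000 / scaleOut, so 7 is the first scale-out predicted to finish within 800 seconds. A real system would also have to deal with noisy measurements and changing input sizes, which this sketch ignores.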

Compiling

Install Freamon (https://github.com/citlab/freamon):

git clone https://github.com/citlab/freamon
cd freamon
mvn install

Compile allocation-assistant:

cd ..       # where you cloned allocation-assistant
mvn package

Running

  1. Set up and start Freamon (see the Freamon README)
  2. Create your own config based on doc/cluster.conf
  3. Run ./allocation-assistent <your args>

To see the available arguments, run ./allocation-assistent --help

About

License: Apache License 2.0


Languages

Scala 99.6%, Shell 0.4%