Dragon: A Computation Graph Virtual Machine Based Deep Learning Framework

Compile Requirements for C++

  1. Google Protocol Buffer
  2. Python (2 or 3, 64bit)   |   Anaconda (2 or 3, 64bit)
  3. CUDA [Optional]
  4. CUDNN [Optional]
  5. OpenMPI [Optional]
  6. NCCL [Optional]

Installation

  1. Clone this repository

  2. (Optional) Download and install CUDA

    (Optional) Download and install CUDNN

    (Optional, Linux Only) Download and install NCCL

  3. (Optional) Download 3rdparty.zip and unzip it to Dragon/3rdparty (outside the source code directory)

    Win64-VS2013 (OpenBLAS / Protobuf2.6 for VS2013 / CUDNN v7 / Microsoft MPI)

    Win64-VS2015 (OpenBLAS / Protobuf2.6 for VS2015 / CUDNN v7 / Microsoft MPI)

    Linux64 (OpenMPI)

    For Windows, copy python27/35/36.lib into Dragon/3rdparty/lib, depending on your Python version.

    For Linux, install libpython-dev, libprotobuf-dev, libopenblas-dev and cuDNN yourself.

  4. Install Python Requirements

    cd Dragon/python
    pip install -r requirements.txt
  5. Configure Dragon/CMakeLists.txt

    • Select optional libraries [PYTHON3 / CUDA / CUDNN / BLAS / SSE / MPI]
    • Set the 3rdparty path (keeping the default is recommended)
    • Set Python include path & Numpy root path
    • Set CUDA compiling architectures if necessary
    • For GCC 4.8+ (but below 5.0), add -std=c++11 to CUDA_NVCC_FLAGS if nullptr is not found
    • The files under Dragon/src/protos were pre-generated with protobuf-2.6; run protoc yourself if a higher version is required
    • OpenMPI can use NCCL and our CUDA implementation at the same time; we prefer not to use NCCL (it is memory inefficient)
  6. Environment Variables

    Linux (only for OpenMPI):

    • Create dragon.conf

      sudo vim /etc/ld.so.conf.d/dragon.conf
      • Append one line with the library directory of your 3rdparty, e.g.:
        • /home/Dragon/3rdparty/lib
    • Rebuild the shared library cache

       sudo ldconfig

    Windows

    • Add the binary directory to the system environment variables, e.g.:
      • PATH=........;C:\Dragon\3rdparty\bin;
  7. Setup MPI [Optional]

    Linux:

    Windows:

    • We use Microsoft MPI, which runs well on the latest Windows 10
    • Microsoft MPI is integrated into the 3rdparty package, so you do not need to do anything
  8. Compile

    Linux:

    • Install cmake

      sudo apt-get install cmake
    • Make

      cd Dragon
      mkdir build
      cd build
      cmake ..
      make install -j16

    Windows:

    • Install cmake-gui
    • Create the directory Dragon/build
    • Configure and generate the MSVC project in Dragon/build
    • Open Dragon/build/Dragon.sln
    • Build the "INSTALL" project
  9. Deploy

    cd Dragon/python
    python setup.py install

    Hint: If you do not have permission, try as follows:

    cd Dragon/python
    python setup.py install --user

Usage

Import

import dragon

Virtual DL Frameworks

------------------- Attention -------------------

The tensorflow and theano VMs are still incomplete; prefer not to use them yet.

Currently, we recommend caffe and tiny-dragon (ops + theano.function + theano.tensor.grad + updaters).

-------------------------------------------------

import dragon.vm.theano as theano
import dragon.vm.caffe as caffe
import dragon.vm.tensorflow as tf
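
As a minimal sketch of the recommended tiny-dragon style, the example below reuses only calls that appear elsewhere in this README (dragon.ops, Tensor, theano.function); the tensor name and the op chain are illustrative placeholders.

import dragon.ops as ops
import dragon.vm.theano as theano
from dragon.core.tensor import Tensor

x = Tensor('x').Variable()        # declare a symbolic input (placeholder name)
y = ops.Relu(x)                   # build a tiny op chain
f = theano.function(outputs=y)    # compile the graph into a callable function

Gradient computation (theano.tensor.grad) and the updaters follow the same Theano-like pattern; see the tutorials below for complete training examples.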

Tutorials

[IPython Notebook](https://github.com/PhyscalX/Tutorials)

We will revise several classical examples, covering CV, NLP and RL.

Device

import dragon.config
dragon.config.EnableCPU()                              # run on the CPU
dragon.config.EnableCUDA(device_id, use_cudnn=True)    # or run on GPU `device_id`, optionally with cuDNN

Memonger

Dragon is an extremely memory-efficient framework.

It can drop intermediate results (the mirror stage) during the forward phase and share gradients during the backward phase, using about 25% and 50% of the memory of Caffe and TensorFlow, respectively.

To use it, just:

import dragon.memonger as opt

  • ShareGrads

opt.share_grads()

  • Drop

import dragon.ops as ops
y = opt.drop(ops.Relu, x)
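
Putting the two calls together, here is a hedged sketch of how they fit into graph construction; the op chain is an illustrative placeholder, and calling share_grads() before compiling the function is an assumption.

import dragon.memonger as opt
import dragon.ops as ops
import dragon.vm.theano as theano
from dragon.core.tensor import Tensor

opt.share_grads()                 # share gradient buffers during the backward phase (assumed to be called before compilation)

x = Tensor('x').Variable()
y = opt.drop(ops.Relu, x)         # the intermediate result is dropped (mirror stage) in the forward phase
f = theano.function(outputs=y)    # compile as usual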

Scope

As a graph based framework, Dragon supports various scopes.

  • NameScope
import dragon
from dragon.core.tensor import Tensor
with dragon.name_scope(prefix='conv1'):
    w = Tensor('weight').Variable()    # named as conv1/weight
    b = Tensor('bias').Variable()      # named as conv1/bias
  • DeviceScope
import dragon
import dragon.ops as ops
with dragon.device_scope(device='gpu', id=0, use_cudnn=True):
    x = ops.Add(a, b)    # use /gpu:0 and cuDNN
  • PhaseScope
import dragon
import dragon.vm.theano as theano
with dragon.phase_scope(phase='train'):
    f = theano.function(outputs=y)    # force the training phase even without gradient computation
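
As a hedged sketch, a name scope and a phase scope can be used together on the same small graph; tensor names are placeholders and only calls shown above are used.

import dragon
import dragon.ops as ops
import dragon.vm.theano as theano
from dragon.core.tensor import Tensor

with dragon.name_scope(prefix='block1'):
    a = Tensor('a').Variable()    # named as block1/a
    b = Tensor('b').Variable()    # named as block1/b
    y = ops.Add(a, b)

with dragon.phase_scope(phase='train'):
    f = theano.function(outputs=y)    # compile under the training phase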

License and Citation

Dragon is released under the BSD 2-Clause license.

Please cite Dragon in your publications if it helps your research:

@article{pan2017dragon,
  Author = {Pan, Ting},
  Journal = {arXiv preprint arXiv:1707.08265},
  Title = {Dragon: A Computation Graph Virtual Machine Based Deep Learning Framework},
  Year = {2017}
}
