Introduction

Collective Knowledge (CK) is an open-source framework to speed up collaborative and reproducible R&D with reusable, customizable and portable components. Trusted by a growing number of academic and industrial partners, CK helps to automate artifact evaluation and accelerate complex experiments such as benchmarking, co-design and optimization of the whole SW/HW stack for AI/ML. Give it a try!

The CK framework is based on agile, DevOps and Wikipedia principles and helps users to:

decompose complex software projects with ad-hoc scripts into portable, customizable and reusable components (packages, software detection plugins, modules and workflows) with a unified Python JSON API and an integrated package manager;
organize all their local components (artifacts) in open CK repositories and continuously exchange them with the community via GitHub, GitLab, BitBucket, BitTorrent, ACM DL, etc. to encourage artifact reuse;
collaboratively improve all shared components and their JSON descriptions similar to Wikipedia while always keeping APIs backward compatible similar to Java;
quickly prototype research ideas from shared components (such as customizable, multi-objective, machine-learning based and input-aware autotuning);
enable a universal virtual CK environment where multiple versions of different software can easily co-exist;
crowdsource different experiments across diverse data sets, models and platforms provided by volunteers (such as crowd-benchmarking deep learning);
convert existing benchmarks into portable and customizable CK workflows adaptable to any platform with Linux, Windows, MacOS and Android (see ACM ReQuEST initiative);
unify access to predictive analytics (TensorFlow, TFLite, MXNet, Caffe, Caffe2, CNTK, scikit-learn, R, DNN, etc.) via a unified CK JSON API and CK web services;
enable reproducible, interactive and "live" articles as shown in this interactive CK report with the Raspberry Pi Foundation;
automate and unify Artifact Evaluation at systems, ML and AI conferences;
support open, reproducible and multi-objective co-design competitions of the whole SW/HW stack for emerging workloads such as AI (see ACM ReQuEST tournaments).
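
The "unified Python JSON API" mentioned in the list above is usually described as a dictionary-in, dictionary-out convention. The sketch below only imitates that convention so that it runs without CK installed: `fake_access` and its canned reply are hypothetical stand-ins, and the assumption is that a real call (commonly written as `ck.kernel.access`) takes keys such as `action` and `module_uoa` and returns a dictionary whose `return` key is 0 on success and greater than 0, together with an `error` message, on failure.

```python
import json


def fake_access(request):
    """Hypothetical stand-in for a CK call: dictionary in, dictionary out."""
    if "action" not in request:
        # Assumed error convention: non-zero 'return' plus an 'error' string
        return {"return": 1, "error": "action is not defined"}
    # Echo the request back so the round trip is visible
    return {"return": 0, "echo": dict(request)}


def call(request):
    """Raise on failure so callers do not have to check 'return' themselves."""
    reply = fake_access(request)
    if reply["return"] > 0:
        raise RuntimeError(reply["error"])
    return reply


ok = call({"action": "detect", "module_uoa": "platform"})
print(json.dumps(ok, sort_keys=True))
```

Keeping both input and output as plain dictionaries means the same request can be serialized to JSON and sent to a web service without changing the calling code.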

Please check out the latest ACM ReQuEST-ASPLOS'18 report on the results of the 1st CK-powered competition on co-designing Pareto-efficient SW/HW stacks for deep learning, the CK motivation slides, and CK use cases from our partners, including reproducible ACM tournaments on SW/HW co-design of emerging workloads and artifact sharing via the ACM Digital Library.

Join the CK consortium to influence long-term CK development and the standardization of APIs and meta descriptions of all shared CK workflows and components!

CK resources

Minimal installation

The minimal installation requires:

Python 2.7 or 3.3+ (the limitation is mainly due to unit tests)
Git command line client
wget (Linux/MacOS)
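
The requirements above can be checked before installing. The following is a minimal sketch using only the Python standard library; the helper names `python_ok` and `missing_tools` are made up for this example and are not part of CK. Note that `shutil.which` is only available in Python 3.3+, so this particular check will not run under Python 2.7.

```python
import shutil
import sys


def python_ok(version=None):
    """Return True for Python 2.7 or 3.3+, the versions listed above."""
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor == (2, 7) or major_minor >= (3, 3)


def missing_tools(tools=("git", "wget")):
    """Return the tools from the list that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


print("Python version supported:", python_ok())
print("Missing tools:", missing_tools())
```

On Windows, where wget is not listed as a requirement, `missing_tools(("git",))` is the more relevant call.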

Linux/MacOS

You can install CK in your local user space as follows:

$ git clone http://github.com/ctuning/ck
$ export PATH=$PWD/ck/bin:$PATH
$ export PYTHONPATH=$PWD/ck:$PYTHONPATH

You can also install CK via pip with sudo to avoid setting up environment variables yourself:

$ sudo pip install ck

Finally, starting from Ubuntu 18.10, you can install it via apt:

$ sudo apt install python-ck
 or
$ sudo apt install python3-ck

Windows

First you need to download and install a few dependencies from the following sites:

Git: https://git-for-windows.github.io
Minimal Python: https://www.python.org/downloads/windows

You can then install CK as follows:

 $ pip install ck
 or
 $ git clone https://github.com/ctuning/ck.git ck-master
 $ set PATH={CURRENT PATH}\ck-master\bin;%PATH%
 $ set PYTHONPATH={CURRENT PATH}\ck-master;%PYTHONPATH%
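
For a git-based installation, both the Linux/MacOS and the Windows instructions do the same two things: prepend the `bin` directory of the clone to `PATH` and prepend the clone itself to `PYTHONPATH`. As a rough, platform-neutral sketch of that step (the function name `ck_env` and the example path are assumptions made for illustration), the same environment can be composed in Python and passed to a child process:

```python
import os


def ck_env(ck_root, environ=None):
    """Return a copy of the environment with a CK clone added to PATH and PYTHONPATH."""
    env = dict(os.environ if environ is None else environ)

    def prepend(var, entry):
        # os.pathsep is ':' on Linux/MacOS and ';' on Windows
        old = env.get(var, "")
        env[var] = entry + (os.pathsep + old if old else "")

    prepend("PATH", os.path.join(ck_root, "bin"))
    prepend("PYTHONPATH", ck_root)
    return env


example = ck_env("/opt/ck", {"PATH": "/usr/bin"})
print(example["PATH"])
print(example["PYTHONPATH"])
```

The returned dictionary can be given to `subprocess.run(..., env=...)`, which leaves the environment of the current shell untouched.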

Customization and troubleshooting

You can find troubleshooting notes or other ways to install CK such as via pip here. You can find how to customize your CK installation here.

Getting a first feel for portable and customizable workflows for collaborative benchmarking

Test ck:

$ ck version

Get shared ck-tensorflow repo with all dependencies:

$ ck pull repo:ck-tensorflow

List CK repos:

$ ck ls repo | sort

Find where CK repos are installed on your machine:

$ ck where repo:ck-tensorflow

Detect your platform properties via extensible CK plugins as follows (needed to unify benchmarking across diverse platforms with Linux, Windows, MacOS and Android):

$ ck detect platform

Now detect available compilers on your machine and register virtual environments in the CK:

$ ck detect soft --tags=compiler,gcc
$ ck detect soft --tags=compiler,llvm
$ ck detect soft --tags=compiler,icc
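
The three detection commands above differ only in the last tag, so they are easy to generate from a script. The sketch below builds the argument lists without executing anything; `ck_argv` is a hypothetical helper written for this example, not a CK function, and it only reproduces the `--key=value` form shown in the commands above.

```python
import shlex


def ck_argv(action, module, **options):
    """Build an argument list such as ['ck', 'detect', 'soft', '--tags=compiler,gcc']."""
    argv = ["ck", action, module]
    for key, value in sorted(options.items()):
        argv.append("--{}={}".format(key, value))
    return argv


commands = [
    ck_argv("detect", "soft", tags="compiler," + name)
    for name in ("gcc", "llvm", "icc")
]

# Print the commands in a form that can be pasted into a shell
for argv in commands:
    print(" ".join(shlex.quote(arg) for arg in argv))
```

To actually run a command, each list can be passed to `subprocess.run(argv, check=True)` on a machine where `ck` is on the `PATH`.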

See virtual environments in the CK:

$ ck show env

We recommend setting up CK to install new packages inside CK virtual env entries:

$ ck set kernel var.install_to_env=yes

Now install CPU-version of TensorFlow via CK packages:

$ ck install package --tags=lib,tensorflow,vcpu,vprebuilt

Check that it is installed correctly:

$ ck show env --tags=lib,tensorflow

You can find a path to a given entry (with TF installation) as follows:

$ ck find env:{env UID from above list}

Run CK virtual environment and test TF:

$ ck virtual env --tags=lib,tensorflow
$ ipython
> import tensorflow as tf

Run CK classification workflow example using installed TF:

$ ck run program:tensorflow --cmd_key=classify

Now you can try a more complex example to build Caffe with CUDA support and run classification. Note that CK should automatically detect your CUDA compilers, libraries and other deps or install missing packages:

$ ck pull repo --url=https://github.com/dividiti/ck-caffe
$ ck install package:lib-caffe-bvlc-master-cuda-universal
$ ck run program:caffe --cmd_key=classify

You can see how to install Caffe for Linux, MacOS, Windows and Android via CK here.

Finally, compile, run, benchmark and crowd-tune a C program (see shared optimization cases in http://cKnowledge.org/repo):

$ ck pull repo:ck-crowdtuning

$ ck ls program
$ ck ls dataset

$ ck compile program:cbench-automotive-susan --speed
$ ck run program:cbench-automotive-susan

$ ck benchmark program:cbench-automotive-susan

$ ck crowdtune program:cbench-automotive-susan

You can also quickly add your own program/workflow using the provided templates as follows:

$ ck add program:my-new-program

When CK asks you to select a template, please choose the C program "Hello world" template. You can then immediately compile and run your C program as follows:

$ ck compile program:my-new-program --speed
$ ck run program:my-new-program
$ ck run program:my-new-program --env.CK_VAR1=222
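
The last command passes `CK_VAR1=222` to the run. Assuming that `--env.CK_VAR1=222` makes `CK_VAR1` visible to the program as an ordinary environment variable, the receiving side needs to read it and fall back to a default when it is absent. The template itself is a C program; the sketch below shows the same logic in Python, and `read_ck_var` is a name invented for this example.

```python
import os


def read_ck_var(name="CK_VAR1", default=0, environ=None):
    """Read an integer setting from the environment, or return the default."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    return int(raw)


# Simulate a run with and without the variable set
print(read_ck_var(environ={"CK_VAR1": "222"}))
print(read_ck_var(environ={}))
```

Passing the environment in as a parameter keeps the function easy to test without modifying the real process environment.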

Find and reuse other shared CK workflows and artifacts:

Further details:

Trying CK using Docker image

You can try CK using the following Docker image:

 $ (sudo) docker run -it ctuning/ck

Note that we added Docker automation to CK to help evaluate artifacts at the conferences, share interactive and reproducible articles, crowdsource experiments and so on.

For example, you can participate in GCC or LLVM crowd-tuning on your machine simply as follows:

 $ (sudo) docker run ck-crowdtune-gcc
 $ (sudo) docker run ck-crowdtune-llvm

You can then browse top shared optimization results on the live CK scoreboard: http://cKnowledge.org/repo

Open ACM ReQuEST tournaments are now using our approach and technology to co-design efficient SW/HW stack for deep learning and other emerging workloads: http://cKnowledge.org/request

You can also download and view one of our CK-based interactive and reproducible articles as follows:

 $ ck pull repo:ck-docker
 $ ck run docker:ck-interactive-article --browser (--sudo)

See the list of other CK-related Docker images here.

Note, however, that the main idea behind CK is to let the community collaboratively improve common experimental workflows while making them adaptable to the latest environments and hardware, and gradually fixing reproducibility issues as described here!

Citing CK (BibTeX)

@inproceedings{ck-date16,
    title = {{Collective Knowledge}: towards {R\&D} sustainability},
    author = {Fursin, Grigori and Lokhmotov, Anton and Plowman, Ed},
    booktitle = {Proceedings of the Conference on Design, Automation and Test in Europe (DATE'16)},
    year = {2016},
    month = {March},
    url = {https://www.researchgate.net/publication/304010295_Collective_Knowledge_Towards_RD_Sustainability}
}

@article{cm:29db2248aba45e59:c4b24bff57f4ad07,
    title = {{A Collective Knowledge workflow for collaborative research into multi-objective autotuning and machine learning techniques}},
    author = {Fursin, Grigori and Lokhmotov, Anton and Savenko, Dmitry and Upton, Eben},
    journal = {ArXiv e-prints},
    archivePrefix = {arXiv},
    eprint = {1801.08024},
    primaryClass = {cs.CY},
    keywords = {Computer Science - Computers and Society, Computer Science - Software Engineering},
    year = {2018},
    month = {January},
    url = {https://arxiv.org/abs/1801.08024},
    adsurl = {http://adsabs.harvard.edu/abs/2018arXiv180108024F}
}

Some ideas were also originally presented in this 2009 paper.

Discussions/questions/comments

Slack channel: https://collective-knowledge.slack.com (to get an invite, please send an email to admin@cTuning.org with the subject "invitation to the CK Slack channel")
Mailing list about CK, common experimental workflows and artifact/workflows sharing, customization and reuse: http://groups.google.com/group/collective-knowledge
Mailing list related to collaborative optimization and co-design of efficient SW/HW stack for emerging workloads: http://groups.google.com/group/ctuning-discussions
Public wiki with CK-powered open challenges in computer engineering: https://github.com/ctuning/ck/wiki/Research-and-development-challenges

CK authors

Grigori Fursin, cTuning foundation / dividiti
Anton Lokhmotov, dividiti

License

Permissive 3-clause BSD license. (See LICENSE.txt for more details).

Acknowledgments

CK development is coordinated by the cTuning foundation (non-profit research organization) and dividiti. We would like to thank the TETRACOM 609491 Coordination Action for initial funding and all our partners for continuing support. We are also extremely grateful to all volunteers for their valuable feedback and contributions.
