calvinmccarter / maddness-old

Code for ICML2020 submission

This page describes how to reproduce the experimental results reported in our paper.

Note that this page, along with the clean, easy-to-use version of our code, is still under construction; see https://smarturl.it/Maddness for the latest version.

Install Dependencies

To run the experiments, you will first need to obtain the following tools / libraries and datasets.

C++ Code

  • Xcode, to build and run the C++ timing benchmarks (this is the "official" setup; it is much better tested and is the only one that works at the moment)
  • Bazel, Google's open-source build system (support coming soon)

Python Code

  • Joblib - for caching function output
  • Scikit-learn - for k-means
  • Kmc2 - for k-means seeding
  • Pandas - for storing results and reading in data
  • Seaborn - for plotting, if you want to reproduce our figures
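Before running anything, it can help to confirm that the Python packages above are importable. The following is a minimal sketch, not part of the repository; the module names are the usual import names for these packages (scikit-learn is imported as sklearn), and the helper name missing_modules is made up for illustration.

```python
import importlib.util

# Import names for the packages listed above. scikit-learn is imported
# as "sklearn"; the remaining names are assumed to match the package names.
REQUIRED_MODULES = ["joblib", "sklearn", "kmc2", "pandas", "seaborn"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the entries of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All Python dependencies found.")
```

Any module reported as missing can then be installed with your package manager of choice.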

Datasets

The activations and weights from the CIFAR-10 and CIFAR-100 datasets are already included under python/assets.

View Existing Results

All results are in python/results/amm. The timing results are in the subdirectory timing.
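To see what is available, you can simply list the files under the results directory. The sketch below makes no assumption about file names or formats; it only assumes it is run from the repository root, and list_result_files is a hypothetical helper, not something the repository provides.

```python
from pathlib import Path


def list_result_files(results_dir="python/results/amm"):
    """Return every file under results_dir as a sorted list of relative paths."""
    root = Path(results_dir)
    if not root.is_dir():
        # Wrong working directory, or the results have not been checked out.
        return []
    return sorted(p.relative_to(root) for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    for path in list_result_files():
        print(path)
```

Files in the timing subdirectory show up with a timing/ prefix.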

Reproduce Timing / Throughput results

The C++ code is driven by Catch run via Xcode. You can just open Bolt.xcodeproj (Maddness was built as a fork of Bolt) and press run with the appropriate arguments. For different experiments, the arguments are:

  • f() speed for various methods: [scan][amm]~[old]
  • g() speed for various methods: [encode][amm]~[old]
  • h() speed for various methods (not reported in the paper, but interesting): [lut][amm]~[old]
  • Overall AMM speed: [matmul][amm]~[old]
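The arguments above are Catch tag expressions. As we understand Catch's test-spec syntax, consecutive bracketed tags must all be present on a test, and a tag prefixed with ~ excludes tests that carry it. The sketch below emulates only that subset in Python to make the selection rule concrete; it is not how Catch is implemented, and the matches function is purely illustrative.

```python
import re


def matches(tag_expr, test_tags):
    """Check a test's tags against a simplified Catch-style tag expression.

    Handles only the forms used above: consecutive [tag] terms are ANDed,
    and ~[tag] excludes tests carrying that tag. Commas (OR), wildcards,
    and test-name matching are deliberately not handled.
    """
    required, excluded = set(), set()
    for negation, tag in re.findall(r"(~?)\[([^\]]+)\]", tag_expr):
        (excluded if negation else required).add(tag)
    tags = set(test_tags)
    return required <= tags and not (excluded & tags)


# A test tagged [scan] and [amm] is selected ...
print(matches("[scan][amm]~[old]", ["scan", "amm"]))         # True
# ... but not if it is also tagged [old].
print(matches("[scan][amm]~[old]", ["scan", "amm", "old"]))  # False
```

So [scan][amm]~[old] selects the scan benchmarks for the AMM methods while skipping any test marked as old.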

We highly recommend running these benchmarks when the machine is otherwise idle. Also note that we haven't yet automated dumping the C++ output into the appropriate files, so you'll have to manually paste the output into the corresponding file in python/results/amm/timing.
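Until that is automated, a small helper can make the copy step less error-prone. This is a minimal sketch under our own assumptions: append_timing_output is a hypothetical function, and the destination file name is left to you, since the file that corresponds to each experiment depends on what already exists in the timing directory.

```python
from pathlib import Path


def append_timing_output(text, dest):
    """Append benchmark output to dest, creating parent directories if needed."""
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not text.endswith("\n"):
        text += "\n"  # keep successive runs on separate lines
    with dest.open("a") as f:
        f.write(text)
    return dest


# Example (the destination file name below is a placeholder, not a real file):
# append_timing_output(copied_output, "python/results/amm/timing/<file>.txt")
```

Appending rather than overwriting keeps earlier runs in place, which matches pasting new output at the end of an existing file.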

Coming soon: Working Bazel build for all the code and wrapper shell scripts to run and store the output of each experiment.

Reproduce Accuracy Results

From the python directory, run python -m python.amm_main. This runs all the methods shown in the body of the paper (and some others that run quickly) on CIFAR-10, CIFAR-100, Caltech 101 (using both the Sobel and Gaussian filters), and the datasets from the UCR Time Series Archive.

Reproduce Plots

From the python directory, run python -m python.amm_figs2. You can uncomment different lines in main to only produce subsets of the plots.

Other notes

Our method is called Mithral in the source code, not Maddness.
