eXascaleInfolab / bench-vldb20_full

Imputation of Missing Values in Time Series Benchmark vldb20

Repository structure

  • Algorithms - missing blocks recovery algorithms: CDRec, STMVL, TRMF, TKCM, SPIRIT, TeNMF, GROUSE, SVDImpute, SoftImpute, SVT, ROSL, DynaMMo.
  • Datasets - different datasets and time series from different sources.
  • Testing Framework - a program to run automated suite of tests on the datasets with the algorithms mentioned above.

Prerequisites and dependencies (Linux)

  • Ubuntu 16 or higher (or an Ubuntu derivative such as Xubuntu)
  • A user account with sudo rights
  • Clone the repository and install Mono:
    $ git clone https://github.com/eXascaleInfolab/bench-vldb19.git
    $ sudo apt-get install mono-devel
  • All other prerequisites will be installed using a build script.

Build & tests

Restart the terminal window after all the dependencies are installed. Open it in the root folder of the repository.

  • Build all the algorithms and the Testing Framework using the script in the root folder (takes up to 5 minutes, depending on which prerequisites are already installed on the system):
    $ sh install_linux.sh
  • Run the benchmark:
    $ cd TestingFramework/bin/Debug/
    $ mono TestingFramework.exe
  • The test suite goes over the datasets one by one and executes all scenarios for each of them, running both the precision test and the runtime test. The Plots folder in the root of the repository will be populated with the results.
  • Remark: with the default setup, the full test suite takes a long time to run (up to 2 days, depending on the hardware) and, unless stopped early, produces up to 20 GB of output files containing all recovered data and plots.
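Given the output size mentioned in the remark above, it may be worth checking the available disk space before starting a full run. The following is a minimal sketch, not part of the repository: the helper name has_free_space is hypothetical, and it assumes a POSIX df whose mount point names contain no spaces.

```shell
# has_free_space DIR MIN_GB (hypothetical helper, not shipped with the benchmark):
# succeeds if the filesystem holding DIR has at least MIN_GB gigabytes available.
# df -Pk prints one line per filesystem, with sizes in 1024-byte blocks;
# the fourth column of that line is the available space.
has_free_space() {
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}
```

For example, running has_free_space . 20 || echo "less than 20 GB free" from the root folder of the repository prints a warning when the current filesystem cannot hold the expected output.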

Customize datasets

To add a dataset to the benchmark

  • copy the file to TestingFramework/bin/Debug/data/{name}/{name}_normal.txt
    • Requirements: at least 10 columns and at least 1'000 rows; columns are separated by a space, rows by a newline
  • add {name} to the list of datasets in TestingFramework/config.cfg
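The file requirements above can be checked before a dataset is added. The sketch below is an illustration only and is not part of the Testing Framework: the function name check_dataset is hypothetical, and it relies on awk's default field splitting, which also accepts tabs and repeated spaces and is therefore more lenient than a strict single-space separator.

```shell
# check_dataset FILE (hypothetical helper, not shipped with the benchmark):
# succeeds only if FILE has at least 1000 rows and every row has at least
# 10 whitespace-separated columns. Diagnostics are written to stderr.
check_dataset() {
    awk '
        # Stop at the first row that is too narrow; END still runs after exit.
        NF < 10 {
            printf "row %d has only %d columns, need at least 10\n", NR, NF
            bad = 1
            exit
        }
        END {
            if (bad) exit 1
            if (NR < 1000) {
                printf "only %d rows, need at least 1000\n", NR
                exit 1
            }
        }
    ' "$1" >&2
}
```

With a dataset called mydata (a made-up name), the call would be check_dataset TestingFramework/bin/Debug/data/mydata/mydata_normal.txt; a zero exit status means the file meets the size requirements.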

Customize algorithms

To exclude an algorithm from the benchmark

  • open the file TestingFramework/config.cfg
  • add an entry IgnoreAlgorithms = followed by the list of codes of the algorithms to exclude
  • the line starting with #IgnoreAlgorithms = lists the codes of all the algorithms in the benchmark
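To look up the available codes without opening the file in an editor, the relevant lines can be printed from the command line. This is a small sketch under the assumption that both entries start at the beginning of a line, as described above; the function name show_ignore_settings is hypothetical.

```shell
# show_ignore_settings CONFIG (hypothetical helper): prints the commented-out
# reference line (#IgnoreAlgorithms = ...) and, if present, the active
# IgnoreAlgorithms entry from the given config file.
show_ignore_settings() {
    grep -E '^#?IgnoreAlgorithms' "$1"
}
```

From the root folder of the repository, this would be called as show_ignore_settings TestingFramework/config.cfg.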

Prerequisites and dependencies (macOS) -- Experimental

  • It is possible to make the benchmark work on macOS with a few caveats:
    • The TRMF algorithm does not work with Octave on macOS, so it will be disabled.
    • The installation can take longer than on Linux. The longest steps are the installation of LLVM with brew and the compilation of mlpack.
  • macOS 10.13 or higher, Homebrew
  • A user account with sudo rights
  • Install the Xcode command line tools and clone the repository:
    $ xcode-select --install
    $ git clone https://github.com/eXascaleInfolab/bench-vldb19.git
  • If you're running macOS 10.14 you also have to install C/C++ headers by typing the command below and going through the installation screen:
    $ open /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg

Build & tests

  • Restart the terminal window after all the dependencies are installed. Open it in the root folder of the repository.
  • Build all the algorithms and the Testing Framework using the script in the root folder (takes up to 12 minutes, depending on which prerequisites are already installed on the system):
    $ sh install_mac.sh
  • Run the benchmark:
    $ cd TestingFramework/bin/Debug/
    $ mono TestingFramework.exe

Customize datasets and algorithms

The process is identical to Linux.
