MindsDB benchmarks

This repository contains various MindsDB benchmarks.

Usage examples

Run some quick benchmarks locally to check the performance: python3 run.py --modes=mindsdb_dev --platform=local --speed=fast,medium,slow
Compare your local version against sklearn (a naive and an expert implementation): python3 run.py --modes=sklearn_naive,sklearn_expert,mindsdb_dev --platform=local --speed=fast,medium,slow
Run benchmarks for the current stable release remotely (not implemented yet): python3 run.py --modes=mindsdb_prod --platform=GCP --speed=fast,medium,slow
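
If you launch these runs from a script, the command line can be assembled programmatically. The sketch below is only an illustration: build_run_command is a hypothetical helper, not part of this repository, and it uses only the --modes, --platform and --speed flags shown in the examples above.

```python
import shlex


def build_run_command(modes, speeds, platform="local"):
    """Return the argument list for one run.py invocation.

    Hypothetical helper: it only joins the values into the
    comma-separated form used by the usage examples.
    """
    return [
        "python3",
        "run.py",
        f"--modes={','.join(modes)}",
        f"--platform={platform}",
        f"--speed={','.join(speeds)}",
    ]


if __name__ == "__main__":
    cmd = build_run_command(["sklearn_naive", "mindsdb_dev"], ["fast"])
    # Print the command; pass `cmd` to subprocess.run() to execute it.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems if it is later passed to subprocess.run.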

Before contributing to this repository, please make sure that the datasets you are adding are publicly available and can be re-used.

Add your dataset as a csv in a directory datasets/{name_of_the_dataset}/data.csv.
To specify the accuracy function used to evaluate the dataset, along with other parameters, edit datasets/{name_of_the_dataset}/info.py. See this file as an example.
To add an "alternatives" benchmark for the dataset, place it in alternatives/{alternative_name}/{name_of_the_dataset}/benchmark.py. The currently supported alternatives are sklearn_expert and sklearn_naive. For an example, see this file.
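
Before opening a pull request, it can help to confirm that the files sit where the steps above expect them. The following is a minimal sketch, not part of the repository: missing_dataset_files is a hypothetical helper that checks only the paths named above and does not inspect the contents of info.py or benchmark.py.

```python
from pathlib import Path

# Alternatives listed as supported in this README.
SUPPORTED_ALTERNATIVES = ("sklearn_expert", "sklearn_naive")


def missing_dataset_files(repo_root, dataset_name, alternatives=()):
    """Return the expected files that do not exist yet for a dataset.

    Hypothetical helper: it checks file locations only.
    """
    root = Path(repo_root)
    expected = [
        root / "datasets" / dataset_name / "data.csv",
        root / "datasets" / dataset_name / "info.py",
    ]
    for alt in alternatives:
        if alt not in SUPPORTED_ALTERNATIVES:
            raise ValueError(f"unsupported alternative: {alt}")
        expected.append(root / "alternatives" / alt / dataset_name / "benchmark.py")
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    # "my_dataset" is a placeholder name used for illustration.
    for path in missing_dataset_files(".", "my_dataset", ["sklearn_naive"]):
        print(f"missing: {path}")
```

An empty result means every expected file is present.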

If you find any issues with MindsDB when running the benchmarks, please report them in the MindsDB repository.

Apache License 2.0