Debadri3 / benchmarks

Public dataset benchmarks used for measuring the performance of MindsDB.

Home Page:http://mindsdb.com

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

MindsDB benchmarks

This repository contains various MindsDB benchmarks.

Usage examples

  • Run some quick benchmarks locally to check the performance: python3 run.py --modes=mindsdb_dev --platform=local --speed=fast,medium,slow
  • Compare your local version against sklearn (a naive and an expert implementation): python3 run.py --modes=sklearn_naive,sklearn_expert,mindsdb_dev --platform=local --speed=fast,medium,slow
  • Run benchmarks for current stable remotely: python3 run.py --modes=mindsdb_prod --platform=GCP --speed=fast,medium,slow [Implementation not done yet]

Contributions

Before contributing to this repository please make sure that the dataset you are adding are publicly available and can be re-used.

Adding a dataset

  • Add your dataset as a csv in a directory datasets/{name_of_the_dataset}/data.csv.
  • To specify an accuracy function to evaluate it with and other parameters edit datasets/{name_of_the_dataset}/info.py, see this file as an example.
  • To add an "alternatives" benchmark for the dataset add it to alternatives/{alternative_name}/{name_of_the_dataset}/benchmark.py. Currently the supported alternatives are sklearn_expert and sklearn_naive. For an example see this file.

Issues

If you found any issues with MindsDB when executing the benchmarks, please make sure you report them in the MindsDB repository.

About

Public dataset benchmarks used for measuring the performance of MindsDB.

http://mindsdb.com

License:Apache License 2.0


Languages

Language:Python 100.0%