python run/sgd.py <path to dataset> --project=<wandb project name> --dataset=<dataset name>
We evaluated Table-SGD on the following three datasets and compared the results with the baseline GD methods [1, 2, 3, 4]:
- MIMIC-III
- Yelp
- MovieLens-1M
We obtained each original dataset from its official site and preprocessed it, converting all data into a numerical format that the model can process easily.
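The exact preprocessing steps are not described here. As a rough illustration only, the sketch below assumes the conversion amounts to integer label encoding: each non-numeric value in a column is replaced by a small integer code. The function `to_numeric` and the sample rows are hypothetical and are not part of this repository or of any dataset listed above.

```python
from typing import Dict, List, Sequence, Tuple, Union

Value = Union[str, int, float]


def to_numeric(
    rows: Sequence[Dict[str, Value]],
) -> Tuple[List[Dict[str, float]], Dict[str, Dict[str, int]]]:
    """Map every non-numeric value to an integer code, column by column.

    Hypothetical helper: an assumed label-encoding scheme, not the
    repository's actual preprocessing.
    """
    vocab: Dict[str, Dict[str, int]] = {}
    encoded: List[Dict[str, float]] = []
    for row in rows:
        out: Dict[str, float] = {}
        for col, value in row.items():
            if isinstance(value, (int, float)):
                out[col] = float(value)  # already numeric: keep the value
            else:
                codes = vocab.setdefault(col, {})
                # Assign the next unused code the first time a value is seen.
                out[col] = float(codes.setdefault(value, len(codes)))
        encoded.append(out)
    return encoded, vocab


# Made-up rows for illustration only.
rows = [
    {"user": "u1", "genre": "drama", "rating": 4},
    {"user": "u2", "genre": "comedy", "rating": 5},
    {"user": "u1", "genre": "comedy", "rating": 3},
]
encoded, vocab = to_numeric(rows)
```

Returning the per-column vocabulary alongside the encoded rows keeps the mapping reversible, so codes can be translated back to the original values when inspecting results.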
- Arun Kumar et al.: Learning Generalized Linear Models Over Normalized Data. SIGMOD 2015.
- Maximilian Schleich et al.: Learning Linear Regression Models over Factorized Joins. SIGMOD 2016.
- Lingjiao Chen et al.: Towards Linear Algebra over Normalized Data. VLDB 2017.
- Maximilian Schleich et al.: A Layered Aggregate Engine for Analytics Workloads. SIGMOD 2019.