ludwig-ai / experiments

Reproducible benchmark experiments for Ludwig

experiments

Reproducible benchmark experiment scripts and results for Ludwig

Utilities for experiments are in the utils directory.

  • best_hyperopt_statistics.py reads the specified hyperopt_statistics.json file and reports the combined loss and the specified metric for the best model found by the hyperparameter search.
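
The lookup described above can be sketched as follows. This is a minimal illustration, not the repository's script: it assumes the statistics file holds a "hyperopt_results" list whose entries carry "metric_score", "parameters", and "eval_stats" keys, with the combined loss under eval_stats["combined"]["loss"] and the search goal under hyperopt_config["goal"]. The function names best_trial and summarize_best are hypothetical, and the layout should be checked against the file produced by your Ludwig version.

```python
import json
from pathlib import Path
from typing import Any


def load_statistics(path: str) -> dict[str, Any]:
    """Read a hyperopt statistics JSON file into a dictionary."""
    return json.loads(Path(path).read_text())


def best_trial(stats: dict[str, Any]) -> dict[str, Any]:
    """Pick the trial with the best metric score.

    Assumption: hyperopt_config["goal"] is "minimize" or "maximize";
    "minimize" is used when the key is missing.
    """
    goal = stats.get("hyperopt_config", {}).get("goal", "minimize")
    pick = min if goal == "minimize" else max
    return pick(stats["hyperopt_results"], key=lambda trial: trial["metric_score"])


def summarize_best(stats: dict[str, Any], output_feature: str, metric: str) -> dict[str, Any]:
    """Return the combined loss and the requested metric for the best trial.

    Assumption: eval_stats is keyed by output feature name, plus a
    "combined" entry that holds the combined loss.
    """
    trial = best_trial(stats)
    eval_stats = trial["eval_stats"]
    return {
        "combined_loss": eval_stats["combined"]["loss"],
        metric: eval_stats[output_feature][metric],
        "parameters": trial["parameters"],
    }
```

Selecting by metric_score instead of taking the first entry avoids relying on the order in which the trials were written to the file.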

Scripts and results for automl experiments are in the automl directory.

  • The heuristics subdirectory contains subdirectories for each dataset used to run extensive hyperparameter searches from which to derive automl heuristics.
  • The validation subdirectory contains subdirectories for each dataset used to validate the derived heuristics.

Each dataset subdirectory contains the following scripts and configuration files, as appropriate:

  • Training

    • Simple train validation of concat model type:
      • Script w/Configuration: train_concat_sanity_laptop.py, config_concat_sanity_laptop.yaml
    • Simple train validation of tabnet model type:
      • Script w/Configuration: train_tabnet_sanity_laptop.py, config_tabnet_sanity_laptop.yaml
    • Simple train validation of transformer model type:
      • Script w/Configuration: train_transf_sanity_laptop.py, config_transf_sanity_laptop.yaml
    • Train validation of best tabnet model configuration found in heuristics search runs:
      • Script w/Configuration: train_tabnet_reference_laptop.py, config_tabnet_reference_laptop.yaml
    • Train validation of best tabnet model configuration found in heuristics search runs using updated automatic feature type selection (if impacted):
      • Script w/Configuration: train_tabnet_reference_auto.py, config_tabnet_reference_auto.yaml
  • AutoML

    • Automatically generate configuration for hyperparameter search via create_auto_config API
      • Script: get_auto_train_config.py
      • Output for original feature type selection code: auto_config.json.orig
      • Output for updated feature type selection code: auto_config.json.update
      • Output for updated feature type selection code + automl code w/heuristics: auto_config.json.automl
    • Automatically generate and run configuration for hyperparameter search via auto_train API w/1hr time limit
      • Script: run_auto_train_1hr.py
      • Output: hyperopt_statistics.json.1hr
    • Automatically generate and run configuration for hyperparameter search via auto_train API w/2hr time limit
      • Script: run_auto_train_2hr.py
      • Output: hyperopt_statistics.json.2hr
    • Automatically generate and run configuration for hyperparameter search via auto_train API w/4hr time limit
      • Script: run_auto_train_4hr.py
      • Output: hyperopt_statistics.json.4hr
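
The AutoML steps listed above can be sketched roughly as follows. This is an illustrative outline, not the contents of get_auto_train_config.py or the run_auto_train_*.py scripts. It assumes Ludwig is installed and that create_auto_config and auto_train from ludwig.automl accept dataset, target, time_limit_s, and tune_for_memory keyword arguments, which should be verified against your Ludwig version. The helper names, the output file name, and the output directory are hypothetical.

```python
import json
from pathlib import Path

# Time limits matching the 1hr, 2hr, and 4hr runs listed above, in seconds.
TIME_LIMITS_S = {"1hr": 1 * 3600, "2hr": 2 * 3600, "4hr": 4 * 3600}


def write_auto_config(dataset, target: str, label: str = "1hr",
                      out_path: str = "auto_config.json") -> dict:
    """Generate a hyperparameter search configuration and save it as JSON.

    `dataset` is a path to a data file or a DataFrame; `target` is the
    name of the output column. Both are placeholders for illustration.
    """
    # Imported here so the module loads even where Ludwig is not installed.
    from ludwig.automl import create_auto_config

    config = create_auto_config(
        dataset=dataset,
        target=target,
        time_limit_s=TIME_LIMITS_S[label],
        tune_for_memory=False,
    )
    # default=str guards against values that are not natively JSON serializable.
    Path(out_path).write_text(json.dumps(config, indent=2, default=str))
    return config


def run_auto_train(dataset, target: str, label: str = "1hr",
                   output_directory: str = "results"):
    """Generate a configuration and run the search within the time limit."""
    from ludwig.automl import auto_train

    return auto_train(
        dataset=dataset,
        target=target,
        time_limit_s=TIME_LIMITS_S[label],
        tune_for_memory=False,
        output_directory=output_directory,
    )
```

A call such as run_auto_train("train.csv", "label", label="2hr") would correspond to the 2hr run, with "train.csv" and "label" standing in for a real dataset path and target column.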
