Please see `INSTALL.org`.
When this is done, cd into `exps`, run `cp env.sh.bak env.sh`, and modify `env.sh` to define the following environment variables. Here is an example:
export SAVEDIR=<path to store experiment checkpoints>
export MUJOCO_PY_MUJOCO_PATH=~/bin/mujoco/mujoco210
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MUJOCO_PY_MUJOCO_PATH}/bin:/usr/lib/nvidia
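Before launching anything, it can help to confirm that these variables are actually visible to the process. The following is a minimal sketch in Python and is not part of the repository; the helper name `missing_env_vars` is hypothetical, and the list of required variables is simply taken from the example above.

```python
import os

# Variables that env.sh is expected to export (taken from the example above).
REQUIRED_VARS = ("SAVEDIR", "MUJOCO_PY_MUJOCO_PATH", "LD_LIBRARY_PATH")


def missing_env_vars(environ=None, required=REQUIRED_VARS):
    """Return the names in `required` that are unset or empty in `environ`."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing variables (did you run `source env.sh`?):", ", ".join(missing))
else:
    print("All required variables are set.")
```

Passing a plain dictionary as `environ` makes the check easy to try without touching the real environment.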
Experiments are launched from the `exps` folder by invoking `main.sh`. Its usage is as follows:
# run source env.sh before running main.sh
bash main.sh <cls|sbgm> <experiment name> <path to json file>
Let’s break this down from left to right:
- `cls` trains a classifier (i.e. the oracle regression model); `sbgm` trains a diffusion model.
- `<experiment name>` is the name of the experiment; its results and checkpoints are stored in `$SAVEDIR/<experiment name>/<experiment id>`. By default the experiment ID is taken from `$SLURM_JOB_ID` (because that is what I use internally), but in a non-Slurm environment this will be undefined, so the script replaces it with the Unix time. (If this is not desirable you can modify the script to do something else.)
- Experiments are defined not through a messy string of argparse arguments but via a JSON dictionary. You can see all the possible options by viewing the respective dataclass in `trainval.py`.
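To illustrate the JSON-to-dataclass idea, here is a minimal sketch. It is not the code in `trainval.py`: the class `ExampleArgs`, its fields, and the helper `load_args` are all hypothetical and exist only to show the pattern.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class ExampleArgs:
    # Hypothetical fields for illustration only; the real options live in
    # the dataclasses defined in trainval.py.
    batch_size: int = 64
    lr: float = 1e-4
    epochs: int = 100


def load_args(json_text, cls=ExampleArgs):
    """Build `cls` from a JSON dictionary, rejecting unknown keys."""
    raw = json.loads(json_text)
    known = {f.name for f in fields(cls)}
    unknown = sorted(set(raw) - known)
    if unknown:
        raise ValueError(f"Unknown option(s): {unknown}")
    return cls(**raw)


# Keys missing from the JSON fall back to the dataclass defaults.
args = load_args('{"batch_size": 128, "lr": 0.0002}')
print(args)
```

One benefit of this pattern is that a misspelled option fails loudly instead of being silently ignored.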
By default, this script will copy the code to `$SAVEDIR/<experiment name>/<experiment id>/code` and cd into that directory to run the experiment. To avoid this, simply set `RUN_LOCAL=1`.
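The directory logic described above can be restated as a short Python sketch. This is not the contents of `main.sh` (which is a shell script); the function names are hypothetical, and treating `RUN_LOCAL` as enabled only when it equals `"1"` is an assumption.

```python
import os
import time


def experiment_dir(savedir, exp_name, environ=None, now=None):
    """Return <savedir>/<exp_name>/<experiment id>.

    The ID is $SLURM_JOB_ID when set, otherwise the current Unix time.
    """
    environ = os.environ if environ is None else environ
    exp_id = environ.get("SLURM_JOB_ID")
    if not exp_id:
        exp_id = str(int(time.time() if now is None else now))
    return os.path.join(savedir, exp_name, exp_id)


def working_dir(savedir, exp_name, launch_dir, environ=None, now=None):
    """Directory the experiment runs from: the copied code unless RUN_LOCAL=1."""
    environ = os.environ if environ is None else environ
    if environ.get("RUN_LOCAL") == "1":
        return launch_dir
    return os.path.join(experiment_dir(savedir, exp_name, environ, now), "code")
```

Note that with the Unix-time fallback, two experiments with the same name started within the same second would map to the same directory.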
Due to large file sizes I have only included the pretrained checkpoints for the oracles. These are needed to run the experiments defined in `exps/json`. First download the checkpoints from here and extract their contents to the save directory defined in `env.sh` (i.e. `$SAVEDIR`).
The following experiment configurations are available in `exps/json`:
.
├── ant
│   ├── train-diffusion-cfg.json
│   ├── train-diffusion-cg.json
│   ├── train-training-oracle.json
│   └── train-validation-oracle.json
├── hopper50
│   └── train-diffusion-cfg.json
├── kitty
│   ├── train-diffusion-cfg.json
│   ├── train-diffusion-cg.json
│   ├── train-training-oracle.json
│   └── train-validation-oracle.json
└── sd
    ├── HYPERPARAMETERS.org
    ├── train-diffusion-cfg.json
    ├── train-diffusion-cg.json
    ├── train-training-oracle.json
    └── train-validation-oracle.json
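As a rough illustration of how one of these files maps onto a `main.sh` invocation, the sketch below builds the command line from a config path. Several things here are assumptions rather than facts from the repository: that oracle configs are run with `cls` and diffusion configs with `sbgm` (inferred from the filenames), that the command is run from inside `exps`, and the experiment name `ant-cfg`. The helper names are hypothetical.

```python
import shlex
from pathlib import PurePosixPath


def mode_for_config(json_path):
    """Guess the main.sh mode from a config filename (an assumption)."""
    name = PurePosixPath(json_path).name
    if name.endswith("-oracle.json"):
        return "cls"   # oracle = classifier / regression model
    if name.startswith("train-diffusion-"):
        return "sbgm"  # diffusion model
    raise ValueError(f"Cannot infer mode from {name!r}")


def main_sh_command(json_path, exp_name):
    """Return the shell command line for one experiment."""
    parts = ["bash", "main.sh", mode_for_config(json_path), exp_name, json_path]
    return " ".join(shlex.quote(p) for p in parts)


print(main_sh_command("json/ant/train-diffusion-cfg.json", "ant-cfg"))
```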
What the filenames signify:
- `train-training-oracle.json`: only concerns classifier-based guidance experiments. This is an approximate oracle trained only on the training set. However, it is also trained on the same forward distribution as the diffusion models, `q(x0, ..., xT)`. In other words, it learns a regression model `p(y|x_t;t)`, where `x_t` is the noised input at timestep `t` and `y` is the reward variable.
- `train-validation-oracle.json`: this is the validation set oracle, which is trained on both the training and validation sets. This is also the oracle used for the validation metrics.
- `train-diffusion-cfg.json`: trains a diffusion model with classifier-free guidance. The diffusion model simultaneously learns a noise predictor conditioned on `y` and one without, and these can be used to derive a conditional score function.
- `train-diffusion-cg.json`: trains a diffusion model with classifier-based guidance. This learns an unconditional score function which, when combined with the training oracle (see `train-training-oracle.json`), defines a conditional score function.
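The two guidance styles can be sketched in a few lines of plain Python. These are the formulations commonly used in the literature, not necessarily the exact parameterisation implemented in this repository; the function names and the guidance weight `w` are assumptions. The classifier-based variant follows from Bayes' rule: the gradient of `log p(x_t|y)` equals the gradient of `log p(x_t)` plus the gradient of `log p(y|x_t)`.

```python
def cfg_noise(eps_cond, eps_uncond, w):
    """Classifier-free guidance: blend conditional and unconditional noise.

    w = 0 gives the unconditional prediction, w = 1 the conditional one,
    and w > 1 extrapolates past the conditional prediction.
    """
    return [eu + w * (ec - eu) for ec, eu in zip(eps_cond, eps_uncond)]


def cg_score(score_uncond, grad_log_p_y, w):
    """Classifier-based guidance: add the oracle's gradient to the score.

    grad_log_p_y stands for d/dx_t log p(y | x_t; t) from the training oracle.
    """
    return [s + w * g for s, g in zip(score_uncond, grad_log_p_y)]
```

In both cases a larger `w` pushes samples harder towards the conditioning variable `y`, typically at the cost of sample diversity.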
TODO.
TODO.