andnp / PyExpUtils

Experiment utility code, specifically designed for use with Compute Canada.

Switch to tall format data instead of wide

andnp opened this issue

This will take some work and will require extensive testing.

I need to save a mapping from hyper_id -> hyper_settings to disk. Every row of the results dataframe will then store the hyper_id instead of the full hyper_settings.

This will allow saving multiple metrics per row and should radically reduce storage costs, since each set of hyper_settings is written once rather than repeated on every row. It should also be easier on the cluster.
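
A rough sketch of what the tall layout could look like, using only the standard library. The names `hyper_id`, `to_tall`, and the `(settings, seed, metrics)` input shape are made up for illustration and are not part of the current PyExpUtils API. Hashing the sorted JSON of the settings is just one possible id scheme, assumed here because it stays stable under key reordering.

```python
import hashlib
import json


def hyper_id(settings: dict) -> str:
    """Hypothetical id scheme: short hash of the settings as sorted JSON."""
    blob = json.dumps(settings, sort_keys=True)
    return hashlib.sha1(blob.encode("utf-8")).hexdigest()[:12]


def to_tall(runs):
    """Convert runs into (id -> settings mapping, list of tall rows).

    `runs` is an iterable of (settings, seed, metrics), where `metrics`
    maps a metric name to a list with one value per step. All metrics
    of a run are assumed to have the same length.
    """
    id_to_settings = {}
    rows = []
    for settings, seed, metrics in runs:
        hid = hyper_id(settings)
        id_to_settings[hid] = settings  # stored once, not once per row
        n_steps = len(next(iter(metrics.values())))
        for step in range(n_steps):
            row = {"hyper_id": hid, "seed": seed, "step": step}
            for name, values in metrics.items():
                row[name] = values[step]  # one column per metric
            rows.append(row)
    return id_to_settings, rows


runs = [
    ({"alpha": 0.1, "eps": 0.01}, 0, {"return": [1.0, 2.0], "loss": [0.5, 0.25]}),
    ({"eps": 0.01, "alpha": 0.1}, 1, {"return": [3.0, 4.0], "loss": [0.1, 0.2]}),
]
id_to_settings, rows = to_tall(runs)
```

The mapping could then be dumped with `json.dump` next to the results file. One caveat with a content hash: adding a new hyperparameter changes every id, so the "number of hypers changes" question below would still need its own answer.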

Need to think about:

  • What happens when the number of hypers changes?
  • How do we save throughout the experiment? All at once at the end, as in the current approach?
  • When we downsample, we need to keep track of the current step number. With subsampling, this is easy. With window averaging, this might be harder.
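
For the downsampling question, here is one possible way to carry the step number along. Both function names are hypothetical, and labelling each averaged window with its last step is an assumption; the window start or midpoint would work equally well as long as the choice is consistent.

```python
def subsample(values, every):
    """Keep every `every`-th value, paired with its original step index."""
    return [(step, values[step]) for step in range(0, len(values), every)]


def window_average(values, window):
    """Average non-overlapping windows; label each with its last step.

    The final window may be shorter than `window` and is averaged over
    however many values it actually contains.
    """
    out = []
    for start in range(0, len(values), window):
        chunk = values[start:start + window]
        last_step = start + len(chunk) - 1
        out.append((last_step, sum(chunk) / len(chunk)))
    return out


sub = subsample([0, 1, 2, 3, 4, 5, 6], every=3)
avg = window_average([1.0, 3.0, 5.0, 7.0, 9.0], window=2)
```

Either result can go straight into the `step` column of a tall row, so the two downsampling modes would share one storage format.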