Switch to tall format data instead of wide
andnp opened this issue
This will take some work and will require extensive testing.
I need to save a mapping from hyper_id -> hyper_settings to disk. Every row of the results dataframe will then store the hyper_id instead of the full hyper_settings.
This will allow saving multiple metrics per row and should radically reduce storage costs. It should also be easier on the cluster.
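A rough sketch of the idea, not a final design. The hashing scheme, the `hyper_id` and `to_tall_rows` helpers, and the `step` column are all assumptions made for illustration; the real id could just as well be an integer index.

```python
import hashlib
import json


def hyper_id(settings: dict) -> str:
    """Derive a stable id from a settings dict (hypothetical scheme).

    Keys are sorted before hashing so that key order does not change the id.
    """
    canonical = json.dumps(settings, sort_keys=True)
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()[:12]


def to_tall_rows(settings: dict, metrics: dict, registry: dict) -> list:
    """Record the settings once in `registry` and return one row per step.

    Each row carries only the hyper_id plus one column per metric.
    Assumes every metric has the same number of steps.
    """
    hid = hyper_id(settings)
    registry[hid] = settings
    n_steps = len(next(iter(metrics.values())))
    rows = []
    for step in range(n_steps):
        row = {"hyper_id": hid, "step": step}
        for name, values in metrics.items():
            row[name] = values[step]
        rows.append(row)
    return rows


registry = {}
rows = to_tall_rows(
    {"alpha": 0.1, "epsilon": 0.05},
    {"return": [1.0, 2.0], "loss": [0.5, 0.25]},
    registry,
)
# The mapping is small, so it could be written once as JSON next to the results.
registry_json = json.dumps(registry, sort_keys=True)
```

The rows can be handed to a dataframe constructor as-is, and the settings are recovered by looking the hyper_id up in the saved mapping.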
Need to think about:
- What happens when the number of hypers changes?
- How do we save throughout the experiment? All at once at the end, like the current approach?
- When we downsample, we need to keep track of the current step number. With subsampling, this is easy. With window averaging, this might be harder.
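For the downsampling question, one possible way to keep the step number is sketched below. The function names and the choice to store both the first and last step of each window are assumptions, not a decision.

```python
def subsample(values, every):
    """Keep every `every`-th value, paired with its original step number."""
    return [(step, values[step]) for step in range(0, len(values), every)]


def window_average(values, window):
    """Average non-overlapping windows.

    Each result is (first_step, last_step, mean), so the step range a mean
    covers stays recoverable. The final window may be shorter than `window`.
    """
    out = []
    for start in range(0, len(values), window):
        chunk = values[start:start + window]
        out.append((start, start + len(chunk) - 1, sum(chunk) / len(chunk)))
    return out


values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sub = subsample(values, 3)
avg = window_average(values, 3)
```

With subsampling the step is simply the index that was kept. With window averaging there is no single step, which is why this sketch stores the range; storing only the first or last step would also work as long as the convention is fixed.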