data61 / landshark

Large-scale spatial inference with Tensorflow.

Add option to choose covariates when extracting to TF record

dtpc opened this issue

Currently we import a set of TIFs into HDF5 format, then extract ALL of those covariates (with an optional halfwidth parameter) to create tf records containing the train/test and query X data.

In order to exclude a particular TIF, or to try different combinations of TIFs, we need to import them to separate HDF5 files, resulting in a lot of data duplication.

The proposal is to add an option to the extract commands that lists, by name, the covariates to extract to a new tf record. The default would be to extract all covariates. It would likely also require a command to query the names of all covariates within an existing HDF5 file.

It could potentially work as --include/--exclude flags to allow for filtering out covariates.
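As a rough sketch of how the include/exclude filtering could behave (this is not landshark code: the function name `select_covariates`, its arguments, and the example covariate names are all hypothetical), the selection logic might look something like this, assuming the covariate names have already been read from the HDF5 file:

```python
from typing import Iterable, List, Optional


def select_covariates(
    available: Iterable[str],
    include: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> List[str]:
    """Hypothetical helper: pick which covariate names to extract.

    With no include list (None or empty), every available covariate is
    kept, matching the proposed default. Names in exclude are then dropped.
    The order of `available` is preserved in the result.
    """
    available = list(available)
    known = set(available)

    # Fail early on names that are not in the file, rather than
    # silently extracting fewer covariates than the user asked for.
    requested = set(include or ()) | set(exclude or ())
    unknown = sorted(requested - known)
    if unknown:
        raise ValueError(f"Unknown covariates: {unknown}")

    keep = set(include) if include else set(known)
    keep -= set(exclude or ())
    return [name for name in available if name in keep]


# Example with made-up covariate names.
names = ["dem", "slope", "rainfall"]
print(select_covariates(names, exclude=["slope"]))
```

Raising an error on unknown names is a design choice in this sketch; a typo in a covariate name would otherwise go unnoticed until training.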

OK, so you can ignore any covariates within the model config. So it's just a trade-off between extracting a new tfrecord and the overhead of reading in unused covariates during training. While I think this would still be useful, I guess it's not that important, as there is already a way to achieve it.