Precedence in Includes and excludes
manuGil opened this issue · comments
To include and exclude files from a dataset we are using the following operations:
import fairly

# open the local dataset
local = fairly.dataset("./test-dataset/mydataset/")
# to include:
local.includes.append("*.jpg")
# to exclude:
local.excludes.append("*.jpg")
# then we save the changes to manifest.yaml
local.save()
The above results in the following in manifest.yaml:

files:
  includes:
  - ARP1_.info
  - ARP1_d01.zip
  - my_code.py
  - Survey_AI.csv
  - '*.jpg'
  excludes:
  - '*.jpg'
How does fairly manage precedence in this case? Is it based on the order in the file, or do excludes take precedence over includes?
Currently includes has precedence over excludes. These rules are used by the _get_files() method in the dataset/local.py file. For each file under the dataset folder, the method first checks the includes rules. If any of them match, the file is added to the file list. If there is no match, the excludes rules are checked, and if any of them match, the file is excluded.
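The order of checks described above can be sketched roughly as follows. This is not the actual code of _get_files(); it is a minimal illustration that assumes glob-style matching through Python's fnmatch module. The function name is_included and the default parameter are hypothetical, and the behaviour for files that match neither list is not stated in this thread, so it is left configurable here.

```python
from fnmatch import fnmatchcase


def is_included(name, includes, excludes, default=True):
    """Decide whether a file name ends up in the file list.

    Hypothetical helper, not part of fairly: includes are checked
    first, so a matching include rule wins over any exclude rule.
    """
    # 1. An include match adds the file, whatever the excludes say.
    if any(fnmatchcase(name, pattern) for pattern in includes):
        return True
    # 2. Only files that matched no include rule are tested against excludes.
    if any(fnmatchcase(name, pattern) for pattern in excludes):
        return False
    # 3. No rule matched: assumed fallback, not confirmed by the thread.
    return default


includes = ["my_code.py", "*.jpg"]
excludes = ["*.jpg"]
jpg_kept = is_included("photo.jpg", includes, excludes)
```

With the conflicting rules from the manifest above, jpg_kept is True: the '*.jpg' include rule matches before the exclude rule is ever consulted.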
We can provide some feedback to the user if there are any conflicting include and exclude rules, or duplicate rules (e.g. *.jpg repeated twice in the includes). I think we can eliminate duplicate rules automatically, but for conflicts it is better to let the user know and resolve them. We can consider adding a check_rules() method for that purpose.
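One possible shape for such a check, written as a standalone function rather than a method, is sketched below. check_rules() does not exist in fairly at the time of this comment, so the signature and the returned dictionary keys are assumptions. The sketch only detects patterns that are textually identical; it does not try to detect overlapping patterns such as '*.jpg' and 'img_*.jpg'.

```python
from collections import Counter


def check_rules(includes, excludes):
    """Report duplicate and conflicting rules (hypothetical sketch).

    A duplicate is a pattern listed more than once in the same list.
    A conflict is a pattern that appears in both lists.
    """
    def duplicates(rules):
        counts = Counter(rules)
        return sorted(rule for rule, n in counts.items() if n > 1)

    return {
        "duplicate_includes": duplicates(includes),
        "duplicate_excludes": duplicates(excludes),
        "conflicts": sorted(set(includes) & set(excludes)),
    }


report = check_rules(["my_code.py", "*.jpg", "*.jpg"], ["*.jpg"])
```

Here report flags '*.jpg' both as a duplicate include rule and as a conflict. Duplicates could then be removed automatically, while conflicts are shown to the user to resolve.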