Errors when reproducing Hey Firefox results
edwinzhng opened this issue
Edwin Zhang commented
Just wanted to note down a few issues I ran into while trying to reproduce the Howl results:
- It seems like the -i/--dataset-paths flag for setting the dataset path doesn't pick up the values (I tried on both Mac and Linux). Instead, it works fine if I just set DATASET_PATH as an environment variable.
So I ran
DATASET_PATH=/path/to/hey/firefox LR_DECAY=0.98 VOCAB='[" hey","fire","fox"]' USE_NOISE_DATASET=True BATCH_SIZE=16 INFERENCE_THRESHOLD=0 NUM_EPOCHS=300 NUM_MELS=40 INFERENCE_SEQUENCE=[0,1,2] MAX_WINDOW_SIZE_SECONDS=0.5 python -m howl.run.train --model res8 --workspace workspaces/hey-ff-res8
instead of
LR_DECAY=0.98 VOCAB='[" hey","fire","fox"]' USE_NOISE_DATASET=True BATCH_SIZE=16 INFERENCE_THRESHOLD=0 NUM_EPOCHS=300 NUM_MELS=40 INFERENCE_SEQUENCE=[0,1,2] MAX_WINDOW_SIZE_SECONDS=0.5 python -m howl.run.train --model res8 --workspace workspaces/hey-ff-res8 -i /path/to/hey/firefox
Stack trace:
Traceback (most recent call last):
  File "/home/edwinzhang64/anaconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/edwinzhang64/anaconda3/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/edwinzhang64/howl/howl/run/train.py", line 217, in <module>
    main()
  File "/home/edwinzhang64/howl/howl/run/train.py", line 93, in main
    opt('--dataset-paths', '-i', type=str, nargs='+', default=[SETTINGS.dataset.dataset_path]),
  File "/home/edwinzhang64/howl/howl/settings.py", line 72, in dataset
    self._dataset = DatasetSettings()
  File "pydantic/env_settings.py", line 28, in pydantic.env_settings.BaseSettings.__init__
  File "pydantic/main.py", line 338, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for DatasetSettings
dataset_path
  field required (type=value_error.missing)
- It seems like there is an error that occurs on line 35 in workspace.py when I try to train the model by following the Hey Firefox replication steps. It isn't able to serialize some PosixPath to JSON when calling json.dump(gather_dict(args), f, indent=2). Training works if I just comment out the line in write_args.
train: {'zmuv_mean': tensor([-1.7890], device='cuda:0'), 'zmuv_std': tensor([3.9339], device='cuda:0')}
Traceback (most recent call last):
  File "/home/edwinzhang64/anaconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/edwinzhang64/anaconda3/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/edwinzhang64/howl/howl/run/train.py", line 217, in <module>
    main()
  File "/home/edwinzhang64/howl/howl/run/train.py", line 178, in main
    ws.write_args(args)
  File "/home/edwinzhang64/howl/howl/model/workspace.py", line 35, in write_args
    json.dump(gather_dict(args), f, indent=2)
  File "/home/edwinzhang64/anaconda3/lib/python3.7/json/__init__.py", line 179, in dump
    for chunk in iterable:
  File "/home/edwinzhang64/anaconda3/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/home/edwinzhang64/anaconda3/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/home/edwinzhang64/anaconda3/lib/python3.7/json/encoder.py", line 325, in _iterencode_list
    yield from chunks
  File "/home/edwinzhang64/anaconda3/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/home/edwinzhang64/anaconda3/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type PosixPath is not JSON serializable
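A less drastic workaround than commenting out the line might be to give json a fallback encoder for path objects. The sketch below is not the fix that went into Howl. It only shows the standard-library `default=` hook of `json.dumps`/`json.dump`. The names `path_to_str` and `args_dict` are hypothetical, and `args_dict` only stands in for whatever `gather_dict(args)` returns.

```python
import json
from pathlib import Path, PurePath


def path_to_str(obj):
    # json calls this hook only for objects it cannot encode by itself.
    if isinstance(obj, PurePath):
        return str(obj)
    raise TypeError(f"Object of type {obj.__class__.__name__} is not JSON serializable")


# Hypothetical stand-in for the dictionary produced by gather_dict(args).
args_dict = {"model": "res8", "dataset_paths": [Path("/path/to/hey/firefox")]}

# Without default=..., this call raises the TypeError shown in the trace above.
encoded = json.dumps(args_dict, indent=2, default=path_to_str)
print(encoded)
```

The same `default=path_to_str` argument can be passed to `json.dump` when writing to a file. Keeping the `TypeError` for every other type means unexpected objects still fail loudly instead of being silently stringified.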
Brandon Lee commented
Thank you for raising this. For issue 1, your approach of setting the environment variable should work. We will fix the flag at some point.
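For context on issue 1, the first traceback fails inside the `opt(...)` call itself, at `default=[SETTINGS.dataset.dataset_path]`. That suggests the default is computed while the parser is being built, before `-i` is ever read. The sketch below reproduces that pattern with plain argparse and shows one possible way around it. It is an illustration under assumptions, not Howl's code: `dataset_path_from_env`, `MissingSetting`, `parse_eager` and `parse_lazy` are hypothetical names, and the environment lookup only imitates the pydantic settings object.

```python
import argparse
import os


class MissingSetting(Exception):
    """Raised when a required environment variable is absent."""


def dataset_path_from_env():
    # Hypothetical stand-in for SETTINGS.dataset.dataset_path:
    # it fails when DATASET_PATH is not set.
    try:
        return os.environ["DATASET_PATH"]
    except KeyError:
        raise MissingSetting("dataset_path: field required") from None


def parse_eager(argv):
    parser = argparse.ArgumentParser()
    # The default expression runs here, while the parser is being built,
    # so passing -i on the command line cannot prevent the error.
    parser.add_argument("--dataset-paths", "-i", type=str, nargs="+",
                        default=[dataset_path_from_env()])
    return parser.parse_args(argv)


def parse_lazy(argv):
    parser = argparse.ArgumentParser()
    # Default to None and consult the environment only if -i was omitted.
    parser.add_argument("--dataset-paths", "-i", type=str, nargs="+",
                        default=None)
    args = parser.parse_args(argv)
    if args.dataset_paths is None:
        args.dataset_paths = [dataset_path_from_env()]
    return args
```

With `DATASET_PATH` unset, `parse_eager(["-i", "/path/to/hey/firefox"])` raises before argparse looks at the arguments, while `parse_lazy` with the same arguments returns the path given on the command line.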
I recently followed the README steps but didn't encounter the second issue.
Can you share which command you used for the second issue?
Brandon Lee commented
Never mind, I reproduced it. I will take a look.
Brandon Lee commented
The issues have been fixed.