castorini / howl

Wake word detection modeling toolkit for Firefox Voice, supporting open datasets like Speech Commands and Common Voice.


Errors when reproducing Hey Firefox results

edwinzhng opened this issue · comments

Just wanted to note down a few issues I ran into while trying to reproduce the Howl results:

  1. It seems like the -i/--dataset-paths flag for setting the dataset path doesn't pick up the values (I tried on both Mac and Linux). Instead, it works fine if I just set DATASET_PATH as an environment variable.

So I ran

DATASET_PATH=/path/to/hey/firefox LR_DECAY=0.98 VOCAB='[" hey","fire","fox"]' USE_NOISE_DATASET=True BATCH_SIZE=16 INFERENCE_THRESHOLD=0 NUM_EPOCHS=300 NUM_MELS=40 INFERENCE_SEQUENCE=[0,1,2] MAX_WINDOW_SIZE_SECONDS=0.5 python -m howl.run.train --model res8 --workspace workspaces/hey-ff-res8

Instead of

LR_DECAY=0.98 VOCAB='[" hey","fire","fox"]' USE_NOISE_DATASET=True BATCH_SIZE=16 INFERENCE_THRESHOLD=0 NUM_EPOCHS=300 NUM_MELS=40 INFERENCE_SEQUENCE=[0,1,2] MAX_WINDOW_SIZE_SECONDS=0.5 python -m howl.run.train --model res8 --workspace workspaces/hey-ff-res8 -i /path/to/hey/firefox

Stack trace:

Traceback (most recent call last):
  File "/home/edwinzhang64/anaconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/edwinzhang64/anaconda3/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/edwinzhang64/howl/howl/run/train.py", line 217, in <module>
    main()
  File "/home/edwinzhang64/howl/howl/run/train.py", line 93, in main
    opt('--dataset-paths', '-i', type=str, nargs='+', default=[SETTINGS.dataset.dataset_path]),
  File "/home/edwinzhang64/howl/howl/settings.py", line 72, in dataset
    self._dataset = DatasetSettings()
  File "pydantic/env_settings.py", line 28, in pydantic.env_settings.BaseSettings.__init__
  File "pydantic/main.py", line 338, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for DatasetSettings
dataset_path
  field required (type=value_error.missing)
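The validation error above is consistent with the settings object pulling required fields from the environment at construction time, before argparse ever sees the -i flag. A minimal stdlib sketch of that behavior (this only mimics pydantic's BaseSettings; it is not howl's actual code):

```python
import os

class DatasetSettings:
    """Sketch of an env-backed settings object (mimics pydantic BaseSettings)."""

    def __init__(self):
        try:
            # The real class reads DATASET_PATH via pydantic's env support;
            # if the variable is missing, construction fails immediately,
            # which matches the ValidationError in the traceback above.
            self.dataset_path = os.environ["DATASET_PATH"]
        except KeyError:
            raise ValueError("field required: dataset_path")

os.environ["DATASET_PATH"] = "/path/to/hey/firefox"
print(DatasetSettings().dataset_path)
```

Because the default for --dataset-paths is built from this settings object, the environment variable must be set regardless of whether the flag is passed.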
  2. There is an error on line 35 of workspace.py when I try to train the model by following the Hey Firefox replication steps: it can't serialize a PosixPath to JSON when calling json.dump(gather_dict(args), f, indent=2). Training works if I just comment out that line in write_args.

Stack trace:
train: {'zmuv_mean': tensor([-1.7890], device='cuda:0'), 'zmuv_std': tensor([3.9339], device='cuda:0')}
Traceback (most recent call last):
  File "/home/edwinzhang64/anaconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/edwinzhang64/anaconda3/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/edwinzhang64/howl/howl/run/train.py", line 217, in <module>
    main()
  File "/home/edwinzhang64/howl/howl/run/train.py", line 178, in main
    ws.write_args(args)
  File "/home/edwinzhang64/howl/howl/model/workspace.py", line 35, in write_args
    json.dump(gather_dict(args), f, indent=2)
  File "/home/edwinzhang64/anaconda3/lib/python3.7/json/__init__.py", line 179, in dump
    for chunk in iterable:
  File "/home/edwinzhang64/anaconda3/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/home/edwinzhang64/anaconda3/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/home/edwinzhang64/anaconda3/lib/python3.7/json/encoder.py", line 325, in _iterencode_list
    yield from chunks
  File "/home/edwinzhang64/anaconda3/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/home/edwinzhang64/anaconda3/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type PosixPath is not JSON serializable
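For reference, one plausible workaround for this TypeError is to pass default=str to json.dump so Path objects fall back to their string form. A small sketch (hypothetical fix, not the project's actual patch):

```python
import json
from pathlib import PurePosixPath

# Mimic the kind of dict write_args serializes: dataset paths are Path objects
args = {"model": "res8", "dataset_paths": [PurePosixPath("/path/to/hey/firefox")]}

# json.dumps(args) alone would raise "Object of type PurePosixPath is not
# JSON serializable"; default=str converts any unsupported object to a string
serialized = json.dumps(args, indent=2, default=str)
print(serialized)
```

This keeps the rest of the dict untouched and only stringifies values the encoder can't handle natively.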

Thank you for raising issue 1. Your approach of setting the env var should work; we will make the fix sometime.

I recently tried following the README steps but didn't encounter the second issue, though.

Can you share which command you used for the second issue?

Never mind, I reproduced it. I will take a look.

Both issues have been fixed.