[joss] Inconsistent naming causes cli arguments to be ignored
sneakers-the-rat opened this issue · comments
Trying to write a simple example test using the config and found a bug
Argument in argparse is named tau
diart/src/diart/console/benchmark.py
Line 21 in 858cd1a
passed to from_dict
, which looks for tau_active
instead, replacing whatever i feed to it with the default:
diart/src/diart/blocks/config.py
Line 110 in 858cd1a
Same is true of rho
-> rho_update
and delta
-> delta_new
and max-speakers
-> max_speakers
and cpu
-> device
Some additional things i noticed in these modules that aren't bugs per se but just some tips:
- The initialization logic is written twice, once in
__init__
and once infrom_dict
without clear purpose. this makes for very fragile and bug-prone code. I'm not sure what thefrom_dict
method is supposed to do differently than just calling the class likePipelineConfig(**args)
- my expectation as a user would be that they would be identical. In fact this seems to be the cause of the bug at hand - the**kwargs
in__init__
is superfluous since they are ignored, but if it wasn't there and I calledPipelineconfig(**args)
then I would have gotten an exception from passing an unexpected parameter. - The default values are actually defined three times - another time in the argparser itself. Definitely recommend having a single place these are defined.
- you probably want
@classmethod
here:diart/src/diart/blocks/config.py
Line 95 in 858cd1a
- try using the
abc
module for abstract classes like eg. -diart/src/diart/blocks/config.py
Line 12 in 858cd1a
- this signature should probably be
data: dict
since the method is named "from_dict" -diart/src/diart/blocks/config.py
Line 30 in 858cd1a
dict
s already have a get method, eg.{'a': 'dict'}.get('a', None)
that works the same asutils.get
- the
__init__
methods are very heavy for a config class - instantiating two models. If this is just a config file you should probably just store the string values and instantiate the models elsewhere.
part of openjournals/joss-reviews#5266
tau_active
and tau
are aliases, you can see in the following lines that it checks for tau_active
first, if that fails it looks for tau
defaulting to 0.6 (see here). The same applies for the other arguments you mention.
Notice that the cli argument max-speakers
is converted to max_speakers
automatically by argparse.
I do agree on the comments about the initialization code and the repetitiveness. Do you have any suggestions for a pythonic way of addressing this?
Ah, my bad, something was causing my test to fail and it was late so i must not have read carefully.
Yes! I am unsure what the from_dict
method is supposed to do differently than the __init__
method - i test that they are identical here: https://github.com/sneakers-the-rat/diart/blob/b1a0ccaa35f8b36aa30f978a4bcb16db69652a42/tests/test_config.py#L35
usually when you have a from_x
method it's because the thing you're instantiating from is different in form than how you usually instantiate it - eg. a from_json
method might take a path to a .json
file with instantiation arguments. In this case, instantiating like MyClass(**{'arg1':'val1', 'arg2':'val2'})
is the same as MyClass(arg1='val1', arg2='val2')
So my suggestion would be to remove the from_dict
method and use the same variable names everywhere.
I don't mean to harp on you about a minor thing, I am more focused on showing you how to make the code more testable by removing special cases, minimizing where things can go wrong, etc. I won't be able to review the entire package, so this is just one example of a strategy for avoiding bugs.
Implemented in #189