rlgraph / rlgraph

RLgraph: Modular computation graphs for deep reinforcement learning

*_spec parameters as dictionaries are inconvenient

janislavjankov opened this issue

Working with parameters passed as dictionaries is inconvenient: there is no auto-complete, no explicit documentation, and no explicit defaults, and errors surface only at a later point. It can also lead to bad practices, such as adding undocumented fields instead of extending a class.
I saw that there is the Specifiable class and that a few classes extend it, but the usage seems inconsistent:

From agent.py

            policy_spec (Optional[dict]): An optional dict for further kwargs passing into the Policy c'tor.
            value_function_spec (list): Neural network specification for baseline.

            exploration_spec (Optional[dict]): The spec-dict to create the Exploration Component.
            execution_spec (Optional[dict,Execution]): The spec-dict specifying execution settings.
            optimizer_spec (Optional[dict,Optimizer]): The spec-dict to create the Optimizer for this Agent.

            value_function_optimizer_spec (dict): Optimizer config for value function optimizer. If None, the optimizer
                spec for the policy is used (same learning rate and optimizer type).

            observe_spec (Optional[dict]): Spec-dict to specify `Agent.observe()` settings.
            update_spec (Optional[dict]): Spec-dict to specify `Agent.update()` settings.
            summary_spec (Optional[dict]): Spec-dict to specify summary settings.
            saver_spec (Optional[dict]): Spec-dict to specify saver settings.

For example, optimizer_spec can be provided as an Optimizer, but value_function_optimizer_spec needs to be a dict -- the code below assumes this. Also, update_spec is a dictionary, and some of its fields are specific to the particular algorithm.

Is there a specific reason for the difference in the parameters?

Yes, you are right and this is an ongoing discussion in our team.
I agree with you that it's not satisfactory to have to know the string keys for the different supported options inside these spec dicts. We will look into solutions, e.g. spec classes that do their own type checking and error reporting. If you have other solutions for this problem, please let us know.
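
To make the spec-class idea concrete, here is a minimal sketch of what such a class could look like. `UpdateSpec`, its field names, and its defaults are all hypothetical and chosen for illustration only; they are not taken from RLgraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateSpec:
    """Hypothetical typed stand-in for an `update_spec` dict (not RLgraph's API)."""

    batch_size: int = 32       # assumed field and default
    update_interval: int = 4   # assumed field and default
    do_updates: bool = True    # assumed field and default

    def __post_init__(self):
        # Check types and ranges when the spec is built, so a bad value
        # fails here instead of somewhere deep inside the agent.
        for name in ("batch_size", "update_interval"):
            value = getattr(self, name)
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                raise TypeError(f"{name} must be an int, got {type(value).__name__}")
            if value <= 0:
                raise ValueError(f"{name} must be positive, got {value}")
        if not isinstance(self.do_updates, bool):
            raise TypeError("do_updates must be a bool")
```

With a class like this, a misspelled option such as `UpdateSpec(batchsize=64)` is rejected by the generated constructor immediately, and editors can auto-complete the field names and show their defaults.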
We very much appreciate your input on this.

Thanks for looking into this! I think going with spec classes is a good, clean solution. I'd even consider dropping dict support in the constructors entirely in favor of objects. It wouldn't take much effort to provide an object deserialized from a dict for anyone who insists on using dicts.
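
As a rough sketch of that deserialization step, a spec object could expose a `from_dict` class method, so constructors only ever accept objects. `OptimizerSpec` and its fields are hypothetical names used for illustration, not RLgraph's actual classes.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping


@dataclass
class OptimizerSpec:
    """Hypothetical spec object; field names and defaults are illustrative only."""

    type: str = "adam"            # assumed field and default
    learning_rate: float = 1e-3   # assumed field and default

    @classmethod
    def from_dict(cls, data: Mapping[str, Any]) -> "OptimizerSpec":
        # Reject keys that are not declared fields, so typos and
        # undocumented options fail loudly at load time.
        known = {f.name for f in fields(cls)}
        unknown = set(data) - known
        if unknown:
            raise KeyError(f"Unknown keys for {cls.__name__}: {sorted(unknown)}")
        return cls(**data)


# Existing dict-based configs can still be loaded explicitly:
spec = OptimizerSpec.from_dict({"type": "adam", "learning_rate": 0.0005})
```

Keeping the conversion in one explicit method means the dict format stays available for config files, while everything past that point works with a documented object.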

We will have new classes for that from version 0.6.x onward:

  • UpdateRules object (held by Worker) replacing the main part of the current update_spec.
  • SyncRules object (held by some Agents) replacing parts of the current update_spec.
  • python_buffer_size (int ctor arg for Agent) replacing observe_spec.

execution_spec, summary_spec, saver_spec will remain for now ...

Leaving this issue open until full resolution.