[Algorithms] Separate component for root api methods.

Question

michaelschaarschmidt opened this issue 6 years ago

michaelschaarschmidt commented 6 years ago

@janislavjankov has suggested the following:

The comment I had on the agent's component is that it would look cleaner to me if it were extracted and defined as a separate class. There is no need to attach the methods within define_graph_api; just have a regular class (extending Component) that can be instantiated there.

So we could, for example, have a class in the DQNAgent module that implements the DQN API as plain Python methods.
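
A minimal sketch of that suggestion, under loud assumptions: the `Component` class here is a simplified stand-in and not RLgraph's actual base class, and `DQNRootComponent`, its `get_action` method, and the `root_component` attribute are hypothetical names chosen for illustration. It only shows the shape of the idea: the API lives in a regular class, and define_graph_api merely instantiates it.

```python
class Component:
    """Simplified stand-in for a component base class (NOT RLgraph's real one)."""

    def __init__(self, scope):
        self.scope = scope


class DQNRootComponent(Component):
    """Hypothetical root component: the DQN API as regular methods."""

    def __init__(self, scope="dqn-root"):
        super().__init__(scope)

    def get_action(self, q_values):
        # Greedy choice: index of the largest Q-value.
        return max(range(len(q_values)), key=lambda i: q_values[i])


class DQNAgent:
    """Hypothetical agent that no longer attaches API methods itself."""

    def __init__(self):
        self.root_component = None

    def define_graph_api(self):
        # Instead of attaching methods here, instantiate the separate class.
        self.root_component = DQNRootComponent()
        return self.root_component
```

With this layout, the agent module keeps the API definition and the agent wiring in two clearly separated classes.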

Sven Mika · Answer 1 · Thu Feb 28 2019 16:00:40 GMT+0800 (China Standard Time)

I think there are two options, of which I favor the first:
a) Pass the agent into the root component's constructor.
Then also move construction of all of the root's sub-components into the root's constructor. This way, the agent itself no longer carries any components (no more self.memory, where self == agent), only agent settings such as discount. This is clean because an Agent should interact with its root component only via its graph_executor.

b) Pass the agent into the root. Then add the sub-components (created in the Agent constructor, as is done now) to the root after constructing it, and, inside the root's APIs, refer to all sub-components as "agent.[some sub-component]". This is less clean because it does not fully separate the component API from the agent API.
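
Option a) could look roughly like the sketch below. All names (`Memory`, `RootComponent`, `insert_records`, `memory_capacity`) are hypothetical and plain Python; no RLgraph classes are used, and the graph_executor is left out, so the agent holds a direct reference to the root purely to keep the example short.

```python
class Memory:
    """Hypothetical replay-memory sub-component with a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []

    def insert(self, record):
        self.records.append(record)
        # Drop the oldest record once capacity is exceeded.
        if len(self.records) > self.capacity:
            self.records.pop(0)


class RootComponent:
    """Hypothetical root: receives the agent and builds its own sub-components."""

    def __init__(self, agent):
        # Read plain settings from the agent ...
        self.discount = agent.discount
        # ... and construct sub-components here rather than in the agent.
        self.memory = Memory(capacity=agent.memory_capacity)

    def insert_records(self, record):
        self.memory.insert(record)
        return len(self.memory.records)


class Agent:
    """Hypothetical agent: carries settings only, no self.memory."""

    def __init__(self, discount=0.99, memory_capacity=2):
        self.discount = discount
        self.memory_capacity = memory_capacity
        self.root_component = RootComponent(self)
```

Under option b), by contrast, `Memory` would be created in `Agent.__init__` and the root's methods would reach it through the agent, which is the coupling the first option avoids.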

Sven Mika · Answer 2 · Mon May 13 2019 14:34:27 GMT+0800 (China Standard Time)

Started working on this one.

We will call the root components AlgorithmComponents.
Agents will be able to contain more than one AlgorithmComponent, but will usually contain only one. All sub-components and relevant settings will be passed directly into the AlgorithmComponent.
This will ensure strict separation of the Agent's pure Python API (e.g. get_action) and the root component's API (rlgraph_api/graph_fn) methods.
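
As a rough illustration of that separation (not the actual 0.6.x code): in the sketch below, `AlgorithmComponent.get_actions` stands in for an rlgraph_api method, the `policy` callable stands in for a sub-component, and the `"default"` key and the `algorithm_components` dict are assumptions made for the example.

```python
class AlgorithmComponent:
    """Hypothetical root component; sub-components and settings are passed in."""

    def __init__(self, policy, discount):
        self.policy = policy      # stand-in for a sub-component
        self.discount = discount  # a relevant setting

    def get_actions(self, state):
        # Stand-in for an rlgraph_api method on the root component.
        return self.policy(state)


class Agent:
    """Hypothetical agent exposing only a pure Python API."""

    def __init__(self, algorithm_components):
        # Usually a single entry, but more than one is allowed.
        self.algorithm_components = dict(algorithm_components)

    def get_action(self, state, component="default"):
        # Pure Python entry point that delegates to the chosen component.
        return self.algorithm_components[component].get_actions(state)


def threshold_policy(state):
    """Toy policy: action 0 for negative states, 1 otherwise."""
    return 0 if state < 0 else 1


agent = Agent({"default": AlgorithmComponent(threshold_policy, discount=0.99)})
```

Here the agent never touches `policy` or `discount` directly; it only forwards calls to the component that owns them.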

Sven Mika · Answer 3 · Thu May 23 2019 03:17:41 GMT+0800 (China Standard Time)

This will happen in 0.6.x, which is in the pipeline and currently undergoing testing.

Leaving this open.