rlgraph / rlgraph

RLgraph: Modular computation graphs for deep reinforcement learning

[ Components ] Policy Component needs API-method cleanup and return value cleanup

sven1977 opened this issue

The Policy Component needs some cleanup as its API-methods are becoming less and less organized.

  • Some API methods are named "...parameters_log_probs". True log-probs exist only for discrete action spaces, so the "_log_probs" suffix should be dropped from these method names, and log-probs should be returned only for categorical distributions. For all other distributions, the values currently returned as "log_probs" are actually log(mean), log(stddev), etc.
  • The API methods that return actual log-likelihoods for continuous (pdf-type) distributions will be renamed and reorganized. The log-likelihood of a given action will be returned under the key "log_likelihood" rather than "log_probs".
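To illustrate the distinction the two points above draw, here is a minimal standalone sketch. It does not use RLgraph: the function names, the `space_type` argument and the layout of `params` are assumptions made for illustration only. Only the keys "log_probs" and "log_likelihood" come from this issue.

```python
import math


def categorical_log_probs(logits):
    """Log-softmax: one log-probability per discrete action."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def gaussian_log_likelihood(action, mean, stddev):
    """Log of the normal density at `action`. A density can exceed 1,
    so this value can be positive and is not a log-probability."""
    var = stddev ** 2
    return -0.5 * math.log(2.0 * math.pi * var) - (action - mean) ** 2 / (2.0 * var)


def policy_output(space_type, params, action=None):
    """Hypothetical return dict (not RLgraph's actual API)."""
    if space_type == "discrete":
        # Categorical distribution: real log-probs exist, one per action.
        return {"log_probs": categorical_log_probs(params["logits"])}
    # Continuous distribution: only the log-likelihood of the given action.
    return {
        "log_likelihood": gaussian_log_likelihood(
            action, params["mean"], params["stddev"]
        )
    }


discrete_out = policy_output("discrete", {"logits": [1.0, 2.0, 3.0]})
continuous_out = policy_output("continuous", {"mean": 0.0, "stddev": 0.1}, action=0.0)
```

With a narrow Gaussian (stddev 0.1) the log-likelihood at the mean is positive, which shows why labelling it "log_probs" would be misleading.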

This has been mostly completed in 0.4.1 and 0.4.2.

RLGraphObsoletedError messages will indicate which API methods have been renamed and what their new names are.
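The general pattern of an error that points from an old method name to its replacement could look like the sketch below. `ObsoletedError` is a hypothetical stand-in written for this example, not RLgraph's `RLGraphObsoletedError` (whose constructor signature is not shown in this issue), and both method names are invented.

```python
class ObsoletedError(Exception):
    """Hypothetical stand-in for an 'obsoleted' error that names the replacement."""

    def __init__(self, kind, old_name, new_name):
        self.old_name = old_name
        self.new_name = new_name
        super().__init__(
            "{} '{}' is obsolete; use '{}' instead.".format(kind, old_name, new_name)
        )


class Policy:
    # Hypothetical new method name, for illustration only.
    def get_action_and_log_likelihood(self, state):
        return {"action": 0, "log_likelihood": 0.0}

    # Hypothetical old method name, kept only to redirect callers.
    def get_action_log_probs(self, state):
        raise ObsoletedError(
            "API-method", "get_action_log_probs", "get_action_and_log_likelihood"
        )


try:
    Policy().get_action_log_probs(state=None)
    message = ""
except ObsoletedError as err:
    message = str(err)
```

Keeping the old name as a stub that raises, rather than deleting it, means callers get a direct pointer to the new method instead of a bare AttributeError.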

Closing this issue.