Question: how to evaluate the RNaD algorithm
white0721 opened this issue
Hi.
Is there any way to evaluate a model trained with the rnad algorithm against a random agent? (like tic_tac_toe_dqn_vs_tabular.py for example)
In tic_tac_toe_dqn_vs_tabular.py, the action is taken from the return value of the step function (type step_output) and env is transitioned to the next state.
However, the return value of the step function in rnad is of type dict, so I'm not sure whether I can do the same thing there.
Is there any better way to do this?
Ok so you can do it, but you need a few steps because the RNaD implementation does not expose the policy as an RNaD agent.
You can get the policy for any state using `action_probabilities` (you can get the current state from the environment via the `get_state` accessor in rl_environment). Then you can just sample an action from the policy and take a step on the environment using the action.
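The steps above can be sketched roughly as follows. This is only an illustration, not code from the RNaD implementation: `sample_action` and `play_episode` are hypothetical helper names, and the sketch assumes a two-player, turn-based game, that `solver.action_probabilities(state)` returns a mapping from legal actions to probabilities, and that `env` behaves like an `rl_environment.Environment` (where `get_state` is accessed without parentheses).

```python
import random


def sample_action(action_probs, rng=random):
    """Sample one action from a {action: probability} mapping."""
    actions = list(action_probs.keys())
    # Cast to float so NumPy/JAX scalars are accepted as weights too.
    weights = [float(action_probs[a]) for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


def play_episode(env, solver, rnad_player, rng=random):
    """Play one episode; return the summed reward of the RNaD-controlled player.

    Assumptions (not confirmed by the thread): turn-based game, `env` follows
    the rl_environment interface, and `solver` exposes action_probabilities().
    """
    total = 0.0
    time_step = env.reset()
    while not time_step.last():
        player = time_step.observations["current_player"]
        if player == rnad_player:
            # Query the policy for the underlying game state, then sample.
            probs = solver.action_probabilities(env.get_state)
            action = sample_action(probs, rng)
        else:
            # Random opponent: uniform over its legal actions.
            legal = time_step.observations["legal_actions"][player]
            action = rng.choice(legal)
        time_step = env.step([action])
        total += time_step.rewards[rnad_player]
    return total
```

With OpenSpiel installed, `env` would presumably be something like `rl_environment.Environment("tic_tac_toe")` and `solver` a trained `RNaDSolver`; averaging `play_episode` over many episodes, with `rnad_player` set to 0 and then to 1, gives an estimate of performance against the random agent from both seats.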
Hope this helps!
Thank you for your reply.
By following your comments, I was able to do what I wanted to do!
It helped, thank you very much.
No problemo, glad it helped!