Khrylx / PyTorch-RL

PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.

What are conjugate gradients and line_search in TRPO?

Dreamlikec opened this issue 3 years ago

Alan Feng commented 3 years ago

Could you please give me a sense of, or a reference for, what these two functions are for?

Jiachen Wang commented 3 years ago

Check here: https://spinningup.openai.com/en/latest/algorithms/trpo.html
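In short, as that page describes: TRPO wants the step direction x = H^-1 g, where g is the policy gradient and H is the Hessian of the KL divergence. Building and inverting H is too expensive, so conjugate gradient solves Hx = g approximately using only matrix-vector products Hv. A backtracking line search then shrinks the proposed step until the surrogate objective actually improves and the KL constraint holds. Below is a rough, dependency-free sketch of both ideas on a toy problem. It is not this repository's code: the names `conjugate_gradient`, `backtracking_line_search`, `matvec`, and `constraint` are made up for illustration, and plain Python lists stand in for tensors.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(matvec, b, iters=10, tol=1e-10):
    """Approximately solve A x = b using only products matvec(v) = A v.

    In TRPO, A would be the KL Hessian and b the policy gradient.
    """
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x (x starts at zero)
    p = list(b)  # current search direction
    rr = dot(r, r)
    for _ in range(iters):
        if rr < tol:
            break
        Ap = matvec(p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        new_rr = dot(r, r)
        p = [ri + (new_rr / rr) * pi for ri, pi in zip(r, p)]
        rr = new_rr
    return x


def backtracking_line_search(loss, constraint, x, full_step,
                             max_constraint, max_backtracks=10, shrink=0.5):
    """Try full_step, then shrink it until the loss improves and the
    constraint (the KL divergence in TRPO) stays within max_constraint."""
    old_loss = loss(x)
    for k in range(max_backtracks):
        frac = shrink ** k
        x_new = [xi + frac * si for xi, si in zip(x, full_step)]
        if loss(x_new) < old_loss and constraint(x, x_new) <= max_constraint:
            return True, x_new
    return False, x  # no acceptable step found: keep the old parameters


# Toy check of conjugate gradient: A = [[4, 1], [1, 3]], b = [1, 2].
A = [[4.0, 1.0], [1.0, 3.0]]
solution = conjugate_gradient(lambda v: [dot(row, v) for row in A], [1.0, 2.0])

# Toy check of the line search: a quadratic loss, a step that overshoots,
# and squared distance standing in for the KL constraint.
ok, new_x = backtracking_line_search(
    loss=lambda v: dot(v, v),
    constraint=lambda old, new: sum((a - b) ** 2 for a, b in zip(old, new)),
    x=[1.0, 1.0],
    full_step=[-4.0, -4.0],
    max_constraint=2.5,
)
```

In the toy line search the full step and the half step are both rejected because they do not lower the loss, and the quarter step is accepted. In actual TRPO the `matvec` would be a Fisher-vector (Hessian-vector) product computed with autograd, and the full step would be the conjugate gradient result scaled to sit on the KL trust-region boundary.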