boyu-ai / Hands-on-RL

https://hrl.boyuai.com/

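Someone unemployed for three years disagrees with this claim!: The denominator of UCB's U_t(a) is the number of times each lever has been pulled plus the constant 1, which ensures that every action is **explored at least once**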

StevenJokess opened this issue
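
To make the disputed claim concrete, here is a minimal sketch, assuming the uncertainty term has the Hoeffding-style form U_t(a) = sqrt(log t / (2 (N_t(a) + 1))), where N_t(a) is the number of pulls of lever a. The function names, the reward values, the coefficient `coef`, and the initial estimate of 0 are all illustrative assumptions and are not taken from the repository. The sketch shows what the `+ 1` does on its own (it keeps the term finite for a lever that has never been pulled) and one setting in which a lever is still never pulled within the horizon, which is one possible reading of the objection in the title.

```python
import math


def ucb_bonus(t, count, plus_one=True):
    """Assumed bonus: sqrt(log t / (2 * denom)), denom = count + 1 or count."""
    denom = count + 1 if plus_one else count
    return math.sqrt(math.log(t) / (2 * denom))


def run_ucb(rewards, coef, steps, init_estimate=0.0):
    """Run UCB on deterministic (hypothetical) rewards; return pull counts."""
    k = len(rewards)
    counts = [0] * k
    estimates = [init_estimate] * k
    for t in range(1, steps + 1):
        scores = [estimates[a] + coef * ucb_bonus(t, counts[a]) for a in range(k)]
        a = scores.index(max(scores))  # ties go to the lowest index
        counts[a] += 1
        # Incremental mean update of the reward estimate.
        estimates[a] += (rewards[a] - estimates[a]) / counts[a]
    return counts


# With +1, a lever that was never pulled (count = 0) gets a finite bonus.
finite_bonus = ucb_bonus(t=10, count=0)

# Without +1, the same lever causes a division by zero.
try:
    ucb_bonus(t=10, count=0, plus_one=False)
    raised = False
except ZeroDivisionError:
    raised = True

# Small coefficient, zero initial estimates: the second lever is never pulled
# in 100 steps, even though the denominator contains the +1.
counts_small_coef = run_ucb(rewards=[1.0, 1.0], coef=0.1, steps=100)
print(finite_bonus, raised, counts_small_coef)
```

In this setting the `+ 1` only prevents the division by zero. Whether every lever is actually tried depends on other choices, such as the coefficient, the initial estimates, and the number of steps.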