google-deepmind / bsuite

bsuite is a collection of carefully-designed experiments that investigate core capabilities of a reinforcement learning (RL) agent

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

In deep_sea the threshold is set to 0.8 instead of 0.9 as the paper indicates.

lcassano opened this issue · comments

The threshold is set to 0.8 by default instead of 0.9 as the paper indicates "The summary ‘score’
computes the percentage of runs for which the average regret drops below 0.9 faster than the 2^N episodes expected by dithering."

def find_solution(df_in: pd.DataFrame,
sweep_vars: Sequence[Text] = None,
merge: bool = True,
thresh: float = 0.8) -> pd.DataFrame: