[BUG] SharedBaseline on_dim default value should be -1


hyeok9855 opened this issue a year ago

Sanghyeok Choi commented a year ago

Describe the bug

I think the default value of on_dim in SharedBaseline should be -1.

If not, it raises 👇

To Reproduce

Simply change the baseline and run

Checklist

I have checked that there is no similar issue in the repo (required)
I have provided a minimal working example to reproduce the bug (required)

Federico Berto · Answer 1 · Tue Sep 12 2023 16:08:07 GMT+0800 (China Standard Time)

How are you changing the baseline?
If you use SharedBaseline, you should also use an inference technique that produces several rollouts per instance (e.g. multistart for POMO or augmentation for SymNCO), so just using the default model with baseline="shared" will not work. With those techniques the reward is always reshaped to [batch, num_pomo/num_aug], which keeps the default on_dim valid and avoids the failure.
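
To illustrate the shape issue, here is a minimal sketch. It uses NumPy rather than the library's tensors, and shared_baseline is a hypothetical stand-in, not the library's code. It assumes the current default is on_dim=1, which the thread does not state explicitly.

```python
import numpy as np


def shared_baseline(reward, on_dim=1):
    """Hypothetical stand-in: mean of the reward along on_dim, kept for broadcasting."""
    return reward.mean(axis=on_dim, keepdims=True)


# Multistart / augmentation case: reward has shape [batch, num_starts].
multistart_reward = np.array([[1.0, 3.0], [2.0, 6.0]])
baseline_2d = shared_baseline(multistart_reward)  # shape (2, 1)

# Plain case: reward has shape [batch], so axis 1 does not exist.
flat_reward = np.array([1.0, 2.0, 3.0])
try:
    shared_baseline(flat_reward)
    failed = False
except (IndexError, ValueError):  # NumPy's AxisError derives from both
    failed = True

# With on_dim=-1 the last axis is used, which exists for both shapes.
baseline_last = shared_baseline(flat_reward, on_dim=-1)  # shape (1,)
```

With on_dim=-1 the 2-D case behaves the same as before, since the last axis of a [batch, num_starts] array is axis 1.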

Sanghyeok Choi · Answer 2 · Tue Sep 12 2023 16:23:17 GMT+0800 (China Standard Time)

Okay, I see. But I actually wanted to use just the mean of the reward as a naive baseline.
So in this case, should I implement it locally?
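
A local version of that naive baseline could look roughly like the sketch below. It is written with NumPy for brevity, and mean_baseline_advantage is a hypothetical helper name, not something the library provides.

```python
import numpy as np


def mean_baseline_advantage(reward):
    """Subtract the batch-mean reward from each reward (hypothetical helper)."""
    baseline = reward.mean(axis=-1, keepdims=True)  # shape (1,) for a [batch] input
    return reward - baseline


reward = np.array([-4.0, -6.0, -8.0])  # one reward per instance, shape [batch]
advantage = mean_baseline_advantage(reward)
```

The resulting advantage sums to zero over the batch: instances above the batch mean get a positive value and those below it get a negative one.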

Federico Berto · Answer 3 · Tue Sep 12 2023 16:31:12 GMT+0800 (China Standard Time)

I see! Then feel free to submit a PR ;)