lucidrains / PaLM-rlhf-pytorch

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

Model Name

conceptofmind opened this issue a year ago

Enrico Shippole commented a year ago

@lucidrains What would you like the first model to be named?

Enrico Shippole commented a year ago

One restart. 160B tokens.

Enrico Shippole commented a year ago

Did some tests with qk_norm vs no qk_norm as well. When using AdamW, I decided to go with qk_norm=False. I will explore this with Lion afterwards.
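For context, here is a minimal sketch of what a query-key normalization switch typically changes. It assumes the common formulation where queries and keys are L2-normalized before the dot product and a fixed scale replaces the usual 1/sqrt(d) factor. The function names and the scale of 10.0 are hypothetical, and the code is plain Python for illustration only; it is not the repository's actual implementation.

```python
import math

def l2_normalize(vec, eps=1e-12):
    # Scale a vector to unit length; eps guards against division by zero.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]

def attention_logits(queries, keys, qk_norm=False, scale=None):
    # queries, keys: lists of equal-length float vectors.
    dim = len(queries[0])
    if qk_norm:
        # Normalized q and k turn each dot product into a cosine similarity,
        # so every logit is bounded by the scale in absolute value.
        queries = [l2_normalize(q) for q in queries]
        keys = [l2_normalize(k) for k in keys]
        if scale is None:
            scale = 10.0  # hypothetical fixed value; often a learned parameter
    elif scale is None:
        scale = dim ** -0.5  # standard scaled dot-product attention
    return [[scale * sum(a * b for a, b in zip(q, k)) for k in keys]
            for q in queries]

q = [[3.0, 4.0]]
k = [[3.0, 4.0], [-3.0, -4.0]]
plain = attention_logits(q, k, qk_norm=False)   # logits grow with vector magnitude
normed = attention_logits(q, k, qk_norm=True)   # logits stay within [-scale, scale]
```

With the normalized variant the logits no longer depend on the magnitude of the queries and keys, which is the usual motivation for trying it as a stability measure.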

Enrico Shippole commented a year ago

PaLM 1B