lucidrains / self-rewarding-lm-pytorch

Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI

Is this work in progress?

jbdatascience opened this issue

Very nice initiative to do a code implementation of "Self-Rewarding LLMs" (https://arxiv.org/pdf/2401.10020.pdf)!

Is this work in progress? I would very much like to do some experiments with this code, if it is ready of course!
I could not find an official implementation yet ...

hey yes it is. for all my repositories, if you see wip, it is not complete

OK, I can imagine this is not easy to code! Can you explain how far along you are and what the obstacles are? I am very curious!

@jbdatascience

the paper is 2 days old. give me until end of month

Do you also think this is a very important milestone for open source LLMs?

unknown, but due to the simplicity, worth exploring

I also asked a question to one of the authors of the paper:

question about the very intriguing findings in “Self-Rewarding LLMs”
13:46 (3 hours ago)
to jaseweston

Good afternoon,

I have a question about the very intriguing findings in “Self-Rewarding LLMs”. Of course the experimental findings support the hypothesis that LLMs can indeed self-improve, that is undeniable!

But I cannot wrap my head around how that is even possible from an abstract point of view. An LLM contains a lot of knowledge / information. Are the findings just another way of saying that you have to employ particular kinds of techniques to elicit responses from an LLM that can reach that knowledge?

My question is:
What is the theoretical basis for this bootstrapping process? Where is the information to improve coming from? Why should we expect it to improve in the direction we want?

I hope you can shed some light 💡 on these questions!

Also, I would like to ask whether there is any open-source code available that I could experiment with.

Best wishes,

Jan Bours
Data Scientist / Certified Data Science Professional (CDSP)