openai / weak-to-strong

Machine learning newbie hopes to get help

snowkcon opened this issue · comments

Let's only look at the first experiment here, not the training pipeline experiment or the loss optimization experiment.

The complete dataset is split into dataset A and dataset B.

Weak model trained on dataset A --> accuracy, e.g. 70%
Strong model trained on dataset B --> accuracy, e.g. 90%
Weak model labels dataset B --> strong model trained on those weak labels --> accuracy, e.g. 80%
PGR = (80 - 70) / (90 - 70) = 0.5
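
In code, my understanding of the PGR computation is roughly the following (a minimal sketch using the example numbers above, not real results):

```python
def pgr(weak_acc, strong_ceiling_acc, weak_to_strong_acc):
    # Performance Gap Recovered: how much of the gap between the weak model
    # and the strong ceiling is recovered by the weak-to-strong model.
    return (weak_to_strong_acc - weak_acc) / (strong_ceiling_acc - weak_acc)

# Example accuracies (in %) from above: weak 70, strong ceiling 90, weak-to-strong 80.
print(pgr(70, 90, 80))  # 0.5
```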

The hypothesis behind this setup is that, in the future, we can use a weak model to guide a strong model and still get an improvement.

My question is:

Strong model trained on dataset A --> accuracy, e.g. 90%
Strong model evaluated on dataset B --> accuracy, e.g. 80%? Would it also reach 80% here?

In the end, isn't this really just a question of how well training on dataset A generalizes to dataset B? If so, do we actually not need the weak model at all?

Where is my misunderstanding?

Thanks for bringing these up, and I hope your ventures into machine learning are going well!

You've asked a lot of questions here, and I'll do my best to answer them with what I know from the paper. I think there has been a misunderstanding about how the weak model trains the strong model as described here, and about how the dataset is divided. Admittedly, I wasn't sure about it from the paper either; it was only after looking at the code here https://github.com/openai/weak-to-strong/blob/main/train_weak_to_strong.py#L272 that I was sure about it.

In this method / paper we have several datasets:

  • Training Dataset Split 1 - This is half of the training dataset. It's only used to train the weak model.

  • Training Dataset Split 2 - This is the other half. It's only used to train the strong model.

  • Test Dataset - This is not used for training any models, only for testing them all to get a unified comparison of performance.

  • Weak-Label Dataset - This is obtained by training (fine-tuning, in this case) the weak model on Training Dataset Split 1 and saving its answers. These labels are then combined with the corresponding inputs from Training Dataset Split 1 to create the Weak-Label Dataset. It's only used to train the weak-to-strong model.

All of the "performance" metrics used in PGR come from the test dataset, which has "ground-truth" labels. In the real superalignment problem we may not have ground-truth labels; they're used here with the caveat that this is an experimental setting, for scientific and evaluation purposes.
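
To make the flow concrete, here is a toy, self-contained sketch of that pipeline using scikit-learn stand-ins instead of the repo's language-model fine-tuning. The model choices, split sizes, and variable names are purely illustrative, and you can change which split the weak labels are generated on (that choice is exactly the point raised later in this thread):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Full dataset, with a held-out test set that keeps its ground-truth labels.
X, y = make_classification(n_samples=6000, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Training Dataset Split 1 and Split 2 (the two halves of the training data).
X1, X2, y1, y2 = train_test_split(X_train, y_train, test_size=0.5, random_state=0)

# Weak model: trained on Split 1 with ground-truth labels.
weak = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X1, y1)

# Strong "ceiling" model: trained on Split 2 with ground-truth labels.
strong_ceiling = RandomForestClassifier(n_estimators=200, random_state=0).fit(X2, y2)

# Weak-Label Dataset: the weak model's predictions used as labels.
# Which split supplies the inputs is the point questioned at the end of this
# thread; swap X2 for X1 below to match the written description above.
X_weak = X2
weak_labels = weak.predict(X_weak)

# Weak-to-strong model: a strong model trained only on the weak labels.
weak_to_strong = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_weak, weak_labels)

# All three performance numbers come from the same ground-truth test set.
acc_weak = weak.score(X_test, y_test)
acc_ceiling = strong_ceiling.score(X_test, y_test)
acc_w2s = weak_to_strong.score(X_test, y_test)

pgr = (acc_w2s - acc_weak) / (acc_ceiling - acc_weak)
print(f"weak={acc_weak:.3f}  ceiling={acc_ceiling:.3f}  weak-to-strong={acc_w2s:.3f}  PGR={pgr:.3f}")
```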

I'm not sure exactly what you're asking with some of your remaining questions, such as "will it get to 80% here?", but I'd be glad to elaborate on any of these (there are tons more details) or help out if you still have questions. Thank you for asking them; they led me to a better understanding by getting me to look into it.

Anyways, hope that helps!

Thank you so much. I made a mistake and didn't read the code.

thanks @BlueHephaestus, closing this issue for now

@BlueHephaestus But what you're saying doesn't seem consistent with the paper. According to the paper, the weak labels come from the weak model's predictions on "Training Dataset Split 2". I'm not sure if something is wrong here.