openai / weak-to-strong


Exploring Weak to Strong Generalization from a pre-training standpoint

rokosbasilisk opened this issue

In the paper, a "stronger" model is defined as a model with the same architecture but a greater number of parameters. I am curious whether any research has been conducted on weak-to-strong generalization where the weak supervisor is less pretrained and the strong student is more pretrained.

I am currently exploring the use of Pythia model checkpoints to assess performance on BoolQ (https://github.com/rokosbasilisk/weak-to-strong), where the weak model is a checkpoint of the same model taken a few training steps before the strong student model's checkpoint.
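For context, Pythia publishes its intermediate pretraining checkpoints as Hugging Face Hub revisions, so the weak and strong models can simply be two snapshots of the same run. A minimal sketch (the specific step numbers are just examples):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "EleutherAI/pythia-1b"

# Pythia exposes intermediate checkpoints as Hub revisions
# ("step1000" ... "step143000" for the final checkpoint).
weak_model = AutoModelForCausalLM.from_pretrained(MODEL, revision="step100000")  # less pretrained
strong_model = AutoModelForCausalLM.from_pretrained(MODEL, revision="step143000")  # fully pretrained

tokenizer = AutoTokenizer.from_pretrained(MODEL)
```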

Has any prior work been undertaken in this direction? If not, could you provide insights into why this area remains unexplored?

I am not that familiar with the literature, but there is this paper, which uses training time as the notion of model strength: https://aclanthology.org/2023.acl-long.796/. Overall it seems like a reasonable direction, and I suspect there are many under-explored things in this space!

[Screenshot from 2023-12-20: results table, including the "acc_diff" column referenced below]
I am running train_weak_to_strong over a range of parameter sizes at different checkpoint steps. Surprisingly, when the weak model and the strong model are exactly the same (in terms of both parameters and checkpoint step), there is a gain in accuracy for the stronger model in most cases, as seen in the "acc_diff" column. I am currently checking whether this holds for much larger models (up to ~12B parameters, over multiple checkpoint steps).
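For concreteness, the sweep looks roughly like this (a sketch only: the `--weak_revision`/`--strong_revision` flags and the exact step list are illustrative, not necessarily the script's actual interface):

```python
import itertools
import subprocess

SIZES = ["EleutherAI/pythia-1b", "EleutherAI/pythia-1.4b", "EleutherAI/pythia-2.8b"]
STEPS = ["step13000", "step39000", "step78000", "step117000", "step143000"]

configs = list(itertools.product(SIZES, STEPS))

# Every (weak, strong) pair, including the weak == strong diagonal that
# produced the surprising positive acc_diff values above.
for (w_size, w_step), (s_size, s_step) in itertools.product(configs, repeat=2):
    subprocess.run(
        [
            "python", "train_weak_to_strong.py",
            "--weak_model_size", w_size,
            "--strong_model_size", s_size,
            "--weak_revision", w_step,      # hypothetical flag for the checkpoint step
            "--strong_revision", s_step,    # hypothetical flag for the checkpoint step
        ],
        check=True,
    )
```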

Any idea why this might happen?

I would guess it's just randomness; it could be that the second training split is better for idiosyncratic reasons.
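One cheap way to check that would be to rerun the weak == strong configuration several times with different seeds/splits and test the resulting acc_diff values against zero. A minimal sketch (the helper and the choice of a one-sample t-test are mine, not something from the repo):

```python
import numpy as np
from scipy import stats

def consistent_with_noise(acc_diffs, alpha=0.05):
    """One-sample t-test: is the mean acc_diff across repeated runs of the
    identical-model configuration (different seeds / splits) plausibly zero?"""
    t_stat, p_value = stats.ttest_1samp(np.asarray(acc_diffs), popmean=0.0)
    return p_value >= alpha  # True -> the gain looks like run-to-run noise
```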

Created a dataset of weak, strong, and transfer accuracies for Pythia 1B, 1.4B, and 2.8B models at 5 different stages of their pretraining: https://github.com/rokosbasilisk/weak-to-strong/blob/EDA/eda/results_df.csv.
Currently doing some EDA to check the effect of pretraining vs. parameter count on weak-to-strong generalization; any suggestions are welcome.
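One summary worth including in the EDA is the paper's PGR (performance gap recovered) metric. A sketch, assuming the CSV has per-run weak, strong-ceiling, and transfer accuracy columns (column names below are guesses from the description, not verified against the file):

```python
import pandas as pd

df = pd.read_csv("eda/results_df.csv")

# Performance Gap Recovered (PGR), as defined in the weak-to-strong paper:
# the fraction of the gap between the weak supervisor and the strong
# ceiling that the transfer-trained student actually closes.
df["pgr"] = (df["transfer_acc"] - df["weak_acc"]) / (df["strong_acc"] - df["weak_acc"])

print(df.groupby(["weak_model", "strong_model"])["pgr"].mean())
```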