voidism / DoLa

Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"

Home Page: https://arxiv.org/abs/2309.03883

What tool do you use to get the token prediction of each layer of large language models for Figure 2?

frankdarkluo opened this issue

Hi @frankdarkluo

I simply used matplotlib to make the table for Figure 2!

Thanks for the reply! But I am not asking about the plotting itself. I am curious how you obtain the probability distribution from the intermediate (not output) layers. Thanks.

I just inserted some code into the transformers package (modeling_llama.py and generation/utils.py) to collect the predictions along the decoding steps. It makes the code ugly, but it works. I didn't use any dedicated tools for that.
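For readers who land here: the general idea the answer describes (projecting each intermediate layer's hidden state through the model's output head to get a per-layer next-token distribution) can be sketched without patching transformers at all. Below is a minimal NumPy sketch of that projection step; the random `hidden_states` and `unembed` arrays stand in for a real model's tensors, and `layerwise_next_token_probs` is a hypothetical helper name, not part of any library.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def layerwise_next_token_probs(hidden_states, unembed):
    """Project each layer's hidden state (for the last position) through the
    shared unembedding matrix (the model's lm_head) to get a next-token
    distribution per layer.

    hidden_states: list of (hidden_dim,) arrays, one per layer.
    unembed: (vocab_size, hidden_dim) matrix.
    Returns: (num_layers, vocab_size) array of probabilities.
    """
    logits = np.stack([unembed @ h for h in hidden_states])
    return softmax(logits)

# Demo with random tensors in place of a real model.
rng = np.random.default_rng(0)
num_layers, hidden_dim, vocab = 4, 8, 10
hidden_states = [rng.standard_normal(hidden_dim) for _ in range(num_layers)]
unembed = rng.standard_normal((vocab, hidden_dim))

probs = layerwise_next_token_probs(hidden_states, unembed)
top_token_per_layer = probs.argmax(axis=-1)  # token id each layer would predict
```

With an actual model, recent versions of transformers expose all layers' hidden states via `model(..., output_hidden_states=True)`, which can then be passed through the model's final norm and `lm_head` in the same way, so the patching described above may no longer be necessary.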