THUDM / GLM

GLM (General Language Model)

parameter SCB

zhaoqf123 opened this issue · comments

I notice that in every layer there is a parameter named SCB in the self-attention part, along with the weight and bias, for example:

transformer.layers.0.attention.query_key_value.weight 	 torch.Size([12288, 4096])
transformer.layers.0.attention.query_key_value.bias 	 torch.Size([12288])
transformer.layers.0.attention.query_key_value.SCB 	 torch.Size([12288])

May I know what the SCB parameter means?

It comes from the bitsandbytes library, which adds it when you load a model in 8-bit. The same question was asked there: TimDettmers/bitsandbytes#411.
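A minimal sketch of how this shows up, assuming transformers and bitsandbytes are installed; the checkpoint name below is only an illustrative placeholder, not taken from this issue:

import torch
from transformers import AutoModel

# Assumption: loading with load_in_8bit=True routes linear layers through
# bitsandbytes' 8-bit linear module, which saves an extra quantization-state
# tensor named "SCB" next to each quantized weight in the state dict.
model = AutoModel.from_pretrained(
    "some-org/some-glm-checkpoint",  # hypothetical model name for illustration
    trust_remote_code=True,
    load_in_8bit=True,
    device_map="auto",
)

# Inspect one attention projection: weight, bias, and SCB should all appear,
# similar to the sizes listed in the question above.
for name, tensor in model.state_dict().items():
    if "attention.query_key_value" in name:
        print(name, "\t", tuple(tensor.shape))

The SCB entries are not present when the same model is loaded in full or half precision; they only appear in the 8-bit path, which is why the question was redirected to the bitsandbytes issue tracker.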