parameter SCB
zhaoqf123 opened this issue
Qifang Zhao commented
I notice that in every layer there is a parameter SCB in the self-attention part, alongside weight and bias. For example:
transformer.layers.0.attention.query_key_value.weight torch.Size([12288, 4096])
transformer.layers.0.attention.query_key_value.bias torch.Size([12288])
transformer.layers.0.attention.query_key_value.SCB torch.Size([12288])
May I know what's the meaning of the SCB parameter?
Alain Le Noac'h commented
It comes from the bitsandbytes library and appears when you load a model in 8-bit. The same question was asked there: TimDettmers/bitsandbytes#411.
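To make the shapes above concrete, here is a minimal sketch of row-wise absmax int8 quantization. It assumes SCB holds one scaling constant per output row of the quantized weight, which is my understanding of bitsandbytes and would explain why SCB has size [12288] for a [12288, 4096] weight. The linked issue remains the authoritative answer. The helper names below are hypothetical and are not bitsandbytes functions.

```python
# Illustrative only: row-wise absmax int8 quantization in plain Python.
# quantize_rowwise / dequantize_rowwise are hypothetical helpers,
# not part of the bitsandbytes API.

def quantize_rowwise(weight):
    """Return (cb, scb): int8 codes per row and one absmax scale per row."""
    cb, scb = [], []
    for row in weight:
        # Per-row absolute maximum; fall back to 1.0 for an all-zero row.
        scale = max(abs(v) for v in row) or 1.0
        scb.append(scale)
        # Map each value into the int8 range [-127, 127].
        cb.append([round(v / scale * 127) for v in row])
    return cb, scb


def dequantize_rowwise(cb, scb):
    """Approximately recover the original weights from codes and scales."""
    return [[c * s / 127 for c in row] for row, s in zip(cb, scb)]


# A tiny 2 x 3 "weight" standing in for the real [12288, 4096] matrix.
weight = [[0.5, -1.0, 0.25], [2.0, 0.1, -0.4]]
cb, scb = quantize_rowwise(weight)
approx = dequantize_rowwise(cb, scb)
```

In this sketch scb has one entry per row of weight, just as SCB has one entry per output feature in the listing above, and approx matches weight up to rounding error.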