[Question] Does FBGEMM support packing B matrix offline now?
umiswing opened this issue · comments
Hello! I'm trying to accelerate the matrix multiplication in my own project with FBGEMM. However, I found that the matrix multiplication got slower after using FBGEMM, and I think it's caused by repacking the weight matrix (B matrix) on every call.
Your blog says that the cost of repacking can be avoided by prepacking the B matrix, but I did not find an example showing how to pack the B matrix offline and reuse the prepacked result. I found #427, which provides an unofficial version of FBGEMM, but it doesn't work in my project.
Could you please tell me whether FBGEMM now supports packing the B matrix offline and reusing the prepacked matrix? And is there any example usage I can refer to?
I will be very grateful for your help.
@ngsiming packing weight matrix B and reusing it is indeed the use pattern FBGEMM is designed for. See https://github.com/pytorch/FBGEMM/blob/master/bench/PackedRequantizeAcc32Benchmark.cc#L214 for how packed B matrix is used multiple times.
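For reference, here is a rough, untested sketch of the pack-once/reuse pattern as it appears in the linked benchmark. The matrix sizes, variable names, and requantization parameter values below are illustrative placeholders, not values from the benchmark; check the actual constructor arguments against the FBGEMM headers for your version.

```cpp
#include <cstdint>
#include <vector>
#include "fbgemm/Fbgemm.h"

using namespace fbgemm;

void packed_gemm_example() {
  // Illustrative sizes: C(m x n) = A(m x k) * B(k x n).
  int m = 64, n = 256, k = 512;
  std::vector<std::uint8_t> Aint8(m * k);
  std::vector<std::int8_t> Bint8(k * n);
  std::vector<std::int32_t> Cint32(m * n);  // intermediate accumulation buffer
  std::vector<std::uint8_t> Cint8(m * n);   // requantized output

  // Pack the weight matrix B ONCE, offline. The PackBMatrix object copies
  // B into its own internal layout, so it can be reused across many calls.
  PackBMatrix<std::int8_t> packedB(
      matrix_op_t::NoTranspose, k, n, Bint8.data(), /*ld=*/n);

  // Requantization parameters (illustrative values).
  float C_multiplier = 0.001f;
  std::int32_t C_zero_point = 0, A_zero_point = 0, B_zero_point = 0;
  std::vector<std::int32_t> col_offsets(n, 0);
  std::vector<std::int32_t> row_offset_buf(
      PackAWithRowOffset<std::uint8_t>::rowOffsetBufferSize());

  // In the hot loop: pack only the activation matrix A and reuse packedB.
  for (int iter = 0; iter < 100; ++iter) {
    PackAWithRowOffset<std::uint8_t> packA(
        matrix_op_t::NoTranspose, m, k, Aint8.data(), /*ld=*/k,
        /*pmat=*/nullptr, /*groups=*/1, row_offset_buf.data());

    DoNothing<> doNothingObj{};
    ReQuantizeOutput<false> outputProcObj(
        doNothingObj, &C_multiplier, C_zero_point, A_zero_point,
        &B_zero_point, packA.getRowOffsetBuffer(), col_offsets.data(),
        /*bias=*/nullptr, n);

    fbgemmPacked(packA, packedB, Cint8.data(), Cint32.data(), /*ldc=*/n,
                 outputProcObj, /*thread_id=*/0, /*num_threads=*/1);
  }
}
```

The key point is simply that `packedB` is constructed outside the loop; only the A matrix is packed per call.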
@jspark1105 Thanks for your help! I think I now understand how to reuse the packed matrix. All I need to do is create a PackBMatrix object once and reuse it in my code, is that right? I actually tried it this way before but got some errors. Now I think they may have been caused by mistakes in my own code.