mit-han-lab / torchsparse

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

Home Page: https://torchsparse.mit.edu

Can sparse convolution benefit from the pre-existing weights of dense convolution?

ZzTodd22 opened this issue

Thank you for sharing your work! I have a question: I converted an identical model from dense convolution to sparse convolution and carefully loaded the weights of every layer after the conversion, but I have not yet observed a significant improvement. Could the weights of the original dense convolution still influence the performance of the sparse convolution? In other words, can sparse convolution benefit from pre-existing dense convolution weights?

Hi @ZzTodd22! Thanks for your interest in TorchSparse! This is a very good question!

We do have some unit tests demonstrating that sparse convolution can be mapped to dense convolution in a layer-wise comparison. However, transferring dense pre-trained weights to a sparse model may be more challenging, since the properties of sparse workloads can be very different. While the question is really interesting, we think the answer remains open for further exploration.
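
For reference, a minimal sketch of such a layer-wise weight transfer is shown below. The parameter name `kernel`, its `(kernel_volume, in_channels, out_channels)` layout, and the kernel-offset ordering are assumptions that can vary across TorchSparse versions, so please verify them against the unit tests mentioned above before relying on this.

```python
import torch
import torch.nn as nn
import torchsparse.nn as spnn


def copy_dense_conv3d_to_sparse(dense: nn.Conv3d, sparse: spnn.Conv3d) -> None:
    """Copy weights from a dense nn.Conv3d into a TorchSparse Conv3d.

    Assumes the sparse layer stores its weights in a parameter named
    ``kernel`` with shape (kernel_volume, in_channels, out_channels).
    Both the attribute name and the kernel-offset ordering are
    version-dependent assumptions; check them against the library's
    dense-vs-sparse unit tests.
    """
    w = dense.weight.data                      # (out, in, kD, kH, kW)
    out_c, in_c = w.shape[0], w.shape[1]
    # Flatten the spatial kernel and move it to the leading dimension:
    # (out, in, kD, kH, kW) -> (kD * kH * kW, in, out).
    # (Some versions store 1x1x1 kernels as (in, out); handle that case
    # separately if needed.)
    w = w.reshape(out_c, in_c, -1).permute(2, 1, 0).contiguous()
    with torch.no_grad():
        sparse.kernel.copy_(w)                 # assumed parameter name
        if dense.bias is not None and getattr(sparse, "bias", None) is not None:
            sparse.bias.copy_(dense.bias.data)
```

Note that even if each layer maps exactly, end-to-end behavior may still differ, because the sparse model only convolves active (non-empty) sites, so activation statistics can deviate from what the dense weights were trained on; some fine-tuning after the transfer is usually advisable.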

Best regards,
Shang