pytorch / FBGEMM

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

[FBGEMM_GPU Question] When should I use FusedEmbeddingBagCollection over EmbeddingBagCollection?

JacoCheung opened this issue

There is a benchmark in the torchrec repo that compares EmbeddingBagCollection backed by nn.EmbeddingBag with FusedEmbeddingBagCollection backed by FBGEMM. However, AFAIK, there should also be a non-fused EmbeddingBagCollection backed by FBGEMM.

To my knowledge, there should be three variants (a rough construction sketch follows this list):

  • EmbeddingBagCollection backed by nn.EmbeddingBag
  • EmbeddingBagCollection backed by FBGEMM (the underlying implementation should be ShardedEmbeddingBagCollection)
  • FusedEmbeddingBagCollection backed by FBGEMM (it can also be sharded, but its sharder is not used in the torchrec benchmark)
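
For concreteness, here is a minimal sketch of how I understand the two module-level constructions. The FusedEmbeddingBagCollection import path and keyword arguments are my assumption based on the torchrec benchmark and may differ across versions:

```python
import torch
from torchrec import EmbeddingBagCollection, EmbeddingBagConfig

# One small table shared by both variants.
tables = [
    EmbeddingBagConfig(
        name="t1",
        embedding_dim=64,
        num_embeddings=10_000,
        feature_names=["f1"],
    )
]

# Variant 1: plain EmbeddingBagCollection -- backed by nn.EmbeddingBag
# until it is sharded.
ebc = EmbeddingBagCollection(tables=tables, device=torch.device("meta"))

# Variant 3: FusedEmbeddingBagCollection -- FBGEMM table-batched embeddings
# with the optimizer fused into the backward pass. Import path and signature
# are my assumption from the torchrec benchmark.
from torchrec.modules.fused_embedding_modules import FusedEmbeddingBagCollection

fused_ebc = FusedEmbeddingBagCollection(
    tables=tables,
    optimizer_type=torch.optim.SGD,
    optimizer_kwargs={"lr": 0.02},
    device=torch.device("cuda"),
)
```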

I have a few questions:

  1. To leverage the FBGEMM kernels, EmbeddingBagCollection must be used together with a sharder, right? (See the sketch below this list for what I mean.)
  2. In the DLRM repo, EmbeddingBagCollection rather than FusedEmbeddingBagCollection is used, so I wonder when I should use FusedEmbeddingBagCollection.
  3. I'm curious whether there are any performance results for a non-fused EmbeddingBagCollection backed by FBGEMM.
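
To illustrate what I mean in question 1, here is a rough sketch of the sharding path as I understand it: a plain EmbeddingBagCollection only ends up on FBGEMM kernels (as ShardedEmbeddingBagCollection) after it is wrapped by the sharder. The process-group initialization and planner configuration are omitted; this is an assumption on my part, not a confirmed recipe:

```python
import torch
from torchrec import EmbeddingBagCollection, EmbeddingBagConfig
from torchrec.distributed.model_parallel import DistributedModelParallel

# Unsharded module: parameters live in nn.EmbeddingBag at this point.
ebc = EmbeddingBagCollection(
    tables=[
        EmbeddingBagConfig(
            name="t1",
            embedding_dim=64,
            num_embeddings=10_000,
            feature_names=["f1"],
        )
    ],
    device=torch.device("meta"),
)

# Sharding (inside an initialized torch.distributed process group) replaces the
# module with ShardedEmbeddingBagCollection, which I believe dispatches lookups
# to the FBGEMM split-table-batched-embedding kernels.
model = DistributedModelParallel(module=ebc, device=torch.device("cuda"))
```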

Thanks!