pytorch / FBGEMM

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/


uvm cache

MichoChan opened this issue

Why does the UVM cache in FBGEMM use its own LRU/LFU?
As far as I know, UVM relies on page faults, so doesn't the OS already provide a page replacement algorithm like LRU/LFU?

commented

Hi @MichoChan, FBGEMM-GPU's table batched embedding (TBE) supports four table placements (see the sketch after the list):

  1. GPU's HBM memory (DEVICE)
  2. GPU's UVM memory (MANAGED)
  3. GPU's UVM memory with software managed cache (MANAGED_CACHING)
  4. Host's memory (HOST)
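As a minimal sketch, the placement is chosen per table via the `EmbeddingLocation` enum when constructing the TBE module. The module path below is from the `fbgemm_gpu` Python package; it may differ across releases, so treat this as illustrative rather than authoritative:

```python
from fbgemm_gpu.split_table_batched_embeddings_ops import (
    ComputeDevice,
    EmbeddingLocation,
    SplitTableBatchedEmbeddingBagsCodegen,
)

# EmbeddingLocation.DEVICE          -> 1. GPU's HBM memory
# EmbeddingLocation.MANAGED         -> 2. GPU's UVM memory
# EmbeddingLocation.MANAGED_CACHING -> 3. UVM + software-managed HBM cache
# EmbeddingLocation.HOST            -> 4. Host's memory

emb = SplitTableBatchedEmbeddingBagsCodegen(
    embedding_specs=[
        # (num_embeddings, embedding_dim, placement, compute device), one per table
        (1_000_000, 128, EmbeddingLocation.DEVICE, ComputeDevice.CUDA),
        (50_000_000, 128, EmbeddingLocation.MANAGED_CACHING, ComputeDevice.CUDA),
    ],
)
```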

What you are referring to is Option 3, MANAGED_CACHING. The full embedding table is stored in UVM, while the cache holds only a subset of the table's rows and resides in HBM. Rows are staged from UVM into the cache (HBM) in every iteration via the prefetch function, and LRU/LFU is the policy used to pick which cached rows to evict. The reason for not relying on the OS is that UVM demand paging would stall the lookup kernels on page faults, whereas TBE knows exactly which rows the next batch needs and can stage them into HBM ahead of time, so it manages replacement itself with LRU/LFU.
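Here is a rough sketch of that per-iteration flow. The `cache_algorithm` argument and `prefetch` method are from the `fbgemm_gpu` Python API, but the module path and defaults can vary between releases, and the tensor shapes here are only illustrative:

```python
import torch
from fbgemm_gpu.split_table_batched_embeddings_ops import (
    CacheAlgorithm,
    ComputeDevice,
    EmbeddingLocation,
    SplitTableBatchedEmbeddingBagsCodegen,
)

emb = SplitTableBatchedEmbeddingBagsCodegen(
    embedding_specs=[
        (50_000_000, 128, EmbeddingLocation.MANAGED_CACHING, ComputeDevice.CUDA),
    ],
    cache_algorithm=CacheAlgorithm.LRU,  # eviction policy: LRU or LFU
)

# Bag-format lookup: 4 indices grouped into 2 bags by offsets.
indices = torch.tensor([1, 42, 7, 42], dtype=torch.long, device="cuda")
offsets = torch.tensor([0, 2, 4], dtype=torch.long, device="cuda")

emb.prefetch(indices, offsets)  # stage needed rows UVM -> HBM cache, evicting per policy
out = emb(indices, offsets)     # lookups are now served from the HBM cache
```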

For more information, please refer to https://arxiv.org/pdf/2010.11305.pdf

Hope this helps.