JiakunYan / CPPuddle

Utility library to handle small, reusable pools of both device memory buffers (via allocators) and device executors (with multiple scheduling policies).





This repository was initially created to explore how best to use HPX and Kokkos together. For fine-grained GPU tasks, we needed a way to avoid two costs: excessive allocation of single-use GPU buffers (allocations block the device for all streams) and repeated creation/deletion of GPU executors (these are usually tied to a stream, which is expensive to create as well).

We currently test and use CPPuddle in Octo-Tiger, together with HPX-Kokkos. In this use case, allocating GPU buffers for all sub-grids in advance would have wasted a lot of memory. Unified memory, on the other hand, would have caused unnecessary GPU-to-CPU page migrations (the old input data gets overwritten anyway), and allocating buffers on the fly would have blocked the device. Hence, we are currently testing this buffer management solution.
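The buffer-recycling idea described above can be illustrated with a minimal, host-only sketch. It is not the CPPuddle API: the names `recycling_pool`, `acquire`, `release`, and `run_tasks` are hypothetical, and `std::malloc` merely stands in for an expensive device allocation. The sketch is also not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

// Hypothetical illustration, NOT the CPPuddle API: a pool that keeps released
// buffers around and hands them out again for later requests of the same size.
class recycling_pool {
public:
  recycling_pool() = default;
  recycling_pool(const recycling_pool&) = delete;
  recycling_pool& operator=(const recycling_pool&) = delete;

  // Return a buffer of `bytes` bytes, reusing a cached one if available.
  void* acquire(std::size_t bytes) {
    auto& cached = free_lists_[bytes];
    if (!cached.empty()) {
      void* p = cached.back();
      cached.pop_back();
      ++reused_;
      return p;
    }
    ++allocated_;
    return std::malloc(bytes);  // stand-in for an expensive device allocation
  }

  // Hand a buffer back to the pool instead of freeing it.
  void release(void* p, std::size_t bytes) { free_lists_[bytes].push_back(p); }

  // Cached buffers are only freed when the pool itself is destroyed.
  ~recycling_pool() {
    for (auto& entry : free_lists_)
      for (void* p : entry.second) std::free(p);
  }

  std::size_t allocated() const { return allocated_; }
  std::size_t reused() const { return reused_; }

private:
  // One free list per buffer size; every buffer is its own allocation.
  std::map<std::size_t, std::vector<void*>> free_lists_;
  std::size_t allocated_ = 0;
  std::size_t reused_ = 0;
};

// Simulates many short-lived tasks that each need one temporary buffer.
// Returns the number of real allocations the pool has performed so far.
std::size_t run_tasks(recycling_pool& pool, int tasks, std::size_t bytes) {
  for (int i = 0; i < tasks; ++i) {
    void* buf = pool.acquire(bytes);
    assert(buf != nullptr);
    pool.release(buf, bytes);
  }
  return pool.allocated();
}
```

In this sketch, running many sequential tasks that request the same buffer size triggers only one real allocation; all later requests are served from the free list, which is the effect the allocators in this repository aim for on the device.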

Tools provided by this repository

  • Allocators that reuse previously allocated buffers if available (works with normal heap memory, pinned memory, aligned memory, CUDA/HIP device memory, and Kokkos Views). Note that separate buffers do not share a single chunk of contiguous memory; each buffer uses its own allocation.
  • Executor pools and various scheduling policies (round robin, priority queue, multi-GPU), which rely on reference counting to gauge the current load of an executor instead of querying the device itself. Tested with CUDA, HIP and Kokkos executors provided by HPX / HPX-Kokkos.
  • Special executors/allocators for on-the-fly aggregation of GPU work (using HPX).
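To make the reference-counting idea behind the executor pools more concrete, here is a minimal sketch under stated assumptions. It is not the CPPuddle API: `executor_pool`, `executor_handle`, and their member functions are hypothetical names, executors are reduced to plain slot indices, and the "least loaded" selection is a simplified stand-in for a priority-based policy. The sketch is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration, NOT the CPPuddle API: a pool of executor slots
// whose load is tracked by counting live handles, never by asking the device.
class executor_pool {
public:
  explicit executor_pool(std::size_t n) : ref_counts_(n, 0) { assert(n > 0); }

  // Round robin: cycle through the slots regardless of their load.
  std::size_t next_round_robin() {
    std::size_t slot = cursor_;
    cursor_ = (cursor_ + 1) % ref_counts_.size();
    return slot;
  }

  // Load-based choice: the slot with the fewest live handles (lowest index wins ties).
  std::size_t least_loaded() const {
    std::size_t best = 0;
    for (std::size_t i = 1; i < ref_counts_.size(); ++i)
      if (ref_counts_[i] < ref_counts_[best]) best = i;
    return best;
  }

  void add_ref(std::size_t slot) { ++ref_counts_[slot]; }
  void remove_ref(std::size_t slot) {
    assert(ref_counts_[slot] > 0);
    --ref_counts_[slot];
  }
  std::size_t load(std::size_t slot) const { return ref_counts_[slot]; }

private:
  std::vector<std::size_t> ref_counts_;  // live handles per slot
  std::size_t cursor_ = 0;               // next slot for round robin
};

// RAII handle: holding it marks the slot as busy, destroying it frees the slot.
class executor_handle {
public:
  executor_handle(executor_pool& pool, std::size_t slot)
      : pool_(pool), slot_(slot) {
    pool_.add_ref(slot_);
  }
  ~executor_handle() { pool_.remove_ref(slot_); }
  executor_handle(const executor_handle&) = delete;
  executor_handle& operator=(const executor_handle&) = delete;

  std::size_t slot() const { return slot_; }

private:
  executor_pool& pool_;
  std::size_t slot_;
};
```

A task would create an `executor_handle` for the slot chosen by one of the policies and keep it alive for as long as it uses the executor. Because the load estimate comes from handle lifetimes alone, picking an executor stays a cheap host-side operation.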


Requirements

  • C++17
  • CMake (>= 3.16)
  • Optional (for the header-only utilities / tests): CUDA, Boost, HPX, Kokkos, HPX-Kokkos

The submodules can be used to obtain the optional dependencies, which are required for testing the header-only utilities. If these tests are not required, the submodules (and the respective build scripts in /scripts) can safely be ignored.

Build / Install

Basic build

  cmake -S /path/to/source -B /path/to/build -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/path/to/install/cppuddle -DCPPUDDLE_WITH_TESTS=OFF -DCPPUDDLE_WITH_COUNTERS=OFF
  cmake --build /path/to/build --target install

If installed correctly, CPPuddle can be used in other CMake-based projects via

find_package(CPPuddle REQUIRED)

Recommended build:

  cmake -S /path/to/source -B /path/to/build -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/path/to/install/cppuddle -DCPPUDDLE_WITH_HPX=ON -DCPPUDDLE_WITH_HPX_AWARE_ALLOCATORS=ON -DCPPUDDLE_WITH_TESTS=OFF -DCPPUDDLE_WITH_COUNTERS=OFF



License:Boost Software License 1.0


Languages: C++ 73.7%, CMake 18.9%, Shell 5.3%, Cuda 2.1%