ParCoreLab's repositories
CPU-Free-model
Source code for the CPU-Free model - a fully autonomous execution model for multi-GPU applications that completely excludes the involvement of the CPU beyond the initial kernel launch.
ReuseTracker
A fast and accurate reuse distance analyzer for multi-threaded applications. It leverages existing hardware features in commodity CPUs.
mixed-and-multi-spmv
Mixed and Multi-Precision SpMV for GPUs with Row-wise Precision Selection.
SpTRSV_Framework
An automated framework that predicts the fastest sparse triangular solve (SpTRSV) algorithm for a given input sparse matrix on a CPU-GPU platform.
Split_SpTRSV
The split execution framework automatically determines whether an SpTRSV is suitable for split execution, finds the appropriate split point, and executes the SpTRSV in a split fashion using two SpTRSV algorithms, managing any required inter-platform communication along the way. It is implemented as a C++/CUDA library supporting multiple CPU-GPU algorithms.
BeyondMoore
BeyondMoore has an ambitious goal to develop a software framework that performs static and dynamic optimizations, issues accelerator-initiated data transfers, and reasons about parallel execution strategies that exploit both processor and memory heterogeneity.
gpu-fusion
GPU fusion code and algorithm
accuracy-verification-microbenchmarks
Microbenchmarks used to verify the accuracy of ComDetective.
CPU-Free-Model-Compiler
DaCe - Data Centric Parallel Programming
hpctoolkit-externals
HPCToolkit performance tools: essential third party libraries for hpctoolkit
AMD_IBS_Toolkit
AMD Research Instruction Based Sampling Toolkit
hpctoolkit
HPCToolkit performance tools: measurement and analysis components
snoopie-ucx-tracking-ucx
Modified UCX library that tracks communications
splash2
SPLASH-2 benchmarks
Uniconn
Uniconn is a unified, portable high-level C++ communication library that supports both point-to-point and collective operations across GPU clusters. Uniconn enables seamless switching between backends and APIs (host or device) with minimal or no changes to application code.