ParCoreLab

ParCoreLab's repositories

Snoopie

Multi-GPU communication profiler and visualizer

Language: C; License: NOASSERTION; 32 4 5

ComScribe

ComScribe is a tool to identify communication among all GPU-GPU and CPU-GPU pairs in a single-node multi-GPU system.

Language: C++; License: BSD-3-Clause; 2601

CPU-Free-model

Source code for the CPU-Free model, a fully autonomous execution model for multi-GPU applications that completely excludes the involvement of the CPU beyond the initial kernel launch.

Language: Cuda; License: MIT; 21 3 4

ReuseTracker

A fast and accurate reuse distance analyzer for multi-threaded applications. It leverages existing hardware features in commodity CPUs.

Language: Shell; 20 3 3
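
For readers unfamiliar with the term, the reuse distance of a memory access is the number of distinct addresses touched since the previous access to the same address. The sketch below computes it naively in software for a short trace. It only illustrates the definition and is not how ReuseTracker works, since the tool relies on hardware features; the function name `reuse_distances` and the trace format are assumptions made for this example.

```python
from collections import OrderedDict


def reuse_distances(trace):
    """Return the reuse distance of each access in `trace`.

    First-time accesses have no previous use, so they are reported as None.
    """
    # Addresses ordered from least recently used to most recently used.
    stack = OrderedDict()
    distances = []
    for addr in trace:
        if addr in stack:
            keys = list(stack)
            # Distinct addresses used more recently than `addr`.
            distances.append(len(keys) - 1 - keys.index(addr))
            stack.move_to_end(addr)
        else:
            distances.append(None)
            stack[addr] = True
    return distances
```

Each lookup here scans the whole stack, so this version is quadratic in the worst case and suits only small traces.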

mixed-and-multi-spmv

Mixed and Multi-Precision SpMV for GPUs with Row-wise Precision Selection.

Language: Cuda; License: MIT; 6 5 2

SpTRSV_Framework

An automated framework that predicts the fastest sparse triangular solve (SpTRSV) algorithm for a given input sparse matrix on a CPU-GPU platform.

Language: C++; License: NOASSERTION; 600

The split execution framework can automatically determine the suitability of an SpTRSV for split-execution, find the appropriate split point, and execute SpTRSV in a split fashion using two SpTRSV algorithms while automatically managing any required inter-platform communication. The model is implemented as a C++/CUDA library supporting multiple CPU-GPU algorithms.

Language: C++; License: NOASSERTION; 400

BeyondMoore

BeyondMoore aims to develop a software framework that performs static and dynamic optimizations, issues accelerator-initiated data transfers, and reasons about parallel execution strategies that exploit both processor and memory heterogeneity.

2 20

ParCoreTools

Language: C++; 2 30

PES-artifact

Language: C; 2 30

aCG

GPU-accelerated linear solvers based on the conjugate gradient (CG) method, supporting NVIDIA and AMD GPUs with GPU-aware MPI, NCCL, RCCL, or NVSHMEM.

Language: C; License: MIT; 100
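
As background on the method named above, the following is a minimal single-process sketch of the conjugate gradient iteration for a small dense symmetric positive-definite system. It is not taken from aCG and involves no GPU or communication library; the function name `conjugate_gradient` and the helpers `dot` and `matvec` are hypothetical names chosen for this example.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, v):
    """Dense matrix-vector product, with A given as a list of rows."""
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, which equals b when x = 0
    p = list(r)          # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break        # residual norm is small enough
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)          # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old               # keeps directions A-conjugate
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x
```

For example, `conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])` approximates the exact solution `[1/11, 7/11]`.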

gpu-fusion

GPU fusion code and algorithm

Language: Cuda; License: MIT; 1 10

pardnn

Language: C++; 1 10

.github

Homepage README.

010

accuracy-verification-microbenchmarks

Microbenchmarks used to verify the accuracy of ComDetective.

Language: Makefile; 010

cha-aware-result-parser

Language: C++; 030

CPU-Free-Model-Compiler

DaCe - Data Centric Parallel Programming

Language: Python; License: BSD-3-Clause; 000

hpctoolkit-externals

HPCToolkit performance tools: essential third party libraries for hpctoolkit

Language: Shell; License: NOASSERTION; 000

parcorelab.github.io

Language: TypeScript; 020

pes-benchs

Language: C; 000

AMD_IBS_Toolkit

AMD Research Instruction Based Sampling Toolkit

Language: C; 010

barnes

Language: C; 020

gpucommanalyzer

Language: C; License: NOASSERTION; 020

hpctoolkit

HPCToolkit performance tools: measurement and analysis components

Language: C++; 000

snoopie-ucx-tracking-ucx

Modified ucx library to track communications

Language: C; License: NOASSERTION; 000

snoopie-visualiser

03 2

splash2

Splash 2 Benchmarks