There are 58 repositories under the high-performance-computing topic.
High-performance TensorFlow library for quantitative finance.
The fastest and most memory-efficient lattice Boltzmann CFD software, running on all GPUs via OpenCL.
Training and serving large-scale neural networks with auto parallelization.
A list of awesome compiler projects and papers for tensor computation and deep learning.
Acceleration package for neural networks on multi-core CPUs
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, CUDA and OpenCL backends
Implementation of SYCL and C++ standard parallelism for CPUs and GPUs from all vendors: The independent, community-driven compiler for C++-based heterogeneous programming models. Lets applications adapt themselves to all the hardware in the system - even at runtime!
Fast Clojure Matrix Library
Linear algebra in TypeScript.
OpenMC Monte Carlo Code
A series on GPU optimization that explains in detail how to optimize CUDA kernels. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm. The performance of these kernels is at or near the theoretical limit.
100% Vanilla JavaScript Multithreading & Parallel Execution Library
A modern, fast, lightweight thread pool library based on C++20
Graphics Processing Units Molecular Dynamics
GraphIt - A High-Performance Domain Specific Language for Graph Analytics