Hengjie Wang's repositories
cutlass
CUDA Templates for Linear Algebra Subroutines
Language: C++
SGEMM_CUDA
Fast CUDA matrix multiplication from scratch
tensorflow-dlrm
TF2 implementation of DLRM (inherited and modified from openrec's initial implementation)
Language: Python
MFIX-Exa
MFiX-Exa: A multi-phase flow simulation tool based on MFiX-Classic, incorporating the massively parallel, block-structured adaptive mesh refinement (AMR) functionality of AMReX.
Language: Shell
amrex
AMReX: Software Framework for Block Structured AMR
Language: C++
YHs_Sample
Fork of Yinghan's Code Sample
License: GPL-3.0
hypre
Parallel solvers for sparse linear systems featuring multigrid methods.
AMReX-Hydro
AMReX-based hydro routines
strsv
A simple, but fast, triangular solver