Jaemin Choi (minitu)

Company: @NVIDIA

Location: Santa Clara, California, United States

Home Page: https://www.linkedin.com/in/jaemincs/


Organizations
UIUC-PPL

Jaemin Choi's repositories

Language: Jupyter Notebook | License: MIT | Stargazers: 2 | Issues: 1 | Issues: 0

baseenv

A fork of Bill Gropp's baseenv (http://wgropp.cs.illinois.edu/projects/software/baseenv.htm)

Language: C | Stargazers: 1 | Issues: 1 | Issues: 0

charm

The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information.

Language: C++ | License: NOASSERTION | Stargazers: 1 | Issues: 0 | Issues: 0

charming

GPU-resident runtime system based on Charm++ principles

Language: Cuda | Stargazers: 1 | Issues: 0 | Issues: 0

hpm

A Heterogeneous Performance Modeling Framework (GPU + MPI)

apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch

Language: Python | License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0

buggy

A buddy allocator for GPU memory

Language: C++ | Stargazers: 0 | Issues: 1 | Issues: 0

changa

Mirror of UIUC/PPL version of ChaNGa

Language: C++ | License: GPL-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

codes

The Co-Design of Exascale Storage Architectures (CODES) simulation framework builds upon the ROSS parallel discrete event simulation engine to provide high-performance simulation utilities and models for building scalable distributed systems simulations

Language: C | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

dlrm

An implementation of a deep learning recommendation model (DLRM)

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

dumpi-cortex

A fork of https://xgitlab.cels.anl.gov/mdorier/dumpi-cortex

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

gpu

Contains pieces of GPU related research that are too small to warrant a separate repository.

Language: C | Stargazers: 0 | Issues: 0 | Issues: 0

gpuroofperf-toolkit

A GPU performance prediction toolkit for CUDA programs

Language: Cuda | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Cuda | Stargazers: 0 | Issues: 2 | Issues: 0

kokkos-tutorials

Tutorials for the Kokkos C++ Performance Portability Programming EcoSystem

Language: C++ | Stargazers: 0 | Issues: 0 | Issues: 0

Megatron-LM

Ongoing research training transformer models at scale

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

miniFE

MiniFE Finite Element Mini-Application

Language: C++ | License: LGPL-3.0 | Stargazers: 0 | Issues: 1 | Issues: 0

miniMD

MiniMD Molecular Dynamics Mini-App

Language: C++ | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

multi-gpu-programming-models

Examples demonstrating available options to program multiple GPUs in a single node or a cluster

Language: Cuda | License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0

NeMo

NeMo: a toolkit for conversational AI

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

ompi

Open MPI main development repository

Language: C | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

sst-dumpi

SST DUMPI Trace Library

Language: C | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

sw4lite

Testing numerical kernels in SW4

Language: C | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

TraceR

Trace Replay and Network Simulation Framework

Language: C | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

training

Reference implementations of MLPerf™ training benchmarks

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

triton

Development repository for the Triton language and compiler

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0