Sungjae Lee's starred repositories

msccl

Microsoft Collective Communication Library

Language: C++ | License: NOASSERTION | Stargazers: 313 | Issues: 0

nccl

Optimized primitives for collective multi-GPU communication

Language: C++ | License: NOASSERTION | Stargazers: 3201 | Issues: 0

hfcxx

Hartree-Fock C++ code

Language: C++ | License: GPL-3.0 | Stargazers: 28 | Issues: 0

Efficient-LLMs-Survey

[TMLR 2024] Efficient Large Language Models: A Survey

Stargazers: 1000 | Issues: 0

juga

Juga is a stock price data fetcher

Language: Python | License: MIT | Stargazers: 5 | Issues: 0

dftcxx

C++ based DFT program for educational purposes

Language: C++ | License: GPL-3.0 | Stargazers: 55 | Issues: 0

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 8506 | Issues: 0

mlx

MLX: An array framework for Apple silicon

Language: C++ | License: MIT | Stargazers: 16903 | Issues: 0

deepmd-kit

A deep learning package for many-body potential energy representation and molecular dynamics

Language: C++ | License: LGPL-3.0 | Stargazers: 1479 | Issues: 0

deepchem

Democratizing Deep-Learning for Drug Discovery, Quantum Chemistry, Materials Science and Biology

Language: Python | License: MIT | Stargazers: 5471 | Issues: 0

MegaMolBART

A deep learning model for small molecule drug discovery and cheminformatics based on SMILES

Language: Python | Stargazers: 145 | Issues: 0

triton

Development repository for the Triton language and compiler

Language: C++ | License: MIT | Stargazers: 13202 | Issues: 0

DL_Compiler

Study Group of Deep Learning Compiler

Stargazers: 152 | Issues: 0

cub

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

Language: Cuda | License: BSD-3-Clause | Stargazers: 1680 | Issues: 0

DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Language: Python | License: Apache-2.0 | Stargazers: 1877 | Issues: 0

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | Stargazers: 134078 | Issues: 0

FasterTransformer

Transformer related optimization, including BERT, GPT

Language: C++ | License: Apache-2.0 | Stargazers: 5850 | Issues: 0

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | Stargazers: 35244 | Issues: 0

metaseq

Repo for external large-scale work

Language: Python | License: MIT | Stargazers: 6507 | Issues: 0

grpc-gateway

gRPC to JSON proxy generator following the gRPC HTTP spec

Language: Go | License: BSD-3-Clause | Stargazers: 18160 | Issues: 0

backend.ai

Backend.AI is a streamlined, container-based computing cluster platform that hosts popular computing/ML frameworks and diverse programming languages, with pluggable heterogeneous accelerator support including CUDA GPU, ROCm GPU, TPU, IPU and other NPUs.

Language: Python | License: LGPL-3.0 | Stargazers: 513 | Issues: 0

marss.dramsim

A branch of marss with DRAMSim hooks

Language: C | Stargazers: 18 | Issues: 0

cpuminer

CPU miner for Litecoin and Bitcoin

Language: Assembly | License: NOASSERTION | Stargazers: 2781 | Issues: 0

CudaMiner

A CUDA-accelerated Litecoin mining application based on pooler's CPU miner

Language: C | License: NOASSERTION | Stargazers: 692 | Issues: 0

darknet

Convolutional Neural Networks

Language: C | License: NOASSERTION | Stargazers: 25796 | Issues: 0