Hongsun_Jang (hongsunjang)



Company: @AIS_SNU, SNU ECE

Location: Seoul, Republic of Korea

Home Page: https://aisys.snu.ac.kr/hongsun.html


Hongsun_Jang's starred repositories

finetune-gpt2xl

Guide: fine-tune GPT-2 XL (1.5 billion parameters) and GPT-Neo (2.7 billion) on a single GPU with Hugging Face Transformers using DeepSpeed

Language: Python | License: MIT | Stargazers: 427 | Issues: 0

gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"

Language: Python | License: NOASSERTION | Stargazers: 22146 | Issues: 0

metaseq

Repo for external large-scale work

Language: Python | License: MIT | Stargazers: 6439 | Issues: 0

Pretrained-Language-Model

Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

Language: Python | Stargazers: 3003 | Issues: 0

Ok-Topk

Ok-Topk is a scheme for distributed training with sparse gradients. It integrates a novel sparse allreduce algorithm (with less than 6k communication volume, which is asymptotically optimal) into the decentralized parallel Stochastic Gradient Descent (SGD) optimizer, and its convergence is proven both theoretically and empirically.

Language: Python | License: GPL-3.0 | Stargazers: 23 | Issues: 0
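The Ok-Topk description above builds on top-k gradient sparsification: each worker keeps only the k largest-magnitude gradient entries before communicating. The NumPy sketch below shows that basic building block only; it is not Ok-Topk's sparse allreduce, and the function name `topk_sparsify` is made up here for illustration.

```python
import numpy as np

def topk_sparsify(grad, k):
    """Keep the k largest-magnitude entries of a gradient tensor; zero the rest.

    Plain top-k sparsification -- the primitive that schemes like Ok-Topk
    refine with an efficient sparse allreduce. Illustrative only.
    """
    flat = grad.ravel()
    if k >= flat.size:
        return grad.copy()
    # Indices of the k entries with the largest absolute value
    # (argpartition does a partial sort, O(n) instead of O(n log n)).
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    sparse = np.zeros_like(flat)
    sparse[idx] = flat[idx]
    return sparse.reshape(grad.shape)

g = np.array([[0.1, -2.0, 0.3],
              [1.5, -0.2, 0.05]])
print(topk_sparsify(g, 2))  # only the -2.0 and 1.5 entries survive
```

In a real distributed setup, each worker would typically accumulate the dropped (zeroed) residual locally and add it back before the next top-k selection, so small gradients are not lost permanently.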

computer-science

:mortar_board: Path to a free self-taught education in Computer Science!

License: MIT | Stargazers: 167564 | Issues: 0

SimpleSSD-FullSystem

Open-Source Licensed Educational SSD Simulator for High-Performance Storage and Full-System Evaluations

Language: C++ | License: BSD-3-Clause | Stargazers: 87 | Issues: 0

torchrec

PyTorch domain library for recommendation systems

Language: Python | License: BSD-3-Clause | Stargazers: 1844 | Issues: 0

SQuant

SQuant [ICLR22]

Language: Python | Stargazers: 158 | Issues: 0

lora

Using low-rank adaptation to quickly fine-tune diffusion models.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 6904 | Issues: 0

LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Language: Python | License: MIT | Stargazers: 10129 | Issues: 0
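Both LoRA entries above rely on the same idea: freeze the pretrained weight W and learn only a low-rank update B·A (rank r much smaller than the layer dimensions). The NumPy sketch below illustrates that idea under stated assumptions; it is not the loralib API, and the names `lora_forward`, `A`, `B`, and `alpha` are chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r = 8, 8, 2  # r << d_in, d_out: the low-rank bottleneck

# Frozen pretrained weight (never updated during fine-tuning).
W = rng.normal(size=(d_out, d_in))

# LoRA adapters: B @ A is the trainable low-rank update (rank <= r).
# A starts small and random, B starts at zero, so at initialization
# the adapted model computes exactly the same function as the frozen one.
A = rng.normal(size=(r, d_in)) * 0.01
B = np.zeros((d_out, r))

def lora_forward(x, alpha=1.0):
    """y = W x + alpha * (B A) x -- only A and B would be trained."""
    return W @ x + alpha * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B = 0 the adapted output equals the frozen model's output.
assert np.allclose(lora_forward(x), W @ x)
```

The payoff is parameter count: training A and B touches r*(d_in + d_out) values instead of d_in*d_out, and after training the update B·A can be merged into W so inference costs nothing extra.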

open-llms

πŸ“‹ A list of open LLMs available for commercial use.

License: Apache-2.0 | Stargazers: 10804 | Issues: 0

improved-diffusion

Release for Improved Denoising Diffusion Probabilistic Models

Language: Python | License: MIT | Stargazers: 3100 | Issues: 0

ChatGPT-as-a-server

Using ChatGPT as a real backend

Language: Go | Stargazers: 6 | Issues: 0

lm-evaluation-harness

A framework for few-shot evaluation of language models.

Language: Python | License: MIT | Stargazers: 6195 | Issues: 0

KDiskMark

A simple open-source disk benchmark tool for Linux distros

Language: C++ | License: GPL-3.0 | Stargazers: 1032 | Issues: 0

CrystalDiskInfo

CrystalDiskInfo

Language: C++ | License: MIT | Stargazers: 1572 | Issues: 0

ssd-benchmark-rs

Super Simple Disk Benchmark - benchmarks the writing performance of your disk

Language: Rust | License: GPL-3.0 | Stargazers: 47 | Issues: 0

Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Language: Python | License: NOASSERTION | Stargazers: 1801 | Issues: 0

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | Stargazers: 34425 | Issues: 0

MQSim

MQSim is a fast and accurate simulator modeling the performance of modern multi-queue (MQ) SSDs as well as traditional SATA-based SSDs. MQSim faithfully models new high-bandwidth protocol implementations, steady-state SSD conditions, and the full end-to-end latency of requests in modern SSDs. It is described in detail in the FAST 2018 paper by Arash Tavakkol et al., "MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices" (https://people.inf.ethz.ch/omutlu/pub/MQSim-SSD-simulation-framework_fast18.pdf)

Language: C++ | License: MIT | Stargazers: 272 | Issues: 0

Vitis-Tutorials

Vitis In-Depth Tutorials

Language: C | License: MIT | Stargazers: 1171 | Issues: 0

GCoD

[HPCA 2022] GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm and Accelerator Co-Design

Language: Python | License: Apache-2.0 | Stargazers: 32 | Issues: 0

Literatures-on-GNN-Acceleration

A reading list for deep graph learning acceleration.

License: MIT | Stargazers: 214 | Issues: 0

GenStore

GenStore is the first in-storage processing system designed for genome sequence analysis. It greatly reduces both the data movement and computational overheads of genome sequence analysis by exploiting low-cost and accurate in-storage filters. Described in the ASPLOS 2022 paper by Mansouri Ghiasi et al. at https://people.inf.ethz.ch/omutlu/pub/GenStore_asplos22-arxiv.pdf

Language: C | License: MIT | Stargazers: 12 | Issues: 0

llama

Inference code for Llama models

Language: Python | License: NOASSERTION | Stargazers: 55178 | Issues: 0