LiuTaowen-Tony

Taowen (Tony)'s repositories

ICHaskellStyleGuide

A Haskell style guide that follows the conventions of Imperial College's 40009 Computing Practical.

Language: Haskell

bigscience

Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.

Language: Shell; License: NOASSERTION

bin

Language: Python

bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.

Language: Python; License: MIT

fstxt

Language: Java

circle-gym

Language: C

cpufp

A CPU tool for benchmarking peak floating-point performance

Language: Assembly; License: GPL-3.0

dotfiles

my dotfiles

Language: Shell

efficient-ml-research-ideas

FasterTransformer

Transformer related optimization, including BERT, GPT

License: Apache-2.0

file-transfer

flash-atten

Language: HTML; License: BSD-3-Clause

How_to_optimize_in_GPU

A series on GPU optimization that explains in detail how to optimize CUDA kernels. It covers several basic kernels, including elementwise, reduce, sgemv, and sgemm, whose performance is at or near the theoretical limit.

License: Apache-2.0

learn-QBAF

Language: Python; License: MIT

learn_triton

Language: Python

LLM_Tree_Search

The official implementation of the paper "Alphazero-like Tree-Search can guide large language model decoding and training"

Language: Python

Megatron-LLaMA

Best practice for training LLaMA models in Megatron-LM

License: NOASSERTION

memory-efficient-attention-pytorch

Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"

Language: Python; License: MIT

microxcaling

PyTorch emulation library for Microscaling (MX)-compatible data formats

Language: Python; License: MIT

omnisafe

OmniSafe is an infrastructural framework for accelerating SafeRL research.

Language: Python; License: Apache-2.0

paper_reading

A shared paper reading repository for people in the group

please

A command-line copilot

Language: Python

QBAF-jdfm

Language: Python; License: MIT

QBAF-prune

Language: Python

QPytorch

Language: Python; License: MIT

QPytorch_result_removed

Language: Python; License: MIT

Retriever

Retriever-0.1B

Language: Python

tora

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

License: Apache-2.0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

License: Apache-2.0