Beast code in Giters

Yufeng Li's repositories

onnx

Open Neural Network Exchange

Language: C++ | License: Apache-2.0 | 1 0 0

bitsandbytes

8-bit CUDA functions for PyTorch

Language: Python | License: MIT | 0 0 0

cutlass

CUDA Templates for Linear Algebra Subroutines

Language: C++ | License: NOASSERTION | 0 0 0

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch

Language: Python | License: Apache-2.0 | 0 0 0

docker_files

0 0 0

FasterTransformer

Transformer-related optimization, including BERT and GPT

Language: C++ | License: Apache-2.0 | 0 0 0

flash-attention

Fast and memory-efficient exact attention

Language: Python | License: BSD-3-Clause | 0 0 0

mmperf

MatMul performance benchmarks for a single CPU core, comparing hand-engineered and codegen kernels.

Language: C++ | License: Apache-2.0 | 0 0 0

onnxruntime

ONNX Runtime: cross-platform, high performance scoring engine for ML models

Language: C++ | License: MIT | 0 0 0

llama

Inference code for LLaMA models

License: NOASSERTION | 0 0 0

neural-speed

An innovative library for efficient LLM inference via low-bit quantization and sparsity

License: Apache-2.0 | 0 0 0

optimum

🏎️ Accelerate training and inference of 🤗 Transformers with easy-to-use hardware optimization tools

Language: Python | License: Apache-2.0 | 0 0 0

pytorch

Tensors and dynamic neural networks in Python with strong GPU acceleration

Language: C++ | License: NOASSERTION | 0 0 0

triton

Development repository for the Triton language and compiler

License: MIT | 0 0 0

tutorials

Tutorials for creating and using ONNX models

Language: Jupyter Notebook | License: MIT | 0 0 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | 0 0 0

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | 0 0 0

Windows-Machine-Learning

Samples for Windows ML.

License: MIT | 0 0 0