Yufeng Li (yufenglee)

Company: @Microsoft

Location: Sunnyvale, CA


Organizations
microsoft

Yufeng Li's repositories

onnx

Open Neural Network Exchange

Language: C++ | License: Apache-2.0 | Stargazers: 1 | Issues: 0

bitsandbytes

8-bit CUDA functions for PyTorch

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

cutlass

CUDA Templates for Linear Algebra Subroutines

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 0

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

FasterTransformer

Transformer related optimization, including BERT, GPT

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0

flash-attention

Fast and memory-efficient exact attention

Language: Python | License: BSD-3-Clause | Stargazers: 0 | Issues: 0

mmperf

MatMul performance benchmarks for a single CPU core, comparing hand-engineered and codegen kernels.

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0

onnxruntime

ONNX Runtime: cross-platform, high performance scoring engine for ML models

Language: C++ | License: MIT | Stargazers: 0 | Issues: 0

llama

Inference code for LLaMA models

License: NOASSERTION | Stargazers: 0 | Issues: 0

neural-speed

An innovation library for efficient LLM inference via low-bit quantization and sparsity

License: Apache-2.0 | Stargazers: 0 | Issues: 0

optimum

🏎️ Accelerate training and inference of 🤗 Transformers with easy to use hardware optimization tools

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 0

triton

Development repository for the Triton language and compiler

License: MIT | Stargazers: 0 | Issues: 0

tutorials

Tutorials for creating and using ONNX models

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

Windows-Machine-Learning

Samples for Windows ML.

License: MIT | Stargazers: 0 | Issues: 0