IlyasMoutawwakil

@huggingface

Paris, France

ilyasmoutawwakil.github.io

Organizations

Chouafa

huggingface

Ilyas Moutawwakil's starred repositories

privateGPT

Interact with your documents using the power of GPT, 100% privately, no data leaks

Language: Python | License: Apache-2.0 | 49730 443 992

uBlock

uBlock Origin - An efficient blocker for Chromium and Firefox. Fast and lean.

Language: JavaScript | License: GPL-3.0 | 44283 905 3445

llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

Language: LLVM | License: NOASSERTION | 26537 595 71744

onnx

Open standard for machine learning interoperability

Language: Python | License: Apache-2.0 | 17150 436 2733

peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Language: Python | License: Apache-2.0 | 14750 107 939

candle

Minimalist ML framework for Rust

Language: Rust | License: Apache-2.0 | 14164 150 598

nougat

Implementation of Nougat Neural Optical Understanding for Academic Documents

Language: Python | License: MIT | 8373 69 193

text-generation-inference

Large Language Model Text Generation Inference

Language: Python | License: Apache-2.0 | 8277 100 1157

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | 7251 84 1488

insanely-fast-whisper

Language: Jupyter Notebook | License: Apache-2.0 | 6871 62 175

GPU-Puzzles

Solve puzzles. Learn CUDA.

Language: Jupyter Notebook | License: MIT | 5277 29 27

diffusion-models-class

Materials for the Hugging Face Diffusion Models Course

Language: Jupyter Notebook | License: Apache-2.0 | 3299 101 22

AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Language: Python | License: MIT | 3176 30 327

exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs

Language: Python | License: MIT | 3164 35 352

TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

Language: Python | License: BSD-3-Clause | 2400 69 1418

text-embeddings-inference

A blazing fast inference solution for text embeddings models

Language: Rust | License: Apache-2.0 | 2206 27 177

docquery

An easy way to extract information from documents

Language: Python | License: MIT | 1669 24 46

mteb

MTEB: Massive Text Embedding Benchmark

Language: Python | License: Apache-2.0 | 1589 8 305

FlexFlow

FlexFlow Serve: Low-Latency, High-Performance LLM Serving

Language: C++ | License: Apache-2.0 | 1576 31 605

AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.

Language: Python | License: MIT | 1371 11 311

llm-vscode

LLM powered development for VSCode

Language: TypeScript | License: Apache-2.0 | 1162 22 80

cuda-python

CUDA Python Low-level Bindings

Language: Python | License: NOASSERTION | 801 30 61

attention_sinks

Extend existing LLMs way beyond the original training length with constant memory usage, without retraining

Language: Python | License: Apache-2.0 | 645 12 29

quanto

A pytorch Quantization Toolkit

Language: Python | License: Apache-2.0 | 613 8 65

OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Language: Python | License: MIT | 607 17 68

SpQR

Language: Python | License: Apache-2.0 | 514 22 21

hydra-moe

Language: Python | 406 23 11

NEFTune

Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning

Language: Python | License: MIT | 345 11 14

onnxscript

ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python.

Language: Python | License: MIT | 239 27 464

scrape-open-llm-leaderboard

Scrape and export data from the Open LLM Leaderboard.

Language: Python | License: MIT | 3500