Ilyas Moutawwakil (IlyasMoutawwakil)


Company: @huggingface

Location: Paris, France

Home Page: ilyasmoutawwakil.github.io


Organizations
Chouafa
huggingface

Ilyas Moutawwakil's starred repositories

gpu-benches

A collection of benchmarks to measure basic GPU capabilities

Language: Jupyter Notebook · License: GPL-3.0 · Stargazers: 181 · Issues: 0

llm-perf-backend

The backend behind the LLM-Perf Leaderboard

Language: Python · License: Apache-2.0 · Stargazers: 11 · Issues: 0

py-txi

A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.

Language: Python · License: Apache-2.0 · Stargazers: 29 · Issues: 0

optimum-amd

AMD-related optimizations for transformer models

Language: Jupyter Notebook · License: MIT · Stargazers: 39 · Issues: 0

uBlock

uBlock Origin - An efficient blocker for Chromium and Firefox. Fast and lean.

Language: JavaScript · License: GPL-3.0 · Stargazers: 44847 · Issues: 0

scrape-open-llm-leaderboard

Scrape and export data from the Open LLM Leaderboard.

Language: Python · License: MIT · Stargazers: 37 · Issues: 0

OmniQuant

[ICLR 2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Language: Python · License: MIT · Stargazers: 633 · Issues: 0

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ · License: Apache-2.0 · Stargazers: 7714 · Issues: 0

NEFTune

Official repository of NEFTune: Noisy Embeddings Improve Instruction Finetuning

Language: Python · License: MIT · Stargazers: 351 · Issues: 0

GPU-Puzzles

Solve puzzles. Learn CUDA.

Language: Jupyter Notebook · License: MIT · Stargazers: 5450 · Issues: 0

text-embeddings-inference

A blazing-fast inference solution for text embedding models

Language: Rust · License: Apache-2.0 · Stargazers: 2352 · Issues: 0

optimum-quanto

A PyTorch quantization backend for Optimum

Language: Python · License: Apache-2.0 · Stargazers: 681 · Issues: 0

attention_sinks

Extend existing LLMs way beyond the original training length with constant memory usage, without retraining

Language: Python · License: Apache-2.0 · Stargazers: 651 · Issues: 0

llm-vscode

LLM-powered development for VS Code

Language: TypeScript · License: Apache-2.0 · Stargazers: 1184 · Issues: 0

diffusion-models-class

Materials for the Hugging Face Diffusion Models Course

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 3430 · Issues: 0

cuda-python

CUDA Python Low-level Bindings

Language: Python · License: NOASSERTION · Stargazers: 832 · Issues: 0

onnxscript

ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python.

Language: Python · License: MIT · Stargazers: 254 · Issues: 0

llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

Language: LLVM · License: NOASSERTION · Stargazers: 27196 · Issues: 0

mteb

MTEB: Massive Text Embedding Benchmark

Language: Python · License: Apache-2.0 · Stargazers: 1683 · Issues: 0

onnx

Open standard for machine learning interoperability

Language: Python · License: Apache-2.0 · Stargazers: 17369 · Issues: 0

exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs

Language: Python · License: MIT · Stargazers: 3310 · Issues: 0

AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Language: Python · License: MIT · Stargazers: 4169 · Issues: 0

private-gpt

Interact with your documents using the power of GPT, 100% privately, no data leaks

Language: Python · License: Apache-2.0 · Stargazers: 53130 · Issues: 0

nougat

Implementation of Nougat: Neural Optical Understanding for Academic Documents

Language: Python · License: MIT · Stargazers: 8541 · Issues: 0

AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.

Language: Python · License: MIT · Stargazers: 1503 · Issues: 0

peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Language: Python · License: Apache-2.0 · Stargazers: 15255 · Issues: 0