Ivan Fursov's starred repositories
chatbot-ui
AI chat for every model.
path-to-senior-engineer-handbook
All the resources you need to get to Senior Engineer and beyond
PhotoMaker
PhotoMaker
PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
mlx-examples
Examples in the MLX framework
Awesome-LLM-Inference
📖 A curated list of Awesome LLM Inference papers with code, covering TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
flutter_chat_ui
Actively maintained, community-driven chat UI implementation with an optional Firebase BaaS.
sd-wav2lip-uhq
Wav2Lip UHQ extension for Automatic1111
LLM-Shearing
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
llm-autoeval
Automatically evaluate your LLMs in Google Colab
lost-in-the-middle
Code and data for "Lost in the Middle: How Language Models Use Long Contexts"
LLaMa2lang
Convenience scripts to finetune (chat-)LLaMa3 and other models for any language
C4_200M-synthetic-dataset-for-grammatical-error-correction
This dataset contains synthetic training data for grammatical error correction. The corpus is generated by corrupting clean sentences from C4 using a tagged corruption model. The approach and the dataset are described in more detail by Stahlberg and Kumar (2021) (https://www.aclweb.org/anthology/2021.bea-1.4/)
faster-SadTalker-API
The API server version of the SadTalker project. Runs in Docker, 10 times faster than the original!
openai_trtllm
OpenAI-compatible API for the TensorRT-LLM Triton backend