win4r

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | 865900

web-llm

High-performance In-browser LLM Inference Engine

Language: TypeScript | License: Apache-2.0 | 1361700

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language: Python | License: Apache-2.0 | 464200

mistral.rs

Blazingly fast LLM inference.

Language: Rust | License: MIT | 445200

Vision-language-models-VLM

Vision language model fine-tuning notebooks and use cases (PaliGemma, Florence, ...)

Language: Jupyter Notebook | 400

DB-GPT-Hub

A repository that contains models, datasets, and fine-tuning techniques for DB-GPT, with the purpose of enhancing model performance in Text-to-SQL

Language: Python | License: MIT | 144400

notebooks

Collection of notebook guides created by the Brev.dev team!

Language: Jupyter Notebook | License: MIT | 166700

VARAG

Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine

Language: Python | 34900

onnx

Open standard for machine learning interoperability

Language: Python | License: Apache-2.0 | 1792700

copa

CoPa: LLM Prompting Templating

Language: TypeScript | License: MIT | 2600

dataloader

DataLoader is a generic utility to be used as part of your application's data fetching layer to provide a consistent API over various backends and reduce requests to those backends via batching and caching.

Language: JavaScript | License: MIT | 1289500

datasetGPT

A command-line interface to generate textual and conversational datasets with LLMs.

Language: Python | 29300

NotionNext

A static blog built with NextJS and the Notion API, supporting multiple deployment options. No server is required and there is no barrier to setting up a site. Designed for Notion and all creators.

Language: JavaScript | License: MIT | 787500

Charles Ching's starred repositories

openai-realtime-console

ultravox

pocketpal-ai

InternVL

LongWriter

koboldcpp

TensorRT-LLM

web-llm

lmdeploy

mistral.rs

Vision-language-models-VLM

DB-GPT-Hub

notebooks

VARAG

onnx

copa

dataloader

datasetGPT

NotionNext

llama-stack

harbor

agenta

FlashRAG

moshi

GRIN-MoE

Qwen2.5

Controllable-RAG-Agent

GenAI_Agents

auto-cot

tree-of-thoughts