IMOKURI

SUGIYAMA Yoshio's starred repositories

k8s-dra-driver

Dynamic Resource Allocation (DRA) for NVIDIA GPUs in Kubernetes

Language: Go | Apache-2.0 | 17700

FlashRAG

⚡FlashRAG: A Python Toolkit for Efficient RAG Research

Language: Python | MIT | 74600

mesop

Language: Python | Apache-2.0 | 222700

optimum-nvidia

Language: Python | Apache-2.0 | 82600

nos

Module to automatically maximize the utilization of GPU resources in a Kubernetes cluster through real-time dynamic partitioning and elastic quotas - Effortless optimization at its finest!

Language: Go | Apache-2.0 | 58700

llm-on-openshift

Resources, demos, recipes,... to work with LLMs on OpenShift with OpenShift AI or Open Data Hub.

Language: Dockerfile | Apache-2.0 | 6700

mistral-finetune

Language: Python | Apache-2.0 | 225800

langroid

Harness LLMs with Multi-Agent Programming

Language: Python | MIT | 181100

chat_templates

Chat Templates for 🤗 HuggingFace Large Language Models

Language: Jinja | 29000

llama-inference

experiments with inference on llama

Language: Python | 10200

intro-llm-rag

LLM Models and RAG Hands-on guide

Language: Python | MIT | 19300

ragapp

The easiest way to use Agentic RAG in any enterprise

Language: TypeScript | Apache-2.0 | 239200

ChatRTX

A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM

Language: Python | NOASSERTION | 246900

openai_trtllm

OpenAI compatible API for TensorRT LLM triton backend

Language: Rust | MIT | 10500

tensorrtllm_backend

The Triton TensorRT-LLM Backend

Language: Python | Apache-2.0 | 55400

cognita

RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry

Language: Python | Apache-2.0 | 269100

unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

Language: HTML | Apache-2.0 | 714700

ts-comments.nvim

Tiny plugin to enhance Neovim's native comments

Language: Lua | Apache-2.0 | 24100

nvim-best-practices

Collection of DOs and DON'Ts for modern Neovim Lua plugin development

CC0-1.0 | 19600

GPU-Benchmarks-on-LLM-Inference

Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?

Language: Jupyter Notebook | 53400

LLMenv

Language: Shell | 200

e-learning

400

ez-cheat

Unofficial Cheat Book for HPE Ezmeral Products

200

DeepSpeedFugaku

Language: Python | NOASSERTION | 12000

tegon

Tegon is an open-source, AI-first alternative to Jira, Linear

Language: TypeScript | MIT | 75100

IC-Light

More relighting!

Language: Python | Apache-2.0 | 372300

timesfm

TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.

Language: Python | Apache-2.0 | 285400

llm.nvim

LLM powered development for Neovim

Language: Lua | Apache-2.0 | 59500

llm-ls

LSP server leveraging LLMs for code completion (and more?)

Language: Rust | Apache-2.0 | 50100

llama.cpp

LLM inference in C/C++

Language: C++ | MIT | 6005000