SUGIYAMA Yoshio's starred repositories
k8s-dra-driver
Dynamic Resource Allocation (DRA) for NVIDIA GPUs in Kubernetes
llm-on-openshift
Resources, demos, recipes, and more for working with LLMs on OpenShift with OpenShift AI or Open Data Hub.
chat_templates
Chat Templates for 🤗 HuggingFace Large Language Models
llama-inference
Experiments with inference on Llama
intro-llm-rag
LLM and RAG hands-on guide
openai_trtllm
OpenAI-compatible API for the TensorRT-LLM Triton backend
tensorrtllm_backend
The Triton TensorRT-LLM Backend
unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
ts-comments.nvim
Tiny plugin to enhance Neovim's native comments
nvim-best-practices
Collection of DOs and DON'Ts for modern Neovim Lua plugin development
GPU-Benchmarks-on-LLM-Inference
Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?