There are 3 repositories under the local-inference topic.
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Tool for testing different large language models without writing code.
LLM chatbot example using OpenVINO with RAG (Retrieval-Augmented Generation).