cpu-inference

There are 0 repository under cpu-inference topic.

kennethleungty / Llama-2-Open-Source-LLM-CPU-Inference
Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A
cpu cpu-inference deep-learning faiss langchain large-language-models llm machine-learning natural-language-processing nlp python sentence-transformers transformers c-transformers open-source-llm chatgpt document-qa language-models llama llama-2
Language:Python 967
CoderLSF / fast-llama
Runs LLaMA with Extremely HIGH speed
cpu-inference inference-engine llama llama2
Language:C++ 93
rbitr / llm.f90
LLM inference in Fortran
ai chatbot cpu-inference language-model llama llama2 llamacpp llm transformer mamba state-space-model phi-2
Language:Fortran 63
homelab
jozsefszalma / homelab
The bare metal in my basement
ai bare-metal cpu-inference hardware-hacking hobby-project homelab machine-learning deep-learning gpu server gigabyte hpc-systems supermicro unraid cse-825 g292-z20
21
lucienhuangfu / eLLM
eLLM Infers LLM on CPUs in Real Time
llm-infernece cpu-inference rust-llm llama deep-research deep-thinking context-engineering
Language:Rust 10
yybit / pllm
Portable LLM - A rust library for LLM inference
llm cpu-inference llama2 aigc
Language:Rust 9
laelhalawani / gguf_llama
Wrapper for simplified use of Llama2 GGUF quantized models.
cpu-inference gguf llama llama2 llamacpp quantization
Language:Python 7
codito / arey
Simple large language model playground app
ai cli cpu-inference gguf large-language-models llama2 llamacpp llm local-model mistral
Language:Rust 6
JohnClaw / chatllm.v
V-lang api wrapper for llm-inference chatllm.cpp
api-wrapper bindings chatllm gemma ggml llama llm-inference mistral qwen v-lang vlang cpu-inference inference llm llms chatbot phi3 quantization
Language:C 6
lahcenkh / rag-network-docs
Privacy-focused RAG chatbot for network documentation. Chat with your PDFs locally using Ollama, Chroma & LangChain. CPU-only, fully offline.
ai cpu-inference network-programming python3 rag-chatbot vector-database-embedding
Language:Python 6
JohnClaw / chatllm.cs
C# api wrapper for llm-inference chatllm.cpp
api-wrapper bindings chatllm csharp gemma ggml int8 int8-inference int8-quantization llama llm-inference mistral qwen cpu-inference inference llm llms
Language:C# 4
JohnClaw / chatllm.vb
VB.NET api wrapper for llm-inference chatllm.cpp
api-wrapper bindings chatllm cpu-inference gemma ggml int8 int8-inference int8-quantization llama llm-inference mistral qwen vb-net vbnet
Language:Visual Basic .NET 4
BjornMelin / local-llm-workbench
🧠 A comprehensive toolkit for benchmarking, optimizing, and deploying local Large Language Models. Includes performance testing tools, optimized configurations for CPU/GPU/hybrid setups, and detailed guides to maximize LLM performance on your hardware.
cpu-inference cuda gpu-acceleration inference-optimization llama-cpp llm-benchmarking llm-deployment local-llm model-management model-quantization context-window-scaling hybrid-inference ollama-optimization wsl-ai-setup
Language:Shell 3
JohnClaw / chatllm.nim
Nim api-wrapper for llm-inference chatllm.cpp
api-wrapper bindings chatllm gemma ggml llama llm-inference mistral nim nim-lang nim-language qwen cpu-inference inference llm llms chatbot nimlang phi quantization
Language:C 3
Nishant1998 / PlantAi
PlantAi is a ResNet-based CNN model trained on the PlantVillage dataset to classify plant leaf images as healthy or diseased. This repository includes PyTorch training code, tools to convert the model to TensorFlow Lite (TFLite) for deployment, and an Android app integrating the model for real-time leaf disease detection from camera images.
android java agriculture-ai cnn computer-vision cpu-inference deep-learning deep-neural-networks image-classification onnx pytoch real-time-inference resnet tflight
Language:Java 3
chinese-soup / cbot-telegram-whisper
Simple bot that transcribes Telegram voice messages. Powered by go-telegram-bot-api & whisper.cpp Go bindings.
bot golang openai whisper whisper-cpp whispercpp cpu-inference speech-recognition speech-to-text
Language:Go 2
JohnClaw / chatllm.rs
rust api wrapper for llm-inference chatllm.cpp
api-wrapper bindings chatbot chatllm cpu-inference gemma ggml inference llama llm llm-inference llms mistral quantization qwen rust
Language:Rust 2
JohnClaw / chatllm.d
D-lang api wrapper for llm-inference chatllm.cpp
api-wrapper bindings chatbot chatllm cpu-inference d-lang d-language dlang gemma ggml inference llama llm llm-inference llms mistral quantization qwen
Language:D 1
JohnClaw / chatllm.kt
kotlin api wrapper for llm-inference chatllm.cpp
api-wrapper bindings chatbot chatllm cpu-inference gemma ggml inference kotlin llama llm llm-inference llms mistral quantization qwen
Language:C 1
JohnClaw / chatllm.lua
lua api wrapper for llm-inference chatllm.cpp
api-wrapper bindings chatbot chatllm cpu-inference gemma ggml inference llama llm llm-inference llms lua luajit mistral quantization qwen
Language:Lua 1
JohnClaw / gemma-2-2b-it.cs
gemma-2-2b-it int8 cpu inference in one file of pure C#
cpu-inference csharp gemma gemma2 inference inference-engine int8 int8-inference int8-quantization llm llm-inference llm-serving llms quantization gemma2-2b-it model-serving
Language:C# 1
JohnClaw / llama-3.2-1b.vb
llama 3.2 1b fp16 cpu inference in one file of pure VB.NET
cpu-inference fp16 inference inference-engine llama llama3 llm llm-inference vb-net vbnet visual-basic-dot-net visual-basic-net basic-programming llm-serving llms llama3-2 model-serving
Language:Visual Basic .NET 1
JohnClaw / qwen3.java
Java-port of qwen3.c
cpu-inference inference inference-engine java java-ports llm llm-inference llm-serve llm-serving llms q8 quantization qwen qwen3
Language:Java 1
SIYAKS-ARES / survival-with-llms
The Ark Project: Selecting the perfect AI model to reboot civilization from a 64GB USB drive. Comprehensive analysis of open-source LLMs under extreme constraints, with final recommendation: Meta Llama 3.1 70B Instruct (Q6_K GGUF). Includes interactive tools, detailed comparisons, and complete implementation guide for offline deployment.
cpu-inference gguf llama-cpp llm meta-llama mixtral model-quantization offline-ai open-source-ai qwen survival-technology civilization-reboot
Language:HTML 1
bhimanbaghel / llama-streamlit-app
🤖 AI Text Completion App built with Streamlit and Llama-3.2-1B. Generate creative text completions with an intuitive web interface. GPU & CPU optimized, easy to deploy, perfect for content creation and AI experimentation.
ai cpu-inference huggingface llama machine-learning nlp python streamlit streamlit-webapp text-generation transformers webapp
Language:Python
NeuroTinkerLab / local-rag-chat-with-foundry
Un sistema RAG per chattare con documenti locali usando Foundry e modelli LLM su CPU
ai cpu-inference document-chat llm local-ai rag-chatbot
Language:Python
ukkit / llama-chat
Lightweight web UI for llama.cpp with dynamic model switching, chat history & markdown support. No GPU required. Perfect for local AI development.
chat-ui cpu-inference llama-cpp local-ai model-switching ai-assistant conversation-history developer-tools javascript llm-frontend offline-ai open-source-ai performance-optimized private-ai python-flask self-hosted sqlite web-interface markdown-chat
Language:Shell