Repositories under the model-inference-service topic:
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
CLIP as a service - Embed images and sentences for object recognition, visual reasoning, image classification, and reverse image search
Online Inference API for NLP Transformer models - summarization, text classification, sentiment analysis, and more
Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low-Rank Adapters (LoRA), and gain hands-on experience with Predibase’s LoRAX inference server framework.
SPIRA Serving Predictor v1 by @daitamae and @vitorguidi