codelion/optillm
Optimizing inference proxy for LLMs
Stargazers: 1311 · Watchers: 20 · Issues: 37 · Forks: 117
codelion/optillm Issues
- Is there any possibility we align some interest? (Closed 18 days ago, 1 comment)
- Using llama-server issue with 'no_key' API key (Closed 18 days ago, 1 comment)
- Scripts to reproduce benchmark results (Closed 18 days ago, 1 comment)
- Implement cot decoding with llama.cpp (Updated 19 days ago, 4 comments)
- Request for Reference Citations for CoT Prompting Methods (Closed 21 days ago, 1 comment)
- (MOA) Fails with "List Index Out of Range" Error on OpenAI-Compatible Ollama API Endpoint (Updated 21 days ago, 6 comments)
- Implement routing (Closed 23 days ago, 1 comment)
- I can see cot_decode method has implemented, but we can't use it with the proxy. (Closed a month ago, 13 comments)
- When I tried the optillm with my own openai API compatible hosted model I get this error (Closed a month ago, 6 comments)
- Add a lighting template for running optillm (Updated a month ago, 1 comment)
- Integration with Gemini 1.5 models (Closed a month ago, 2 comments)
- token counting (Closed a month ago, 2 comments)
- [Question]: Which paper is mcts.py based on? (Closed a month ago, 1 comment)
- Can't install z3-solver, is it possible to support lean4? (Closed a month ago, 20 comments)
- Error processing request: litellm.AuthenticationError: AuthenticationError (Closed a month ago, 2 comments)
- Add support for logging with --log=debug (Closed a month ago, 1 comment)
- Add support for sympy in solver approach (Closed a month ago, 1 comment)
- Possible error in calculate_confidence() logic for cot_decoding.py (Closed a month ago, 1 comment)
- Add support to pass slug as extra_body argument instead of prefix of model name (Closed a month ago)
- Response text missing when using third-party AI frontend with local endpoint (Closed a month ago, 4 comments)
- Streaming, Context, Port & Proxy vs Library (Closed a month ago, 7 comments)
- Clarification: proxy or library for cot_decoding?? (Closed a month ago, 2 comments)
- Change api-key to optillm-api-key (Closed a month ago)
- use with llama.cpp (Closed a month ago, 8 comments)
- Flask import fails (Closed 2 months ago, 5 comments)
- Create a gradio based GUI to compare different approaches (Updated 2 months ago)
- Support AzureOpenAI client (Closed 2 months ago, 1 comment)
- Gsm8k bad test (Closed 2 months ago, 1 comment)
- Minimal working MCTS example (Closed 2 months ago, 3 comments)
- Too many tokens (Closed 2 months ago, 2 comments)
- initial_query both in system message and user message (Closed 2 months ago, 2 comments)