rustformers / llm

[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models

Home Page: https://docs.rs/llm/latest/llm/

Add classifier-free guidance

philpax opened this issue · comments

llama.cpp recently added support for classifier-free guidance (CFG).

We should mirror this support. I'm not sure how well it applies to the other models; I haven't investigated this too deeply.

Make sure to remove the smooth_factor and the final log_softmax to stay consistent with llama.cpp's and HF's implementations ( ggerganov/llama.cpp#2280 )

I'm going to look at adding this to llm-samplers. The sampler will need the guidance logits, though, so llm will have to produce them itself. They could be supplied as a sampler resource, similar to the RNG and last tokens. I'd like to figure out a more general way to handle resources, but in the worst case I can just add another resource type to that trait.
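To sketch that resource idea: a resource trait could expose the guidance logits as an optional extra, so a CFG stage mixes them in when present and is a no-op otherwise. All of the names below are hypothetical, not llm-samplers' actual API:

```rust
// Purely illustrative resource trait in the spirit of llm-samplers' RNG /
// last-tokens resources, extended with guidance logits. None of these
// names come from the real crate.
struct Resources {
    last_tokens: Vec<u32>,
    // Logits from the second (negative-prompt) context, if CFG is enabled.
    guidance_logits: Option<Vec<f32>>,
}

trait SamplerResources {
    fn last_tokens(&self) -> &[u32];
    fn guidance_logits(&self) -> Option<&[f32]>;
}

impl SamplerResources for Resources {
    fn last_tokens(&self) -> &[u32] {
        &self.last_tokens
    }
    fn guidance_logits(&self) -> Option<&[f32]> {
        self.guidance_logits.as_deref()
    }
}

// A CFG "sampler stage": if guidance logits are available, mix them into the
// conditional logits; otherwise pass the logits through unchanged.
fn cfg_stage(res: &dyn SamplerResources, logits: &mut [f32], scale: f32) {
    if let Some(guide) = res.guidance_logits() {
        for (l, &g) in logits.iter_mut().zip(guide) {
            *l = scale * (*l - g) + g;
        }
    }
}

fn main() {
    let res = Resources {
        last_tokens: vec![1, 2],
        guidance_logits: Some(vec![0.0, 0.0]),
    };
    let mut logits = vec![1.0, 3.0];
    cfg_stage(&res, &mut logits, 2.0);
    // With zero guidance logits, scale 2.0 simply doubles the conditional logits.
    assert_eq!(logits, vec![2.0, 6.0]);
    println!("{:?} (last tokens: {:?})", logits, res.last_tokens());
}
```

The `Option` keeps CFG opt-in: samplers that don't know about guidance never see it, and llm only has to populate the resource when a guidance context exists.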

llama.cpp CFG sampler for reference (doesn't look too complicated): https://github.com/ggerganov/llama.cpp/blob/b19edd54d51cef5e3616c18b1d0d8626895b2cba/llama.cpp#L2709-L2740
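The core of that sampler is just a log-softmax of both logit vectors followed by a linear interpolation. A minimal Rust sketch (function names are mine, not llama.cpp's or llm's):

```rust
/// Numerically stable log-softmax: x_i - log(sum_j exp(x_j)).
fn log_softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let log_sum = logits.iter().map(|&x| (x - max).exp()).sum::<f32>().ln() + max;
    logits.iter().map(|&x| x - log_sum).collect()
}

/// CFG combination as in llama.cpp's sampler: normalize both the conditional
/// (main-prompt) and unconditional (negative-prompt) logits, then push the
/// conditional distribution away from the unconditional one by `scale`.
fn cfg_combine(cond: &[f32], uncond: &[f32], scale: f32) -> Vec<f32> {
    let cond = log_softmax(cond);
    let uncond = log_softmax(uncond);
    cond.iter()
        .zip(uncond.iter())
        .map(|(&c, &u)| scale * (c - u) + u)
        .collect()
}

fn main() {
    let guided = cfg_combine(&[1.0, 2.0, 3.0], &[1.0, 1.0, 1.0], 1.5);
    // Against a flat unconditional distribution, scale > 1 widens the gap
    // between likely and unlikely tokens.
    assert!(guided[2] - guided[0] > 2.0);
    println!("{guided:?}");
}
```

With scale = 1.0 the combination reduces to the conditional log-softmax, so the sampler degrades gracefully when guidance is effectively disabled.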

On the llm side, it looks like you have to maintain a separate guidance context and run the model on both contexts for every token, so using CFG makes evaluation twice as slow (and, I think, requires two K/V caches). The main relevant sections from llama.cpp's main example: https://github.com/ggerganov/llama.cpp/blob/b19edd54d51cef5e3616c18b1d0d8626895b2cba/examples/main/main.cpp#L208-L215 and https://github.com/ggerganov/llama.cpp/blob/b19edd54d51cef5e3616c18b1d0d8626895b2cba/examples/main/main.cpp#L484-L523
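The shape of that decode loop can be sketched with a toy stand-in model (the `ToyModel`/`Context` types and the simplified `cfg` below are mine; a real implementation would use llm's session API, own one K/V cache per context, and log-softmax before combining as noted above):

```rust
const VOCAB: usize = 8;

/// Stand-in for a model context; in a real implementation this would
/// also own its K/V cache, which is why CFG needs two of them.
struct Context {
    tokens: Vec<u32>,
}

/// Toy deterministic model: strongly favors (last_token + 1) % VOCAB.
struct ToyModel;
impl ToyModel {
    fn eval(&self, ctx: &mut Context, token: u32) -> Vec<f32> {
        ctx.tokens.push(token);
        let mut logits = vec![0.0f32; VOCAB];
        logits[(token as usize + 1) % VOCAB] = 5.0;
        logits
    }
}

/// Simplified CFG mix (a real implementation log-softmaxes both sides first).
fn cfg(cond: &[f32], uncond: &[f32], scale: f32) -> Vec<f32> {
    cond.iter()
        .zip(uncond)
        .map(|(&c, &u)| scale * (c - u) + u)
        .collect()
}

fn argmax(v: &[f32]) -> u32 {
    v.iter()
        .enumerate()
        .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
        .unwrap()
        .0 as u32
}

fn generate(
    model: &ToyModel,
    prompt: &[u32],
    negative_prompt: &[u32],
    scale: f32,
    n: usize,
) -> Vec<u32> {
    let mut main_ctx = Context { tokens: vec![] };
    let mut guide_ctx = Context { tokens: vec![] };
    // Each context is primed with its own prompt.
    let mut main_logits = vec![0.0f32; VOCAB];
    let mut guide_logits = vec![0.0f32; VOCAB];
    for &t in prompt {
        main_logits = model.eval(&mut main_ctx, t);
    }
    for &t in negative_prompt {
        guide_logits = model.eval(&mut guide_ctx, t);
    }
    let mut out = vec![];
    for _ in 0..n {
        let next = argmax(&cfg(&main_logits, &guide_logits, scale));
        out.push(next);
        // Every sampled token is fed to BOTH contexts:
        // two model evaluations per generated token.
        main_logits = model.eval(&mut main_ctx, next);
        guide_logits = model.eval(&mut guide_ctx, next);
    }
    out
}

fn main() {
    let out = generate(&ToyModel, &[1, 2], &[0], 2.0, 3);
    println!("{out:?}");
}
```

The doubled `eval` calls in the loop are the source of the 2x slowdown, and the two `Context` values are why two K/V caches are needed.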