liyucheng09 / Selective_Context

Compress your input to ChatGPT or other LLMs so they can process up to 2x more content while saving roughly 40% of memory and GPU time.

How do we load target LLM?

Li-Muyang opened this issue · comments

Hi, thanks for the very nice work! We're trying to follow up on this topic, but I'm slightly confused about how to load a target LLM, such as LLaMA, in this code. Currently the target LLM seems to be hard-coded as ChatGPT.
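For context, here is a minimal sketch of what I had in mind: swapping the hard-coded ChatGPT API call for a local Hugging Face causal LM loaded with `transformers`, and feeding it the compressed context. The model name, prompt template, and function names below are my own assumptions for illustration, not part of this repo's API.

```python
# Hypothetical sketch: replace the ChatGPT call with a local HF model.
# The model name and prompt format are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer


def build_prompt(compressed_context: str, question: str) -> str:
    """Combine the Selective-Context-compressed text with a question."""
    return f"Context: {compressed_context}\n\nQuestion: {question}\nAnswer:"


def load_target_llm(model_name: str = "meta-llama/Llama-2-7b-hf"):
    """Load any causal LM checkpoint (e.g. a LLaMA variant) as the target."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
    return tokenizer, model


def answer(tokenizer, model, compressed_context: str, question: str,
           max_new_tokens: int = 64) -> str:
    """Generate an answer from the target LLM given the compressed context."""
    prompt = build_prompt(compressed_context, question)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens, keep only the newly generated continuation.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Is something like this the intended way to plug in a target model, or is there an existing hook in the codebase I've missed?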