wangcx18 / llm-vscode-inference-server

An endpoint server for efficiently serving quantized open-source LLMs for code.

Keeps responding with FIM tokens

cmosguy opened this issue · comments

I keep getting FIM tokens when the server responds. Am I supposed to scrub these directly in the code, or is there a setting in the llm-vscode extension that handles this?

<fim_prefix>
import debugpy


# create a class called car
class Car:
    # create a method called drive
    def drive(self):
        print("driving")


# create an object called my_car
my_car =    <fim_suffix>
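
For reference, this is roughly what I mean by scrubbing server-side. A minimal sketch, assuming StarCoder-style FIM token names; the token list and helper name here are illustrative, not this repo's actual code:

# Minimal sketch of server-side scrubbing (illustrative, not the repo's code).
# Assumes StarCoder-style FIM control tokens; adjust the list for your model.
FIM_TOKENS = ("<fim_prefix>", "<fim_middle>", "<fim_suffix>", "<|endoftext|>")

def scrub_fim_tokens(completion: str) -> str:
    # Remove any FIM control tokens that leaked into the generated text.
    for token in FIM_TOKENS:
        completion = completion.replace(token, "")
    return completion

# Example:
# scrub_fim_tokens("<fim_middle>my_car = Car()<|endoftext|>")  -> "my_car = Car()"

Alternatively, if the server decodes with a Hugging Face tokenizer, passing skip_special_tokens=True to tokenizer.decode should drop these tokens, provided they are registered as special tokens for the model.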