moyix / fauxpilot

FauxPilot - an open-source GitHub Copilot server

What is the context length for fauxpilot?

smith-co opened this issue

What is the context length for FauxPilot?

I presume it is 4096?

It is 2048 tokens (one token can represent multiple characters, so this is a decent amount of text). This is the same limit GitHub Copilot uses, so it mostly works. However, the tokenizers used by CodeGen and Copilot are slightly different, which sometimes causes the VS Code Copilot extension to send a prompt that is too long for CodeGen (but would have been fine for Copilot).
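To make the mismatch concrete, here is a minimal sketch of the arithmetic. The token counts are made-up numbers, not measurements from either tokenizer, and `fits` is a hypothetical helper rather than anything in FauxPilot or Copilot.

```python
CONTEXT_LENGTH = 2048  # limit discussed in this thread


def fits(prompt_tokens: int, requested_tokens: int, limit: int = CONTEXT_LENGTH) -> bool:
    """Return True if the prompt plus the requested completion fit in the window."""
    return prompt_tokens + requested_tokens <= limit


# Hypothetical counts for the SAME prompt text under two different tokenizers.
client_side_count = 1948  # what the extension's tokenizer might report (made up)
server_side_count = 1975  # what the CodeGen tokenizer might report (made up)
requested = 100           # completion tokens requested (made up)

print(fits(client_side_count, requested))  # True: 1948 + 100 = 2048
print(fits(server_side_count, requested))  # False: 1975 + 100 = 2075
```

The extension only ever checks the first number, so a prompt it considers safe can still exceed the limit once the server re-tokenizes it.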

I think that this can be fixed by replacing the vocab.bpe and tokenizer.json files in the .vscode/extensions/github.copilot-[version]/ directory (after backing them up, of course) with the ones in this directory. I haven't added this to the README yet, though, because I haven't had time to test it out, so do let me know if it works!
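If you want to try the swap, a rough sketch of the backup-then-replace step could look like the following. This is as untested against a real Copilot install as the suggestion itself. `swap_tokenizer_files` is a hypothetical helper, and both directories are parameters because the exact location of the files inside the extension folder is an assumption.

```python
import shutil
from pathlib import Path

TOKENIZER_FILES = ("vocab.bpe", "tokenizer.json")


def swap_tokenizer_files(extension_dir: Path, replacement_dir: Path) -> list[Path]:
    """Back up the tokenizer files in extension_dir, then overwrite them.

    Returns the backup paths for files that already existed in extension_dir.
    """
    # Check every replacement first so a missing file cannot leave a half-done swap.
    for name in TOKENIZER_FILES:
        source = replacement_dir / name
        if not source.is_file():
            raise FileNotFoundError(f"missing replacement file: {source}")

    backups = []
    for name in TOKENIZER_FILES:
        target = extension_dir / name
        if target.is_file():
            backup = target.with_name(target.name + ".bak")
            if not backup.exists():  # never clobber an earlier backup
                shutil.copy2(target, backup)
            backups.append(backup)
        shutil.copy2(replacement_dir / name, target)
    return backups
```

You would call it with the extension directory and the directory holding the CodeGen tokenizer files. Restoring is just copying the `.bak` files back over the originals.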

@moyix for Codex, the context length is 8,000 tokens. Am I missing something?


Yes – there are multiple Codex models. code-davinci-002 goes up to 8000 tokens; code-cushman-001 goes up to 2048. It's not known exactly which model Copilot is currently using (they changed things in the extension so it doesn't show the engine name any more), but at least when it was released they were using Cushman.

It's also possible that by this point Copilot is using a non-public model that they or OpenAI fine-tuned for them. But you should be able to verify that it uses 2048 tokens by looking at the requests Copilot sends; if you run them through the Codex tokenizer, the prompt tokens plus the number of completion tokens requested add up to 2048. You can also look into Copilot's extension.js and find:

let _={maxPromptLength:2048-(0,o.getConfig)(e,o.ConfigKey.SolutionLength)
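Read as plain arithmetic, that expression reserves part of the 2048-token window for the completion and gives the rest to the prompt. A small sketch of the same calculation, where the function names and the solution length of 500 are hypothetical values for illustration (the real default is not stated here):

```python
CONTEXT_LENGTH = 2048  # constant that appears in the extension.js snippet


def max_prompt_length(solution_length: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Tokens left for the prompt after reserving room for the completion."""
    if not 0 <= solution_length <= context_length:
        raise ValueError("solution_length must be between 0 and context_length")
    return context_length - solution_length


def fills_window(prompt_tokens: int, requested_tokens: int,
                 context_length: int = CONTEXT_LENGTH) -> bool:
    """Check whether a request's prompt and requested tokens add up to the window."""
    return prompt_tokens + requested_tokens == context_length


solution_length = 500  # hypothetical value, not taken from Copilot
budget = max_prompt_length(solution_length)
print(budget)                                 # 1548
print(fills_window(budget, solution_length))  # True
```

`fills_window` is the check you would apply to a captured request after counting its prompt tokens with the Codex tokenizer.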

I suspect they use the smaller model because they care a lot about latency and Cushman is quite a bit faster than DaVinci.