marcusgreen / twinny

The most no-nonsense locally hosted (or API hosted) AI code completion plugin for Visual Studio Code, like GitHub Copilot but 100% free!

Home Page: https://marketplace.visualstudio.com/items?itemName=rjmacarthy.twinny


twinny

Are you fed up with all of those so-called "free" Copilot alternatives with paywalls and signups? Fear not, my developer friend!

Twinny is the most no-nonsense locally hosted (or API hosted) AI code completion plugin for Visual Studio Code and compatible editors (like VSCodium), designed to work seamlessly with Ollama, llama.cpp, Oobabooga, LM Studio and other OpenAI API compatible providers.

Like GitHub Copilot, but 100% free!

Install twinny from the Visual Studio Code extension marketplace.

Latest version v3.11.0

The twinny extension has been updated to version 3.11.0. Although this is a minor version bump, it may break existing configurations.

One of the key changes in this version is the way API settings are handled. Instead of configuring API settings within the twinny extension itself, providers are now managed through the extension's provider settings. This change streamlines the process of switching between different models, providers, and APIs without the need to access the extension settings directly.
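As a rough illustration of what a provider entry bundles together (the field names below are assumptions made for the example, not the extension's actual schema), a provider combines the provider type, hostname, port, path and model name that previously lived in the extension settings:

```typescript
// Illustrative only: a hypothetical shape for a provider entry.
// The real field names are defined by the extension's provider settings UI.
interface ExampleProvider {
  label: string      // display name shown in the provider list
  provider: string   // e.g. "ollama", "llamacpp", "lmstudio", "litellm"
  hostname: string   // where the backend is listening
  port: number
  path: string       // API path of the chat or completion endpoint
  model: string      // model name passed to the backend
}

const localOllama: ExampleProvider = {
  label: "Ollama (local)",
  provider: "ollama",
  hostname: "localhost",
  port: 11434,
  path: "/api/chat",
  model: "codellama:7b-instruct",
}
```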

Main features

Fill in the middle code completion

Get AI-based suggestions in real time. While coding, you can let twinny autocomplete the code as you type.

Chat with AI about your code

Through the sidebar, have a conversation with your model: get an explanation of a function, ask it to write tests, ask for a refactor and much more.

Other features

  • Works online or offline
  • Highly configurable API endpoints for FIM and chat
  • Conforms to the OpenAI API standard
  • Single or multiline fill-in-middle completions
  • Customisable prompt templates to add context to completions
  • Generate git commit messages from staged changes (CTRL+SHIFT+T CTRL+SHIFT+G)
  • Easy installation via the VS Code extension marketplace or by downloading and running a binary directly
  • Customisable settings to change API provider, model name, port number and path
  • Compatible with the Ollama, llama.cpp, Oobabooga and LM Studio APIs
  • Accept code solutions directly to editor
  • Create new documents from code blocks
  • Copy generated code solution blocks
  • Chat history preserved per workspace

🚀 Getting Started

With Ollama

  1. Install the VS Code extension from the marketplace link above (or the equivalent if you use VSCodium).
  2. Twinny is configured to use Ollama as its default backend; you can install Ollama from the official Ollama site (a quick check that the server is responding is sketched at the end of this section).
  3. Choose your model from the library (e.g. codellama:7b) and run it:
ollama run codellama:7b-instruct
ollama run codellama:7b-code
  4. Open VS Code (if it is already open, a restart might be needed) and press CTRL + SHIFT + T to open the side panel.

You should see the 🤖 icon indicating that twinny is ready to use.

  5. See Keyboard shortcuts to start using twinny while coding 🎉
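If the icon does not appear, it is worth confirming that Ollama itself is reachable. A minimal sketch using Ollama's /api/tags endpoint (assuming the default address localhost:11434) to list the models you have pulled:

```typescript
// Quick check that a local Ollama server is running and which models it has pulled.
// Assumes Ollama's default address http://localhost:11434 (adjust if you changed it).
async function listOllamaModels(): Promise<string[]> {
  const res = await fetch("http://localhost:11434/api/tags");
  if (!res.ok) {
    throw new Error(`Ollama responded with HTTP ${res.status}`);
  }
  const data = (await res.json()) as { models: Array<{ name: string }> };
  return data.models.map((m) => m.name); // e.g. ["codellama:7b-instruct", "codellama:7b-code"]
}

listOllamaModels().then(console.log).catch(console.error);
```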

With llama.cpp / LM Studio / Oobabooga / LiteLLM or any other provider.

  1. Install the VS Code extension from the marketplace link above (or the equivalent if you use VSCodium).
  2. Get llama.cpp / LM Studio / Oobabooga / LiteLLM
  3. Download and run the model locally using the chosen provider
  4. Open VS Code (if it is already open, a restart might be needed) and press CTRL + SHIFT + T to open the side panel.
  5. From the top ⚙️ icon open the settings page and in the API Provider panel change from ollama to llamacpp (or the relevant provider).
  6. Update the settings for chat provider, port, hostname etc. to the correct values; please adjust them carefully for other providers (a quick endpoint check is sketched after these steps).
  7. In the left panel you should see the 🤖 icon indicating that twinny is ready to use.
  8. See Keyboard shortcuts to start using twinny while coding 🎉
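Before pointing twinny at a provider, it can help to confirm the hostname and port are right. A rough sketch that queries the standard OpenAI-style /v1/models listing; the base URL is an assumption and should be replaced with whatever your provider is actually listening on:

```typescript
// Probe an OpenAI-compatible server before wiring it into twinny's provider settings.
// The base URL is an assumption: llama.cpp's server commonly listens on 8080 and
// LM Studio on 1234, while LiteLLM's port depends on how you start the proxy.
const BASE_URL = "http://localhost:8080";

async function probeProvider(): Promise<void> {
  const res = await fetch(`${BASE_URL}/v1/models`); // standard OpenAI-style model listing
  if (!res.ok) {
    throw new Error(`Provider responded with HTTP ${res.status}`);
  }
  console.log(JSON.stringify(await res.json(), null, 2)); // should list the loaded model(s)
}

probeProvider().catch(console.error);
```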

With other providers

Twinny supports the OpenAI API specification, so in theory any provider that implements the specification should work.

The easiest way to use the OpenAI API through twinny is to run LiteLLM as a local proxy provider; it works seamlessly if configured correctly.

If you find that this isn't the case, please open an issue with details of the problem.
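For reference, an OpenAI-compatible provider only needs to accept the standard chat completions request. The sketch below targets a local proxy such as LiteLLM; the base URL, port and model name are assumptions that depend on how you start the proxy:

```typescript
// Minimal OpenAI-style chat completion request against a local proxy such as LiteLLM.
// The base URL, port and model name are assumptions: adjust them to your own setup.
async function chatCompletion(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:4000/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "ollama/codellama",                     // model name as exposed by the proxy
      messages: [{ role: "user", content: prompt }],
      stream: false,
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content;            // standard OpenAI response shape
}

chatCompletion("Explain what fill-in-the-middle completion is.").then(console.log);
```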

Note!

The options for chat model name and FIM model name are only applicable to the Ollama and Oobabooga providers.

Model support

Twinny chat works with any model, as long as it can run on your machine or in the cloud and exposes an OpenAI API compliant endpoint.

Choosing a model depends a lot on the machine it will run on: a smaller model will give you a faster response, but with a loss in accuracy.

There are two functionalities that twinny expects from a model:

Models for Chat

Among LLMs, there are models called "instruct models", which are designed for a question-and-answer mode of chat.

All instruct models should work for chat generations, but the templates might need editing if using something other than codellama (they need to be updated with the special tokens).

  • For computers with a good GPU, use a larger instruct model such as deepseek-coder:6.7b-instruct (or any other good instruct model).
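To illustrate what "special tokens" means here (this is a generic example of the Llama 2 / CodeLlama-instruct convention, not twinny's own template format), an instruct prompt wraps the conversation like this:

```typescript
// Illustration of the special tokens an instruct chat template may need.
// This is the Llama 2 / CodeLlama-instruct convention; other model families
// (Deepseek, ChatML-style models, etc.) use different wrappers.
function llamaInstructPrompt(system: string, user: string): string {
  return `<s>[INST] <<SYS>>\n${system}\n<</SYS>>\n\n${user} [/INST]`;
}

console.log(
  llamaInstructPrompt(
    "You are a helpful coding assistant.",
    "Write a unit test for the add() function."
  )
);
```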

Models for FIM (fill in the middle) completions

For FIM completions, you need to use LLM models called "base models". Unlike instruct models, base models will only try to complete your prompt. They are not designed to answer questions.

If using a Llama model, it must support the Llama special tokens.

  • For computers with a good GPU, use: deepseek-coder:base or codellama-code (or any other good model that is optimised for code completions).
  • For slower computers or computers using only CPU, use deepseek-coder:1.3b-base-q4_1 (or any other small base model).
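As a concrete illustration (not twinny's internal implementation) of what a FIM request looks like with CodeLlama's special tokens: the base model is given the text before and after the cursor wrapped in <PRE>, <SUF> and <MID> markers and completes the middle. The sketch below uses Ollama's /api/generate endpoint in raw mode; the address and model name are assumptions:

```typescript
// Sketch of a fill-in-the-middle request using CodeLlama's FIM tokens against
// Ollama's /api/generate endpoint in raw mode. The address and model name are
// assumptions: use the base ("code") model you actually pulled.
async function fimComplete(prefix: string, suffix: string): Promise<string> {
  const prompt = `<PRE> ${prefix} <SUF>${suffix} <MID>`; // CodeLlama infilling format
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "codellama:7b-code", // a base model; instruct models are not trained for FIM
      prompt,
      raw: true,    // bypass Ollama's prompt template so the FIM tokens reach the model as-is
      stream: false,
    }),
  });
  const data = await res.json();
  return data.response; // the text the model proposes for the "middle"
}

fimComplete("function add(a: number, b: number) {\n  ", "\n}").then(console.log);
```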

Keyboard shortcuts

Shortcut                    Description
ALT+\                       Trigger inline code completion
CTRL+SHIFT+/                Stop the inline code generation
Tab                         Accept the inline code generated
CTRL+SHIFT+T                Open twinny sidebar
CTRL+SHIFT+T CTRL+SHIFT+G   Generate commit messages from staged changes

Workspace context

In the settings there is an option called useFileContext. It keeps track of sessions, keystrokes, visits and the recency of visited files in the current workspace, and can be enabled to help improve the quality of completions. It is turned off by default.

Known issues

  • If the server settings are incorrect, chat and FIM completion will not work; if this is the case, please open an issue with your error message.
  • Sometimes a restart of VS Code is required for new settings to take effect; please open an issue if you are having problems with this.
  • Using file context often causes unreliable FIM completions, because small models get confused when provided with more than one file of context.
  • See the open issues on GitHub for known issues that are not yet fixed.
  • The LiteLLM FIM template needs investigation.

If you have a problem with twinny or any suggestions, please report them on GitHub issues. Please include your VS Code version and OS details in your issue.

Contributing

We are actively looking for contributors who want to help improve the project; if you are interested in helping out, please reach out on Twitter.

Contributions are welcome; please open an issue describing your changes and open a pull request when ready.

This project is under the MIT licence; please read the LICENSE file for more information.

Disclaimer

This plugin is provided "as is" and is under active development. This means that at times it may not work fully as expected.
