Nutlope / together-js Node.js SDK


An npm library to run open source LLMs through the Together Inference API.



npm i together-ai


Create a Together account and store your API key in the TOGETHER_API_KEY environment variable. Then run the code snippet below with your preferred AI model and inputs to get back a reply.

import Together from 'together-ai';

// Authenticate with the API key read from the environment.
const together = new Together({
  auth: process.env.TOGETHER_API_KEY,
});

const model = 'mistralai/Mixtral-8x7B-Instruct-v0.1';

// Send the prompt and wait for the complete reply.
const result = await together.inference(model, {
  prompt: 'Suggest some fun winter family activities',
  max_tokens: 700,
});
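
Since the client reads its key from the environment, it can help to fail early with a clear message when the variable is missing. The following is a minimal sketch of such a check; `requireApiKey` is a hypothetical helper written for illustration and is not part of together-ai.

```typescript
// Hypothetical helper (not part of together-ai): return the API key from an
// environment-like object, or throw if it is missing or blank.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env['TOGETHER_API_KEY']?.trim();
  if (!key) {
    throw new Error('TOGETHER_API_KEY is not set');
  }
  return key;
}
```

In a Node.js script you could pass `process.env` to this helper and hand the returned string to the `auth` option shown above.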

Streaming with LLMs

If you want to stream, simply specify stream_tokens: true.

const result = await together.inference('togethercomputer/llama-2-70b-chat', {
  prompt: 'Tell me about the history of the United States',
  max_tokens: 1000,
  stream_tokens: true,
});
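
This README does not describe what the streamed result looks like, so the sketch below is only one way to consume it under stated assumptions: the stream is taken to be text in server-sent-event form (lines starting with `data:` and a final `data: [DONE]`), and the `{ token: { text } }` payload shape is hypothetical. `createTokenCollector` is an illustrative helper, not part of together-ai.

```typescript
// Assumed event payload; the real shape may differ.
interface StreamEvent {
  token?: { text?: string };
}

// Returns a function that accepts raw text chunks and yields the token texts
// found in every complete "data:" line seen so far. Partial lines are buffered
// until the rest of the line arrives in a later chunk.
function createTokenCollector(): (chunk: string) => string[] {
  let buffer = '';
  return function push(chunk: string): string[] {
    buffer += chunk;
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // keep the unfinished last line for later
    const tokens: string[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed.startsWith('data:')) continue;
      const payload = trimmed.slice('data:'.length).trim();
      if (payload === '[DONE]') continue; // assumed end-of-stream marker
      try {
        const event = JSON.parse(payload) as StreamEvent;
        const text = event.token?.text;
        if (typeof text === 'string') tokens.push(text);
      } catch {
        // Ignore lines that are not valid JSON.
      }
    }
    return tokens;
  };
}
```

Buffering matters because a network chunk can end in the middle of a line; parsing each chunk on its own would drop or corrupt the event that straddles the boundary.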

Next.js Chat App with streaming

An example Next.js chat app uses this library. The code for the example is also available, including code on how to stream the results of the LLM directly to the frontend.

Filtering responses with Llama Guard

You can now use Llama Guard, an LLM-based input-output safeguard model, with models on the Together platform. To do this, add safety_model: 'Meta-Llama/Llama-Guard-7b' to the inference options.

const result = await together.inference('togethercomputer/llama-2-13b-chat', {
  prompt: 'Tell me about San Francisco',
  max_tokens: 1000,
  safety_model: 'Meta-Llama/Llama-Guard-7b',
});
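
If several calls should all be filtered, the safety model can be applied in one place rather than repeated in each options object. The sketch below assumes an `InferenceOptions` shape that contains only the fields shown in this README; both the interface and `withSafetyModel` are hypothetical and not exported by together-ai.

```typescript
// Hypothetical option shape: only the fields that appear in this README.
interface InferenceOptions {
  prompt: string;
  max_tokens?: number;
  stream_tokens?: boolean;
  safety_model?: string;
}

const LLAMA_GUARD = 'Meta-Llama/Llama-Guard-7b';

// Returns a copy of the options with safety_model set; the input is not mutated.
function withSafetyModel(
  options: InferenceOptions,
  safetyModel: string = LLAMA_GUARD,
): InferenceOptions {
  return { ...options, safety_model: safetyModel };
}
```

The returned object can be passed as the second argument to `together.inference` in the same way as the literal options above.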

Popular Supported Models

This is a non-exhaustive list of popular models that are supported.

  • Mixtral Instruct v0.1 (mistralai/Mixtral-8x7B-Instruct-v0.1)
  • Mistral-7B (mistralai/Mistral-7B-Instruct-v0.1)
  • Llama-2 70B (togethercomputer/llama-2-70b-chat)
  • Llama-2 13B (togethercomputer/llama-2-13b-chat)
  • RedPajama 7B (togethercomputer/RedPajama-INCITE-7B-Chat)
  • OpenOrca Mistral (Open-Orca/Mistral-7B-OpenOrca)
  • Alpaca 7B (togethercomputer/alpaca-7b)
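
To avoid typos in model identifiers, the list above can be kept as a typed constant. This is a minimal sketch; `POPULAR_MODELS` and `isPopularModel` are illustrative names, and because the list is non-exhaustive, a `false` result only means the ID is not one of the seven listed here.

```typescript
// Model IDs and display names copied from the list above.
const POPULAR_MODELS = {
  'mistralai/Mixtral-8x7B-Instruct-v0.1': 'Mixtral Instruct v0.1',
  'mistralai/Mistral-7B-Instruct-v0.1': 'Mistral-7B',
  'togethercomputer/llama-2-70b-chat': 'Llama-2 70B',
  'togethercomputer/llama-2-13b-chat': 'Llama-2 13B',
  'togethercomputer/RedPajama-INCITE-7B-Chat': 'RedPajama 7B',
  'Open-Orca/Mistral-7B-OpenOrca': 'OpenOrca Mistral',
  'togethercomputer/alpaca-7b': 'Alpaca 7B',
} as const;

type PopularModelId = keyof typeof POPULAR_MODELS;

// Type guard: true only for IDs that appear in POPULAR_MODELS.
function isPopularModel(id: string): id is PopularModelId {
  return Object.prototype.hasOwnProperty.call(POPULAR_MODELS, id);
}
```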

How it works

This library uses the Together Inference Engine, the world's fastest inference stack for open source LLMs. It calls the Inference API, specifically Together's serverless endpoints product, to enable you to use OSS LLMs quickly and efficiently.
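
To make the idea of "calling the Inference API" concrete, here is a rough sketch of how such an HTTP request could be assembled. It is not the library's actual implementation: the Bearer authorization header and the JSON body layout (model alongside the other parameters) are assumptions, the endpoint URL is left as a parameter because this README does not state it, and `buildInferenceRequest` is a hypothetical name.

```typescript
interface PreparedRequest {
  url: string;
  init: {
    method: 'POST';
    headers: Record<string, string>;
    body: string;
  };
}

// Builds (but does not send) a request description suitable for fetch(url, init).
function buildInferenceRequest(
  endpointUrl: string, // assumption: supplied by the caller
  apiKey: string,
  model: string,
  params: Record<string, unknown>,
): PreparedRequest {
  return {
    url: endpointUrl,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`, // assumed auth scheme
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ model, ...params }), // assumed body layout
    },
  };
}
```

Keeping request construction separate from sending makes the shape easy to inspect and test without any network access.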
