Together's repositories
RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
stripedhyena
Repository for StripedHyena, a state-of-the-art architecture that moves beyond the Transformer
redpajama.cpp
Extends the original llama.cpp repository to support RedPajama models.
llamaindex-chatbot
A RAG Chatbot with Next.js, Together.ai and Llama Index
together-python
The Official Python Client for Together's API
transformers_port
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
UniversalSD
Universal Stable Diffusion Pipeline(s) with Flash Attention
together-worker
Utilities for workers
flash-attention
Fast and memory-efficient exact attention
FT_Redpajama
Transformer-related optimizations, including BERT and GPT
llm-awq-ttgi
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
together-chat
A Streamlit component for a chatbot UI
FasterTransformer
Transformer-related optimizations, including BERT and GPT
js-eventsource
EventSource client for Node.js and browsers (polyfill)
lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
openapi
An OpenAPI specification file describes an API in its entirety and is typically written in YAML or JSON. It covers: the available endpoints (e.g. /users) and the operations on each (GET /users, POST /users); authentication methods; and the input and output parameters of each operation.
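As a small illustration of the format described above (a hypothetical /users API, not taken from this repository), a minimal OpenAPI document in YAML might look like:

```yaml
# Minimal illustrative OpenAPI 3.0 document.
# Paths, schemes, and names here are hypothetical examples.
openapi: 3.0.3
info:
  title: Example Users API
  version: 1.0.0
paths:
  /users:
    get:                      # operation on the endpoint
      summary: List users
      responses:
        "200":
          description: A list of users
    post:
      summary: Create a user
      responses:
        "201":
          description: User created
components:
  securitySchemes:            # authentication method
    apiKeyAuth:
      type: apiKey
      in: header
      name: X-API-Key
security:
  - apiKeyAuth: []
```

Tools such as code generators and interactive documentation viewers consume exactly this structure.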
Sequoia
A scalable and robust tree-based speculative decoding algorithm
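To make the underlying idea concrete, here is a toy sketch of plain (chain-based) speculative decoding, the technique Sequoia generalizes to trees of candidate continuations. The "models" are deterministic stand-in functions invented for illustration; nothing here is Sequoia's actual API.

```python
# Toy speculative decoding sketch: a cheap "draft" model proposes
# several tokens ahead, and the expensive "target" model verifies them,
# keeping the longest agreeing prefix. One verification pass therefore
# yields at least one token, and often several.
# draft_next / target_next are hypothetical stand-ins, not real models.

def draft_next(context):
    # Stand-in draft model: cheap guess at the next token.
    return (sum(context) + 1) % 7

def target_next(context):
    # Stand-in target model: the token we actually want. It agrees
    # with the draft most of the time, but not always.
    s = sum(context)
    return (s + 1) % 7 if s % 3 else (s + 2) % 7

def speculative_step(context, k=4):
    """Propose k draft tokens, then verify them with the target model."""
    proposed, ctx = [], list(context)
    for _ in range(k):
        t = draft_next(ctx)
        proposed.append(t)
        ctx.append(t)
    accepted, ctx = [], list(context)
    for t in proposed:
        want = target_next(ctx)
        if t == want:
            accepted.append(t)
            ctx.append(t)
        else:
            accepted.append(want)  # correct the first mismatch, then stop
            break
    return accepted

tokens = [1]
for _ in range(3):
    tokens += speculative_step(tokens)
```

Because every accepted token is checked against the target model, the output is identical to what the target model alone would produce; the draft model only changes how many target-model passes are needed.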
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.