Together's repositories
RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
stripedhyena
Repository for StripedHyena, a state-of-the-art architecture that moves beyond the Transformer
redpajama.cpp
Extends the original llama.cpp repository to support RedPajama models.
llamaindex-chatbot
A RAG Chatbot with Next.js, Together.ai and Llama Index
together-python
The Official Python Client for Together's API
transformers_port
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
UniversalSD
Universal Stable Diffusion Pipeline(s) with Flash Attention
together-worker
Utilities for workers
flash-attention
Fast and memory-efficient exact attention
FT_Redpajama
Transformer-related optimizations, including BERT and GPT
llm-awq-ttgi
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
together-chat
A Streamlit component for a chatbot UI
FasterTransformer
Transformer-related optimizations, including BERT and GPT
js-eventsource
EventSource client for Node.js and browsers (polyfill)
lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
openapi
An OpenAPI specification file describes an API in its entirety and is typically written in YAML or JSON. It covers: the available endpoints (e.g. /users) and the operations on each (GET /users, POST /users); authentication methods; and the input and output parameters of each operation.
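As a small illustration of the format described above (a hypothetical /users API, not taken from this repository), a minimal OpenAPI document in YAML might look like:

```yaml
# Minimal illustrative OpenAPI 3.0 document.
# Paths, schemes, and names here are hypothetical examples.
openapi: 3.0.3
info:
  title: Example Users API
  version: 1.0.0
paths:
  /users:
    get:                      # operation on the endpoint
      summary: List users
      responses:
        "200":
          description: A list of users
    post:
      summary: Create a user
      responses:
        "201":
          description: User created
components:
  securitySchemes:            # authentication method
    apiKeyAuth:
      type: apiKey
      in: header
      name: X-API-Key
security:
  - apiKeyAuth: []
```

Tools such as code generators and interactive documentation viewers consume exactly this structure.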
Sequoia
A scalable and robust tree-based speculative decoding algorithm
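To make the underlying idea concrete, here is a toy sketch of plain (chain-based) speculative decoding, the technique Sequoia generalizes to trees of candidate continuations. The "models" are deterministic stand-in functions invented for illustration; nothing here is Sequoia's actual API.

```python
# Toy speculative decoding sketch: a cheap "draft" model proposes
# several tokens ahead, and the expensive "target" model verifies them,
# keeping the longest agreeing prefix. One verification pass therefore
# yields at least one token, and often several.
# draft_next / target_next are hypothetical stand-ins, not real models.

def draft_next(context):
    # Stand-in draft model: cheap guess at the next token.
    return (sum(context) + 1) % 7

def target_next(context):
    # Stand-in target model: the token we actually want. It agrees
    # with the draft most of the time, but not always.
    s = sum(context)
    return (s + 1) % 7 if s % 3 else (s + 2) % 7

def speculative_step(context, k=4):
    """Propose k draft tokens, then verify them with the target model."""
    proposed, ctx = [], list(context)
    for _ in range(k):
        t = draft_next(ctx)
        proposed.append(t)
        ctx.append(t)
    accepted, ctx = [], list(context)
    for t in proposed:
        want = target_next(ctx)
        if t == want:
            accepted.append(t)
            ctx.append(t)
        else:
            accepted.append(want)  # correct the first mismatch, then stop
            break
    return accepted

tokens = [1]
for _ in range(3):
    tokens += speculative_step(tokens)
```

Because every accepted token is checked against the target model, the output is identical to what the target model alone would produce; the draft model only changes how many target-model passes are needed.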
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.