OctoAI's repositories
octoml-profile
Home for OctoML PyTorch Profiler
triton-client-rs
A Rust client library for the NVIDIA Triton Inference Server.
octoml-llm-qa
A code sample that shows how to use 🦜️🔗langchain, 🦙llama_index, and a hosted LLM endpoint to run a standard chat or Q&A session over a PDF document.
dockercon23-octoai
DockerCon 2023 OctoAI AI/ML Workshop GitHub Repo
octoai-apps
A collection of OctoAI-based demos.
hackathon-2023-rag
OctoAI 2023 Llama2 RAG demos
octoai-cartoonizer
Cartoonizer demo for the OctoAI compute service launch.
octoai-launch-examples
Examples of how to build Generative AI applications powered by the OctoAI compute service.
archived_vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
homebrew-tap
Homebrew Tap of OctoML products and tools.
octoai-octoshop
OctoAI's OctoShop! Transform photos with the power of words and generative AI!
relax-all
A fork of tvm/unity.
stable-diffusion-webui-docker
Easy Docker setup for Stable Diffusion with user-friendly UI
TensorRT-LLM-release
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
triton-inference-server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
use-whisper
React hook for OpenAI Whisper with speech recorder, real-time transcription, and silence removal built-in
web-llm
Bringing large language models and chat to web browsers. Everything runs inside the browser with no server support.