Vectorch's repositories
ByteTransformer
Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052
Language: C++, License: Apache-2.0
chatbot-ui
An open source ChatGPT UI.
Language: TypeScript, License: MIT
FasterTransformer
Transformer-related optimization, including BERT and GPT
Language: C++, License: Apache-2.0
flash-attention
Fast and memory-efficient exact attention
Language: Python, License: BSD-3-Clause
flashinfer
FlashInfer: Kernel Library for LLM Serving
Language: Cuda, License: Apache-2.0
tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Language: Rust, License: Apache-2.0
vcpkg
C++ Library Manager for Windows, Linux, and macOS
Language: CMake, License: MIT
xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
Language: Python, License: NOASSERTION