There is 1 repository under the triton topic.
Efficient Triton Kernels for LLM Training
A service for autodiscovery and configuration of applications running in containers
Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
Playing with the Tigress software protection. Break some of its protections and solve their reverse engineering challenges. Automatic deobfuscation using symbolic execution, taint analysis and LLVM.
🚀🚀🚀 A collection of awesome public projects about Large Language Models (LLM), Vision Language Models (VLM), Vision Language Action (VLA), AI Generated Content (AIGC), and the related datasets and applications.
Linux kernel module to support Turbo mode and RGB Keyboard for Acer Predator notebook series
A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.
LLVM based static binary analysis framework
🔥🔥🔥 A collection of some awesome public CUDA, cuBLAS, cuDNN, CUTLASS, TensorRT, TensorRT-LLM, Triton, TVM, MLIR and High Performance Computing (HPC) projects.
A performance library for machine learning applications.
Ozoz dotfiles for bspwm, i3WM
ClearML - Model-Serving Orchestration and Repository Solution
Resources About Dynamic Binary Instrumentation and Dynamic Binary Analysis
NVIDIA-accelerated, deep learned model support for image space object detection
NVIDIA-accelerated DNN model inference ROS 2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with CUDA-capable GPU
Deploy DL/ML inference pipelines with minimal extra code.
Triton implementation of FlashAttention2 that adds Custom Masks.
Three examples of recommendation system pipelines with NVIDIA Merlin and Redis
🐳 Scripps Whale Acoustics Lab 🌎 Scripps Acoustic Ecology Lab - Triton with remoras in development
A step-by-step guide to setting up NVIDIA GPUs with CUDA support running in Docker (and Compose) containers on a NixOS host
Training-free Post-training Efficient Sub-quadratic Complexity Attention. Implemented with OpenAI Triton.
⚡ Blazing fast audio augmentation in Python, powered by GPU for high-efficiency processing in machine learning and audio analysis tasks.
Symbolic debugging tool using JonathanSalwan/Triton
COIN Attacks: on Insecurity of Enclave Untrusted Interfaces in SGX - ASPLOS 2020