Repositories under the triton topic:
Efficient Triton Kernels for LLM Training
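For readers new to Triton, here is a minimal, illustrative sketch of what a Triton kernel looks like (not code from this repository): a fused elementwise add + ReLU with a thin PyTorch wrapper. The kernel name, block size, and wrapper are chosen for the example.

```python
# Illustrative only -- a minimal Triton kernel, not code from the repository
# above. Fuses an elementwise add with a ReLU in a single pass over memory.
import torch
import triton
import triton.language as tl


@triton.jit
def fused_add_relu_kernel(x_ptr, y_ptr, out_ptr, n_elements,
                          BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(0)                       # one program per block of elements
    offs = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offs < n_elements                     # guard the tail block
    x = tl.load(x_ptr + offs, mask=mask)
    y = tl.load(y_ptr + offs, mask=mask)
    tl.store(out_ptr + offs, tl.maximum(x + y, 0.0), mask=mask)


def fused_add_relu(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # Assumes x and y are same-shaped float CUDA tensors.
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)
    fused_add_relu_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```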
🎉 Modern CUDA learning notes with PyTorch: CUDA cores, Tensor Cores, fp32/tf32, fp16/bf16, fp8/int8, flash_attn, RoPE, sgemm, hgemm, sgemv, warp/block reduce, elementwise, softmax, layernorm, rmsnorm.
A service for autodiscovery and configuration of applications running in containers
Playing with the Tigress software protection: breaking some of its protections and solving its reverse engineering challenges. Automatic deobfuscation using symbolic execution, taint analysis and LLVM.
Linux kernel module to support Turbo mode and RGB Keyboard for Acer Predator notebook series
LLVM based static binary analysis framework
A performance library for machine learning applications.
Ozoz's dotfiles for bspwm and i3wm
ClearML - Model-Serving Orchestration and Repository Solution
Resources About Dynamic Binary Instrumentation and Dynamic Binary Analysis
NVIDIA-accelerated, deep learned model support for image space object detection
NVIDIA-accelerated DNN model inference ROS 2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 platforms with a CUDA-capable GPU
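For context, here is a hedged sketch of querying a Triton Inference Server directly over HTTP with the tritonclient Python package. This is generic server usage rather than the ROS 2 node interface of these packages, and the model name and tensor names are placeholders.

```python
# Hedged, generic sketch of calling a Triton Inference Server over HTTP with
# the tritonclient package. The model name "detector" and the tensor names
# "input__0"/"output__0" are placeholders for whatever the deployed model uses.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

image = np.zeros((1, 3, 640, 640), dtype=np.float32)   # dummy input batch
inp = httpclient.InferInput("input__0", list(image.shape), "FP32")
inp.set_data_from_numpy(image)
out = httpclient.InferRequestedOutput("output__0")

result = client.infer(model_name="detector", inputs=[inp], outputs=[out])
print(result.as_numpy("output__0").shape)
```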
Deploy DL/ML inference pipelines with minimal extra code.
Triton implementation of FlashAttention2 with support for custom masks.
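The core of a custom mask is forcing excluded attention scores to -inf before normalization. Below is a simplified, illustrative Triton kernel showing that step as a standalone row-wise masked softmax; it is not the repository's FlashAttention2 kernel, and the tensor layout and names are assumptions for the example.

```python
# Simplified sketch of the custom-mask idea (not the repository's
# FlashAttention2 kernel): a row-wise softmax over attention scores where a
# user-supplied 0/1 mask knocks excluded positions down to -inf first.
import torch
import triton
import triton.language as tl


@triton.jit
def masked_softmax_kernel(scores_ptr, mask_ptr, out_ptr, n_cols,
                          BLOCK_SIZE: tl.constexpr):
    row = tl.program_id(0)                      # one program per row
    cols = tl.arange(0, BLOCK_SIZE)
    offs = row * n_cols + cols
    valid = cols < n_cols                       # guard padding past the row end

    s = tl.load(scores_ptr + offs, mask=valid, other=float("-inf"))
    m = tl.load(mask_ptr + offs, mask=valid, other=0)

    s = tl.where(m != 0, s, float("-inf"))      # apply the custom mask
    s = s - tl.max(s, axis=0)                   # numerical stability
    num = tl.exp(s)
    tl.store(out_ptr + offs, num / tl.sum(num, axis=0), mask=valid)


def masked_softmax(scores: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    # scores: (n_rows, n_cols) float CUDA tensor; mask: same shape, 0/1 values.
    out = torch.empty_like(scores)
    n_rows, n_cols = scores.shape
    masked_softmax_kernel[(n_rows,)](scores, mask.to(torch.int32), out, n_cols,
                                     BLOCK_SIZE=triton.next_power_of_2(n_cols))
    return out
```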
Three examples of recommendation system pipelines with NVIDIA Merlin and Redis
:whale: Scripps Whale Acoustics Lab :earth_americas: Scripps Acoustic Ecology Lab - Triton with remoras in development
A step-by-step guide to setting up Nvidia GPUs with CUDA support running on Docker (and Compose) containers on NixOS host
⚡ Blazing fast audio augmentation in Python, powered by GPU for high-efficiency processing in machine learning and audio analysis tasks.
COIN Attacks: on Insecurity of Enclave Untrusted Interfaces in SGX - ASPLOS 2020
Symbolic debugging tool using JonathanSalwan/Triton
Adaptive Callsite-sensitive Control Flow Integrity - EuroS&P'19
Increase the inference speed of the model
Standalone static version of Triton's x86/x64 translator
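For context, a hedged sketch using the upstream JonathanSalwan/Triton Python bindings rather than this standalone translator: processing a single x86-64 instruction and printing the symbolic expressions produced for it. The instruction bytes are just an example.

```python
# Hedged sketch of the upstream JonathanSalwan/Triton Python bindings (not this
# standalone translator): lift one x86-64 instruction and print the symbolic
# expressions Triton builds for it.
from triton import ARCH, Instruction, TritonContext

ctx = TritonContext()
ctx.setArchitecture(ARCH.X86_64)

inst = Instruction(b"\x48\x01\xd8")   # add rax, rbx (raw opcode bytes)
ctx.processing(inst)                  # disassemble and build the semantics

print(inst.getDisassembly())
for expr in inst.getSymbolicExpressions():
    print(expr)
```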
Training-free Post-training Efficient Sub-quadratic Complexity Attention. Implemented with OpenAI Triton.