iree-org/iree-nvgpu Issues

[Epic] Production integration of cuBLAS, cuDNN, and Triton
Closed a year ago (6 comments)
[cuDNN] Add lowering from cudnn.relu + relu custom call
Closed a year ago (2 comments)
[Triton] Compute capability for hal.executable.source must match IREE cuda SM version
Updated a year ago
[Triton] Construct stream.executable from Triton functions
Updated a year ago
[Triton] Use nested passes to lower Triton to HAL executable
Updated a year ago
[cuDNN] Use semaphores to synchronize before/after cuDNN operations
Updated a year ago
[cuDNN] cuDNN module should create a dedicated CUDA stream
Updated a year ago
[Triton] Add minimal matmul example (no auto tuning)
Updated a year ago
[Triton] Triton executables should support shared memory
Updated a year ago
[Triton] Do not use temp files for passing PTX to HAL executable
Updated a year ago
[Triton] Block dimension should be inferred from the Triton function
Updated a year ago
[Triton] num-warps and num-stages should be a property of triton.executable.export
Updated a year ago
[cuDNN] Use destination-passing style @cudnn.execute API
Closed a year ago
Set up continuous integration for OpenXLA Nvgpu project
Updated a year ago (3 comments)
[RFC] First class Triton support in OpenXLA Nvgpu
Updated a year ago (6 comments)
[RFC] Integration with cuDNN via IREE compiler/runtime plugins
Updated a year ago (1 comment)