ptx

The repositories listed below are tagged with the ptx topic.

ashvardanian / less_slow.cpp
Playing around with "Less Slow" coding practices in C++20, C, CUDA, PTX, and Assembly, from numerics and SIMD to coroutines, ranges, exception handling, networking, and user-space I/O
google-benchmark cpp cpp17 gcc llvm benchmark cpp-programming tutorial tutorials assembly hpc avx512 coroutines ranges cpp20 assembly-language linux-kernel cuda io-uring ptx
Language:C++ 1870
m4rs-mt / ILGPU
ILGPU JIT Compiler for high-performance .Net GPU programs
amd cil compiler cpu cuda dotnet gpgpu gpgpu-computing gpu ilgpu intel jit kernels msil nvidia opencl parallel ptx
Language:C# 1629
tpoisonooo / how-to-optimize-gemm
row-major matmul optimization
gemm-optimization armv7 arm64 cuda cuda-kernel ptx vulkan int4
Language:C++ 684
coderonion / awesome-cuda-and-hpc
🚀🚀🚀 This repository lists some awesome public CUDA, cuda-python, cuBLAS, cuDNN, CUTLASS, TensorRT, TensorRT-LLM, Triton, TVM, MLIR, PTX and High Performance Computing (HPC) projects.
cuda cublas tensorrt awesome llm gpu blas pytorch hpc gemm llama cudnn triton tensorrt-llm cutlass mlir tvm deepseek ptx vlm
397
SunsetQuest / CudaPAD
CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels that provides an on-the-fly view of the assembly.
cuda cuda-programming gpu nvidia ptx ptx-utils windows
Language:C# 124
zamaudio / ptformat
Free software file format parser for Avid ProTools sessions
protools session interoperability ardour ptf ptx
Language:C++ 81
ProjectPhysX / PTXprofiler
A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
cuda gpu gpu-acceleration gpu-computing gpu-programming hpc opencl profiler ptx ptx-utils roofline-model sycl nvidia nvidia-cuda nvidia-gpu
Language:C++ 56
Energinet-SimTools / MTB
Energinet's Model Testbench. Automates grid compliance studies in PSCAD and PowerFactory.
powerfactory powersystem-simulation powersystems pscad renewable-energy solar-energy wind-energy gridcompliance ptx green-transition power-electronics power2x high-voltage hvdc powergrid dcc generator rfg
Language:Python 46
bikrammajhi / 100-days-of-GPU
This is my 🔥 100 Days of GPU — a wild, hands-on journey through CUDA/CUTLASS kernels, Triton spells, and PTX sorcery.
cuda nsight-compute ptx triton cutlass mojo thunderkittens
Language:HTML 34
lennyerik / cutransform
CUDA kernels in any language supported by LLVM
cuda gpgpu nvidia c llvm rust gpu-compute llvm-ir ptx zig
Language:Rust 30
wu-kan / GoPTX
GoPTX: Fine-grained GPU Kernel Fusion by PTX-level Instruction Flow Weaving
compile data-hazard gpu ilp kernel-fusion ptx warp-stall
Language:HTML 18
jjfumero / tornadovm-examples
Set of examples written for hardware acceleration via TornadoVM
tornadovm gpuprogramming fpga-programming gpu fpga parallel-computing spirv jvm java opencl ptx
Language:Java 17
rust-accel / nvptx
Compile Rust into PTX
build-tool llvm ptx
Language:Rust 14
jhson989 / cuda-ptx
Inline PTX Assembly in CUDA example
cuda ptx parallel-computing matrix-multiplication
Language:Cuda 13
akrolik / rNdN
Optimizing GPU compiler and database system for NVIDIA hardware
ptx cuda gpu compiler horseir sass assembler database sql
Language:C++ 12
VeriBlock / nodecore-pow-cuda-miner
VeriBlock CUDA PoW Miner
cuda ptx vblake veriblock
Language:Cuda 9
sdiehl / gpu-offload
Compile MLIR to PTX and execute it on NVIDIA GPUs
cuda gpu mlir ptx
Language:Jupyter Notebook 8
madssakre / blOCh
Bloch's equations and Optimal Control for MRI and NMR applications
mri nmr ptx parallel-computing rf-pulses
Language:MATLAB 6
dabosch / FastPtx
FastPtx: a python pTx pulse design tool for freely optimizing RF and gradient pulses with autodifferentiation
mri ptx pulse-design
Language:Python 5
gkrls / vscode-ptx-syntax
Visual Studio Code extension with PTX assembly syntax support
ptx vscode-extension syntax-highlighting
4
DefTruth / PTX-ISA-8.2-zh
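🎉 Continuously updated: CUDA 12.2 PTX-ISA-8.2 study notes, with partial Chinese translation, personal commentary, and inline assembly examples, explaining the CUDA 12.2 PTX-ISA-8.2 assembly instructions; in progress.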
asm cpp cuda ptx
3
minchao / go-ptx
Unofficial Golang client library for the Public Transport Data eXchange (PTX) integrated information service platform
ptx go-library swagger
Language:Go 3
romnn / nvbit-rs
Rust bindings to the NVIDIA NVBIT binary instrumentation API
cuda ffi gpgpu instrumentation nvbit nvidia profiling ptx rust sass tracing
Language:Rust 3
bestpika / bike
youbike ubike ptx
Language:HTML 2
danwyvra / linux-ptx
Compile Linux to PTX / NVIDIA ISA and run it sequentially on a single GPU core, or parallelize the code and run it on multiple cores; a feasibility check
compiler gpu linux llvm nvidia ptx research
2
witty3235 / ptformat
Free software file format parser for Avid ProTools sessions
ptf protools avid ptx
Language:C++ 2
maawad / PTX_BCHT
Bucketed Cuckoo hash set written in PTX and JIT-compiled.
cuckoo hash ptx gpu cuda hashset
Language:C++ 1
MetaMachines / mm-ptx-py
PTX Inject and Stack PTX for Python
cuda gpu-programming nvidia ptx ptxinject stackptx
Language:C 1
parnox / unsloth-notes
Unsloth Puzzle 2-16. Notes and indications of progress. Currently: 25 points
ptx puzzle pytorch triton unsloth
1
strboul / kwic-ts
kwic permuted-index ptx text-processing typescript
Language:TypeScript 1
cs550-epfl / report
EPFL CS-550 project report
cuda gpu memory-consistency ptx simt formal-verification
Language:TeX 0
cs550-epfl / review
Review of the paper A Formal Analysis of the NVIDIA PTX Memory Consistency Model
cuda formal-verification gpu memory-consistency ptx simt
Language:TeX 0
jinekgames / PTX4CPU
PTX interpreter which lets you run CUDA code on CPU
cuda emulator interpreter ptx gpgpu gpgpu-computing
Language:C++ 0
JNPRAutomate / junos-software-upgrades
Repository built for community contributions for upgrading Junos OS.
acx ex mx nfx ptx qfx srx
Language:Jinja 0
MetaMachines / mm-ptx
PTX Inject and Stack PTX
assembly c c99 cpp cuda gpu machine-learning non-linear-dynamics ptx scientific-computing
Language:C
Nagharjun17 / MLIR-to-PTX-CUDA
Creating an MLIR dialect that fuses Addition + ReLU, lowers to NVVM and LLVM IR and generates PTX to run the kernel on CUDA GPU
cpp cuda deep-learning llvm mlir ptx
Language:C++
