There are 3 repositories under the acceleration topic.
The mouse and trackpad utility for Mac.
A BVH implementation to speed up raycasting and enable spatial queries against three.js meshes.
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
[ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment
[ICLR 2019] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.
Channel Pruning for Accelerating Very Deep Neural Networks (ICCV'17)
A developer-friendly approach for sensors in React Native
[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
Reduce CPU usage with a non-blocking async loop and improve perceived speed in JavaScript
The CDN for developers.
[TMLR 2025🔥] A survey of autoregressive models in vision.
Display-agnostic acceleration of macOS applications using external GPUs.
The New Official Aparapi: a framework for executing native Java and Scala code on the GPU.
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
A high-speed stepper library for ATmega168/328P (Nano), ATmega32U4, ATmega2560, ESP32, ESP32S2, ESP32S3, ESP32C3, ESP32C6, Atmel SAM Due, Raspberry Pi Pico and Pico 2
volksdep is an open-source toolbox for deploying and accelerating PyTorch, ONNX and TensorFlow models with TensorRT.
[NeurIPS 2022, T-PAMI 2023] Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models
A .NET library for hardware-accelerated, high-performance, immediate-mode rendering via Direct2D.
PyTorch implementation of our paper accepted by CVPR 2020 (Oral) -- HRank: Filter Pruning using High-Rank Feature Map
[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
A reading list for deep graph learning acceleration.
[MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration
Benchmarking suite to evaluate 🤖 robotics computing performance. Vendor-neutral. ⚪Grey-box and ⚫Black-box approaches.
A robot-specific processing unit. Contains CPUs, FPGAs and GPUs and maps ROS efficiently to them for best performance.
A free, header-only C++ swarming (flocking) library for real-time applications
VDI Stream Client is a tiny, low-latency, GPU-accelerated client for connecting to Windows machines running Parsec Host.
A package for processing signals recorded using wearable sensors, such as Electrocardiogram (ECG), Photoplethysmogram (PPG), Electrodermal activity (EDA) and 3-axis acceleration (ACC).
Curated list of awesome material on optimization techniques to make artificial intelligence faster and more efficient 🚀
Repository to track the progress in model compression and acceleration
Intel Homomorphic Encryption Acceleration Library for FPGAs, including open source implementation of FPGA kernels for accelerating NTT, INTT, Keyswitch and Dyadic Multiplication modular arithmetic operations, FPGA runtime, and host APIs for connecting to third-party homomorphic encryption libraries.
Experimental VP9 codec support for vdpau-va-driver (NVIDIA VDPAU-VAAPI wrapper) and chromium-vaapi
Kernel module for mouse acceleration on Linux!