There are 4 repositories under the simd-parallelism topic.
PyTurboJPEG is a highly optimized Python wrapper of libjpeg-turbo (TurboJPEG API) which supports x86 and ARM architectures.
DSL for SIMD Sorting on AVX2 & AVX512
Two-dimensional flow solver with GUI using vortex particle and boundary element methods
n-body-simulation performance test suite
GPU-accelerated 3D vortex methods solver with easy GUI
A High Performance C# wrapper that allows you to get the benefits of SIMD Intrinsics on List<T>.
System benchmarks over JVM with JMH - SIMD (superscalar processing), Branch prediction, False sharing.
EinsteinDB is a Hybrid memory system consisting of DRAM and Non-Volatile Memory configured to persist data fast.
(experiments with) pragma-based SIMD C++ types
Optimizing convolution function using ARM's NEON Intrinsics
This repository lists 4 problems solved using C. Each problem has its own serial and parallel implementations. For the latter, the OpenMP API was utilized.
Image filters using SSE Instructions (Streaming SIMD Extensions) of Intel® x86-64 Architecture.
8x speedup of the 1D Haar transform using Intel SIMD intrinsics
An implementation of the dot product for 32-bit integers in C, using CUDA, x64, and SIMD.
A fast and simple C# hex-decode function using AVX2 and SSSE3 Intel intrinsics.
In this project we modify the code of the Smith-Waterman algorithm to achieve parallel computing in different ways. University project for the course "Parallel Processing". Course Code: CEID_NY408
Computing a function when only its inverse is known, using the Newton-Raphson method for 1D, 2D, and 3D arrays in parallel.
AVX SIMD accelerated Julia fractal explorer, 7 beautiful sets
Deep learning convolutional neural network implemented with SIMD acceleration (auto-vectorization)
"Byteslice: Pushing the envelop of main memory data processing with a new storage layout" (SIGMOD'15)
C & Assembly optimized version of the Stochastic Gradient Descent x SoftSVM x Polynomial Kernel Method algorithm
CMap2 Top Coder Data Science Marathon Match
Examples of Distributed-Memory Programming with MPI
High Performance Computing exercises