There are 3 repositories under the avx-512 topic.
Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, and StarRocks
A curated list of awesome SIMD frameworks, libraries and software
A general purpose machine code manipulation library for x86-32 (IA-32) and x86-64 (AMD64) architectures (Assembler, Disassembler, Library).
(REOS) Radar and ElectroOptical Simulation Framework written in Fortran.
The fastest Run-Length-Encoding on the Planet (for x64)
Algorithms for matrix-matrix multiplication (dgemm) using AVX-256 and AVX-512
Benchmark to show which is the fastest memcpy.
Utility that was used to generate the initial Go AVX-512 encoder test suite.
Running GPGPU-like kernels on CPU with auto-vectorization for SSE/AVX/AVX512 SIMD Architectures
Document-level sentiment analysis: an end-to-end deep learning workflow that uses the Hugging Face Transformers API to perform a document-level classification task, analyzing the sentiment of input documents containing English sentences or paragraphs.
A generic and efficient SIMD implementation of MSB radix sort in C++, with separate key and payload data streams and support for arbitrary key and payload data types. Accompanied by a bachelor's thesis.
Implementation of Hierarchy Oblivious Algorithms
Data for Intel Xeon-Phi server used in PyAF tests
Projects and annotations used to learn x64 assembly.
An implementation of Google's Encoded Polyline algorithm in AVX512 because why not. Perhaps the fastest and least portable polyline encoder out there?
Fast Fourier Transform implementation through the x86 AVX-512 SIMD extension
Matilda is a library to repeatedly multiply a constant matrix with a variable vector
Experimental speed-oriented DEFLATE implementation, based on AVX-512
The Tomato Patch FFT is the fastest FFT in the world, but it is by no means efficient.
Zbynek's various C and C++ experiments
Design of the Fast-Orbit Feedback correction for SESAME's accelerator
Some loose performance experiments with Agner Fog's VCL