There are 5 repositories under the avx2 topic.
Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
Visual Studio extension for assembly syntax highlighting and code completion in assembly files and the disassembly window
Fast inference engine for Transformer models
Implementations of SIMD instruction sets for systems which don't natively support them.
DirectXMath is an all-inline SIMD C++ linear algebra library for use in games and graphics apps
Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, Redpanda, YDB and StarRocks
Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension, LoongArch64, POWER. Part of Node.js, WebKit/Safari, Ladybird, Chromium, Cloudflare Workers and Bun.
Up to 200x Faster Dot Products & Similarity Metrics for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD on AVX2, AVX-512, NEON, SVE, and SVE2 📐
C++ template library for high-performance SIMD-based sorting algorithms
Native Go version of HighwayHash with optimized assembly implementations on Intel and ARM. Able to process over 10 GB/sec on a single core on Intel CPUs - https://en.wikipedia.org/wiki/HighwayHash
Library for specialized dense and sparse matrix operations, and deep learning primitives.
Fastest Integer Compression
C++ SIMD Noise Library
SIMD-accelerated UTF-8 validation for Rust.
Repo to serve AVX2 Windows builds of Thorium. https://github.com/Alex313031/Thorium/
🚀 Fast C/C++ bit population count library
SIMD (SSE) population count --- http://0x80.pl/articles/sse-popcount.html
Agenium Scale vectorization library for CPUs and GPUs
Turbo Base64 - Fastest Base64 SIMD:SSE/AVX2/AVX512/Neon/Altivec - Faster than memcpy!
TurboRLE - Fastest Run Length Encoding
SIMD (SWAR/SSE/SSE4/AVX2/AVX512F/ARM Neon) implementations of a modified Karp-Rabin algorithm
Chromium browser compiled with the Clang/LLVM compiler.
Examples of C# code compiled to GPU by hybridizer
Boost SIMD
Node.js implementation of HighwayHash, Google's fast and strong hash function
Accelerate aggregated MD5 hashing performance up to 8x for AVX512 and 4x for AVX2. Useful for server applications that need to compute many MD5 sums in parallel.