WojciechMula / sse4-strstr

SIMD (SWAR/SSE/SSE4/AVX2/AVX512F/ARM Neon) implementations of a modified Karp-Rabin algorithm

Home Page: http://0x80.pl/articles/simd-strfind.html

Repository from Github: https://github.com/WojciechMula/sse4-strstr

SIMD-friendly algorithms for substring searching

Sample programs for the article "SIMD-friendly algorithms for substring searching" (http://0x80.pl/articles/simd-strfind.html).

The root directory contains C++11 procedures implemented using intrinsics for SSE, SSE4, AVX2, AVX512F, AVX512BW and ARM Neon (both ARMv7 and ARMv8).
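For orientation, here is a minimal portable sketch of the kind of search these procedures perform. It is not code from this repository. It assumes the technique is the first-and-last-byte filter: compare the first and last byte of the needle against several haystack positions at once, then confirm each surviving candidate with memcmp. The sketch uses SWAR on 64-bit words instead of intrinsics so that it compiles anywhere; the name swar_find is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper for illustration only (not part of the repository).
// Returns the index of the first occurrence of needle in haystack,
// or std::string::npos if there is none.
std::size_t swar_find(const std::string& haystack, const std::string& needle) {
    const std::size_t n = haystack.size();
    const std::size_t k = needle.size();
    if (k == 0) return 0;
    if (k > n) return std::string::npos;

    const std::uint64_t ones = 0x0101010101010101ULL;
    const std::uint64_t low7 = 0x7f7f7f7f7f7f7f7fULL;
    const std::uint64_t high = 0x8080808080808080ULL;

    // Broadcast the first and last needle bytes to all 8 byte lanes.
    const std::uint64_t first = ones * static_cast<unsigned char>(needle.front());
    const std::uint64_t last  = ones * static_cast<unsigned char>(needle.back());

    const char* s = haystack.data();
    std::size_t i = 0;

    // Each iteration tests 8 candidate positions i..i+7.
    // The second load ends at index i + k + 6, hence the bound below.
    for (; i + k + 7 <= n; i += 8) {
        std::uint64_t a, b;
        std::memcpy(&a, s + i, 8);          // bytes aligned with needle[0]
        std::memcpy(&b, s + i + k - 1, 8);  // bytes aligned with needle[k-1]

        // A lane of diff is zero only if both the first and last byte match.
        const std::uint64_t diff = (a ^ first) | (b ^ last);

        // Exact zero-byte test: sets the high bit of every zero lane.
        const std::uint64_t zero = ~(((diff & low7) + low7) | diff) & high;
        if (zero == 0) continue;

        // Copy the flags back to bytes so lane order matches memory order
        // regardless of endianness.
        unsigned char flags[8];
        std::memcpy(flags, &zero, 8);
        for (std::size_t j = 0; j < 8; ++j) {
            if (flags[j] != 0 && std::memcmp(s + i + j, needle.data(), k) == 0) {
                return i + j;
            }
        }
    }

    // Scalar tail for the positions the 8-wide loop could not cover.
    for (; i + k <= n; ++i) {
        if (std::memcmp(s + i, needle.data(), k) == 0) return i;
    }
    return std::string::npos;
}
```

A call such as swar_find(text, pattern) is meant to agree with text.find(pattern). Real SIMD versions would typically replace the 64-bit words with vector registers and the flag loop with a bit scan over a comparison mask.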

The subdirectory original contains 32-bit programs with inline assembly, written in 2008 for another article.


To run unit and validation tests, type make test_ARCH; to run performance tests, type make run_ARCH. The value of ARCH selects the CPU architecture:

  • sse4,
  • avx2,
  • avx512f,
  • avx512bw,
  • arm,
  • aarch64.

Performance results

The subdirectory results contains raw timings from various computers.




License: BSD 2-Clause "Simplified" License


Languages: C++ 81.3%, C 13.4%, Makefile 3.8%, Python 1.5%, Shell 0.0%