ashvardanian / StringZilla

Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging SWAR and SIMD on Arm Neon and x86 AVX2 & AVX-512-capable chips to accelerate search, sort, edit distances, alignment scores, etc 🦖

Home Page: https://ashvardanian.com/posts/stringzilla/

Broader benchmarks

ashvardanian opened this issue

Every heuristic has its weaknesses, and the current benchmarks do little to expose them. bench.py should accept command-line arguments for various patterns and, when none are provided, default to a diverse set of use cases, printing the final results to the console.
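A minimal sketch of what such a command-line entry point might look like, assuming a plain argparse interface. The names `DEFAULT_PATTERNS`, `parse_args`, `count_occurrences`, `run`, and `main` are hypothetical and not taken from the actual bench.py, and Python's built-in `str.find` stands in for whichever search function is being measured.

```python
import argparse
import time

# Hypothetical default patterns: short, common, long, and absent needles.
DEFAULT_PATTERNS = ["a", "the", "benchmark", "zzzzzzzz"]


def parse_args(argv=None):
    """Parse patterns from the command line, falling back to the defaults."""
    parser = argparse.ArgumentParser(description="Substring search benchmark sketch")
    parser.add_argument("patterns", nargs="*", help="patterns to search for")
    args = parser.parse_args(argv)
    if not args.patterns:
        args.patterns = list(DEFAULT_PATTERNS)
    return args


def count_occurrences(haystack, needle):
    """Count (possibly overlapping) occurrences of needle using str.find."""
    count, start = 0, 0
    while True:
        index = haystack.find(needle, start)
        if index == -1:
            return count
        count += 1
        start = index + 1


def run(patterns, haystack):
    """Return (pattern, match count, elapsed seconds) for every pattern."""
    results = []
    for pattern in patterns:
        began = time.perf_counter()
        matches = count_occurrences(haystack, pattern)
        elapsed = time.perf_counter() - began
        results.append((pattern, matches, elapsed))
    return results


def main(argv=None, haystack="the quick brown fox jumps over the lazy dog"):
    """Run the benchmark and print one line per pattern to the console."""
    args = parse_args(argv)
    results = run(args.patterns, haystack)
    for pattern, matches, elapsed in results:
        print(f"{pattern!r}: {matches} matches in {elapsed:.6f} s")
    return results
```

Calling `main()` with no arguments covers the default patterns, while `main(["fox", "dog"])` mimics passing patterns on the command line.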

I've split the benchmarks into separate categories: similarity functions, search operations, basic class interfaces, and so on. They will support both user-provided input as a text file (tokenized using the 6 ASCII whitespace characters as delimiters) and synthetic runs.
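As a rough illustration of the two input modes, the sketch below tokenizes text on the six ASCII whitespace characters (space, tab, newline, vertical tab, form feed, carriage return) and generates a synthetic token list as an alternative. The helper names `tokenize` and `synthetic_tokens`, as well as the alphabet and length parameters, are assumptions for illustration and not part of the actual benchmarks. An explicit character class is used because Python's argument-less `str.split()` also splits on other characters, such as `\x1c`.

```python
import random
import re

# The six ASCII whitespace characters used as delimiters.
ASCII_WHITESPACE = re.compile("[ \t\n\v\f\r]+")


def tokenize(text):
    """Split text on runs of ASCII whitespace, dropping empty tokens."""
    return [token for token in ASCII_WHITESPACE.split(text) if token]


def synthetic_tokens(count, min_length=1, max_length=8, alphabet="abcd", seed=0):
    """Generate a reproducible list of random tokens (hypothetical parameters)."""
    rng = random.Random(seed)
    return [
        "".join(rng.choice(alphabet) for _ in range(rng.randint(min_length, max_length)))
        for _ in range(count)
    ]
```

A file-based run could then call `tokenize(open(path).read())`, where `path` is whatever file the user provides, and a synthetic run could call `synthetic_tokens(1000)`.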