Performance: PGO applicability
zamazan4ik opened this issue
Description
I suggest adding PGO (and BOLT) support to the project. In my local tests with sky-bench -q1000000 -r20, I got
the following results (local runs on an Apple MacBook M1 Pro):
Without PGO (default release build):
===========RESULTS===========
SET 140810.697866/sec
UPDATE 146618.948096/sec
GET 137469.544488/sec
=============================
With PGO:
===========RESULTS===========
SET 153383.657977/sec
UPDATE 156045.351506/sec
GET 159065.787814/sec
=============================
The results vary a little between runs, but the PGO build was always faster than the non-PGO build. LLVM BOLT could also help here, but I haven't tested it yet.
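For reference, the relative gains implied by the two result blocks above can be computed directly from the posted throughput numbers. This is only a quick arithmetic sketch: the dictionary and function names are made up for illustration, and the figures come from the single pair of runs quoted above, so they don't capture run-to-run variance.

```python
# Throughput in operations/sec, copied from the results posted above.
baseline = {"SET": 140810.697866, "UPDATE": 146618.948096, "GET": 137469.544488}
with_pgo = {"SET": 153383.657977, "UPDATE": 156045.351506, "GET": 159065.787814}


def speedup_percent(before, after):
    """Relative throughput gain of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Per-operation gain of the PGO build over the default release build.
gains = {op: speedup_percent(baseline[op], with_pgo[op]) for op in baseline}

for op, gain in gains.items():
    print(f"{op}: +{gain:.1f}%")
```

On these numbers the gain works out to roughly 9% for SET, 6% for UPDATE, and 16% for GET.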
Suggested solutions
- Optimize the binaries provided to users with PGO and BOLT (if any are provided)
- Write a note in the project about optimizing Skytable with PGO to gain even more performance
More PGO-related benchmark results (for many databases, including Redis, PostgreSQL, and ClickHouse) are available here: https://github.com/zamazan4ik/awesome-pgo
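A documentation note along these lines could outline the generic rustc PGO cycle. The sketch below only assembles and prints the shell commands. It assumes a standard Cargo release build, the rustc flags -Cprofile-generate and -Cprofile-use, an llvm-profdata binary compatible with the compiler's LLVM version, and that the instrumented server is started before the benchmark is run against it. The helper name and the profile directory are hypothetical, and this is not Skytable's actual build pipeline.

```python
def pgo_commands(profile_dir="/tmp/pgo-data", workload="sky-bench -q1000000 -r20"):
    """Return the shell commands for one generic rustc PGO cycle (hypothetical helper)."""
    merged = f"{profile_dir}/merged.profdata"
    return [
        # 1. Build an instrumented release binary that writes raw profiles at run time.
        f'RUSTFLAGS="-Cprofile-generate={profile_dir}" cargo build --release',
        # 2. Run a representative workload against the instrumented server
        #    (assumption: the server has already been started separately).
        workload,
        # 3. Merge the raw profiles into a single .profdata file.
        f"llvm-profdata merge -o {merged} {profile_dir}",
        # 4. Rebuild the release binary using the merged profile.
        f'RUSTFLAGS="-Cprofile-use={merged}" cargo build --release',
    ]


if __name__ == "__main__":
    for step in pgo_commands():
        print(step)
```

The workload in step 2 matters most: the profile only helps if it resembles what the server does in production, which is why recompiling against one's own workload can beat a generic prebuilt binary.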
Thanks for the note. We can surely incorporate this into the build pipeline (although it'll need a bunch of changes).
@ohsayan Do you have updates regarding PGO in Skytable?
Since integrating PGO into the build pipeline could take some time, I suggest at least adding a note about PGO and Skytable somewhere in the documentation. That way, users and maintainers will know about this way to get better performance from Skytable and will be able to recompile it for their own workloads. Here are some examples:
- ClickHouse: https://clickhouse.com/docs/en/operations/optimizing-performance/profile-guided-optimization
- Databend: https://databend.rs/doc/contributing/pgo
- Vector: https://vector.dev/docs/administration/tuning/pgo/
- Nebula: https://docs.nebula-graph.io/3.5.0/8.service-tuning/enable_autofdo_for_nebulagraph/
- GCC: Official docs, section "Building with profile feedback" (even AutoFDO build is supported)
- Clang: