As part of a research project conducted under the guidance of Massimo Coluzzi, a Java framework was developed for benchmarking state-of-the-art consistent hashing algorithms. To explore how these algorithms perform in other languages, I was tasked with creating a C++ version. This version currently covers only a subset of what the ISIN framework considers, but the tool is designed to be extensible so that new algorithms and benchmarks are easy to add. The software stays consistent with the behavior of its Java counterpart: it runs from a command-line interface, reads its parameters from YAML files, and saves the benchmark results to a CSV file.
The following algorithms play a crucial role in distributed systems and are among the best-known state-of-the-art consistent hashing methods:
- Jump by Lamping et al. (2014)
- Anchor by Mendelson et al. (2020)
- Dx by Dong et al. (2021)
- Power by Leu et al. (2023)
- Memento by Coluzzi et al. (2023)
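
To give a feel for what such an engine computes, here is a minimal sketch of the Jump lookup, transcribed from the pseudocode in the Jump paper. The function name `jumpHash` is an assumption made for this example; the engine in this repository may be structured differently.

```cpp
#include <cassert>
#include <cstdint>

// Jump consistent hash, following the pseudocode published in the Jump paper.
// Maps a 64-bit key to a bucket index in [0, numBuckets).
// The name jumpHash is hypothetical and not taken from this repository.
std::int32_t jumpHash(std::uint64_t key, std::int32_t numBuckets) {
    assert(numBuckets > 0);
    std::int64_t b = -1;  // last bucket the key jumped to
    std::int64_t j = 0;   // candidate for the next jump
    while (j < numBuckets) {
        b = j;
        // Linear congruential step: derives the next pseudo-random value from the key.
        key = key * 2862933555777941757ULL + 1;
        // Jump forward by a pseudo-random distance.
        j = static_cast<std::int64_t>(
            static_cast<double>(b + 1) *
            (static_cast<double>(1LL << 31) / static_cast<double>((key >> 33) + 1)));
    }
    return static_cast<std::int32_t>(b);
}
```

When the cluster grows from `n` to `n + 1` buckets, a key either keeps its bucket or moves to the new bucket `n`, which is the property the Monotonicity benchmark checks.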
Some engines are not yet implemented in this project but are already configured for execution in the benchmark routine:
- Ring by Karger et al. (1997)
- Rendezvous by Thaler et al. (1998)
- Multi-probe by Appleton et al. (2015)
- Maglev by Eisenbud et al. (2016)
The benchmarks follow "Consistently Faster: A Survey and Fair Comparison of Consistent Hashing Algorithms" by Coluzzi et al. (2023). Here is a concise overview:
- Balance: The ability of the algorithm to spread the keys evenly across the cluster nodes.
- Initialization Time: The time the algorithm requires to initialize its internal structure.
- Lookup Time: The time the algorithm needs to find the node a given key belongs to.
- Memory Usage: The amount of memory the algorithm uses to store its internal structure.
- Monotonicity: The ability of the algorithm to move the minimum amount of resources when the cluster scales.
- Resize Balance: The ability of the algorithm to keep its balance after adding or removing nodes.
- Resize Time: The time the algorithm requires to reorganize its internal structure after adding or removing nodes.
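
As an illustration of the Balance idea, the sketch below counts how many keys land on each node and reports the ratio between the most loaded node and the mean load. This is one common way to quantify balance; the exact metric used by the survey and by this framework may differ, and the name `balanceRatio` is an assumption made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical helper, not part of this repository.
// Returns (load of the busiest node) / (mean load per node).
// 1.0 means a perfectly even spread; larger values mean worse balance.
double balanceRatio(const std::function<std::size_t(std::uint64_t)>& lookup,
                    std::size_t nodes, std::uint64_t keys) {
    assert(nodes > 0 && keys > 0);
    std::vector<std::uint64_t> load(nodes, 0);
    for (std::uint64_t k = 0; k < keys; ++k) {
        const std::size_t node = lookup(k);  // ask the engine where the key goes
        assert(node < nodes);
        ++load[node];
    }
    const double mean = static_cast<double>(keys) / static_cast<double>(nodes);
    const std::uint64_t heaviest = *std::max_element(load.begin(), load.end());
    return static_cast<double>(heaviest) / mean;
}
```

Passing the lookup as a callable keeps the helper independent of any particular engine: a lambda wrapping an engine's lookup method is enough to measure it.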
The format of the configuration file is described in detail in the configs/template.yaml file. If no configuration file is provided, the tool uses configs/default.yaml, which represents the default configuration.
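
The fallback rule can be sketched as follows. The helper name `resolveConfigPath` is hypothetical, and the sketch assumes the custom file is passed as the first command-line argument and used as given; how the tool actually resolves the path may differ.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of this repository.
// Uses argv[1] as the configuration path when present (assumption),
// otherwise falls back to the default configuration file.
std::string resolveConfigPath(int argc, const char* const argv[]) {
    assert(argv != nullptr);
    const std::string defaultPath = "configs/default.yaml";
    return argc > 1 ? std::string(argv[1]) : defaultPath;
}
```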
Figure 1 shows a UML sequence diagram of how the benchmark routine works.
Figure 1: Exploring the control flow of the benchmark routine.
To build and run the framework:

- Clone the repository and navigate into it:

      git clone https://github.com/robertovicario/cpp-consistent-hashing-algorithms.git
      cd cpp-consistent-hashing-algorithms

- Run the repository setup:
  - vcpkg:

        # Ensure the script has executable permissions:
        # chmod +x repo.sh
        ./repo.sh

  - CMake:

        # Ensure the script has executable permissions:
        # chmod +x cmake.sh
        ./cmake.sh

- Build the project with Ninja:

      cd build
      ninja

- Start the framework:
  - Default configuration:

        ./main

  - Custom configuration:

        ./main <your_config>.yaml

- Navigate to build/tmp/ and check the results.csv file.

To add a new algorithm:

- Insert the algorithm name into any configuration file located in configs/.
- Implement your algorithm in Algorithms/your_algo/. Keep in mind that the system employs C++ templates to integrate the algorithms into the loop.
- Integrate a new execution routine into Main.cpp. Append a new else if branch and incorporate your engine using:

      /*
       * NEW_ALGORITHM
       */
      execute<YourEngine>(handler, yaml, "your_algo");

  If your engine requires additional parameters, include them as follows:

      /*
       * NEW_ALGORITHM
       */
      execute<YourEngine>(handler, yaml, "your_algo", param1, param2, ..., paramN);
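
The dispatch pattern described in these steps can be sketched in isolation as shown below. Everything in the sketch is a simplified assumption: the stand-in engines, the `getBucket` method, and the reduced `execute` signature (the real one also takes `handler` and `yaml`) are hypothetical and only illustrate how an else if chain selects a template instantiation and forwards extra parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>

// Stand-in engines (hypothetical); the real ones live under Algorithms/.
struct JumpEngine {
    std::size_t getBucket(std::uint64_t key) const { return key % 8; }  // placeholder logic
};
struct YourEngine {
    explicit YourEngine(std::size_t nodes) : nodes_(nodes) {}
    std::size_t getBucket(std::uint64_t key) const { return key % nodes_; }
    std::size_t nodes_;
};

// Simplified execute: builds the engine from the forwarded parameters and runs one lookup.
template <typename Engine, typename... Args>
std::size_t execute(const std::string& algorithm, std::uint64_t key, Args&&... args) {
    assert(!algorithm.empty());
    Engine engine{std::forward<Args>(args)...};
    return engine.getBucket(key);
}

// The else if chain: the configured name picks the template instantiation.
std::size_t dispatch(const std::string& algorithm, std::uint64_t key) {
    if (algorithm == "jump") {
        return execute<JumpEngine>(algorithm, key);
    } else if (algorithm == "your_algo") {
        /* NEW_ALGORITHM: extra constructor parameters go after the name */
        return execute<YourEngine>(algorithm, key, std::size_t{5});
    }
    throw std::invalid_argument("unknown algorithm: " + algorithm);
}
```

Because the engine type is a template parameter, each branch is resolved at compile time, and an engine only needs to offer the members that the routine actually calls.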

To add a new benchmark:

- Insert the benchmark name into any configuration file located in configs/.
- Implement the benchmark in Benchmarks/. Note that the system utilizes C++ templates for benchmark integration into the loop.
- Integrate a new benchmark routine into Benchmarks/Routine.hpp. Append a new else if branch and incorporate your benchmark using:

      /*
       * NEW_BENCHMARK
       */
      printInfo(l, algorithm, benchmark, hashFunction, initNodes, iterationsRun);
      results[l] = computeYourBenchmark<Engine>(yaml, algorithm, initNodes, args...);
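
A templated benchmark function in the spirit of `computeYourBenchmark<Engine>` might look like the sketch below, which measures the mean time per lookup. The names `computeLookupTime`, `ModuloEngine`, and `getBucket` are assumptions made for this example, and the reduced parameter list leaves out the `yaml` and `algorithm` arguments of the real routine.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>

// Stand-in engine (hypothetical) so the sketch compiles on its own.
struct ModuloEngine {
    explicit ModuloEngine(std::size_t nodes) : nodes_(nodes) {}
    std::size_t getBucket(std::uint64_t key) const { return key % nodes_; }
    std::size_t nodes_;
};

// Hypothetical benchmark: mean wall-clock time per lookup, in nanoseconds.
template <typename Engine>
double computeLookupTime(std::size_t initNodes, std::uint64_t lookups) {
    assert(initNodes > 0 && lookups > 0);
    Engine engine(initNodes);
    std::uint64_t sink = 0;  // accumulate results so the loop has an observable effect
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t key = 0; key < lookups; ++key) {
        sink += engine.getBucket(key);
    }
    const auto stop = std::chrono::steady_clock::now();
    volatile std::uint64_t keep = sink;  // discourage the compiler from removing the loop
    (void)keep;
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(lookups);
}
```

A call such as `computeLookupTime<ModuloEngine>(16, 100000)` returns one number per run, which fits the one-result-per-iteration shape of the `results[l] = ...` assignment above.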

This project is distributed under the GNU General Public License version 3. You can find the complete text of the license in the project repository.
Important: this project draws on the following third-party repositories.
- java-consistent-hashing-algorithms:
- Author: SUPSI-DTI-ISIN
- License: GNU General Public License version 3
- Source: GitHub Repository
- cpp-anchorhash:
- Author: anchorhash
- License: The MIT License
- Source: GitHub Repository
- DxHash:
- Author: ChaosD
- License: none
- Source: GitHub Repository
- Supervisor: Amos Brocco @slashdotted
- Student: Roberto Vicario @robertovicario