robertovicario / cpp-consistent-hashing-algorithms

C++ implementations of the most popular and best performing consistent hashing algorithms for non-peer-to-peer contexts.

cpp-consistent-hashing-algorithms

Overview

As part of a research project conducted under the guidance of Massimo Coluzzi, a Java framework was developed for benchmarking state-of-the-art consistent hashing algorithms. To explore how these algorithms perform in other languages, I was tasked with creating a C++ version. This version currently covers only part of the technologies considered in the ISIN framework. However, the tool is designed to be extensible, so new algorithms and benchmarks can be added easily. Its behavior remains consistent with that of its Java counterpart: it runs from a command-line interface, reads its parameters from YAML files, and saves benchmark results to a CSV file.

Algorithms

These algorithms play a crucial role in distributed systems and are among the best-known state-of-the-art consistent hashing methods:

Some engines are not yet implemented in this project but are already configured for execution in the benchmark routine:

Benchmarks

As outlined in "Consistently Faster: A Survey and Fair Comparison of Consistent Hashing Algorithms" by Coluzzi et al. (2023), here is a concise overview of the benchmarks utilized:

  • Balance: The ability of the algorithm to spread the keys evenly across the cluster nodes.

  • Initialization Time: The time the algorithm requires to initialize its internal structure.

  • Lookup Time: The time the algorithm needs to find the node a given key belongs to.

  • Memory Usage: The amount of memory the algorithm uses to store its internal structure.

  • Monotonicity: The ability of the algorithm to move the minimum amount of resources when the cluster scales.

  • Resize Balance: The ability of the algorithm to keep its balance after adding or removing nodes.

  • Resize Time: The time the algorithm requires to reorganize its internal structure after adding or removing nodes.
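As an illustration of the Balance metric, the sketch below counts how many keys land on each node and reports the ratio between the most loaded node and the mean load. Both the ratio (1.0 meaning a perfectly even spread) and the helper names `keysPerNode` and `balanceRatio` are assumptions made for this example; the framework and the survey may define and compute balance differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical helper: counts how many of the keys 0..keyCount-1 the
// `lookup` callable assigns to each of `nodeCount` nodes.
std::vector<std::size_t> keysPerNode(
    const std::function<std::size_t(std::uint64_t)>& lookup,
    std::size_t nodeCount, std::uint64_t keyCount) {
    assert(nodeCount > 0);
    std::vector<std::size_t> counts(nodeCount, 0);
    for (std::uint64_t key = 0; key < keyCount; ++key) {
        const std::size_t node = lookup(key);
        assert(node < nodeCount);  // the lookup must return a valid node index
        ++counts[node];
    }
    return counts;
}

// Assumed balance measure: (largest load) / (mean load).
// A value of 1.0 means every node holds exactly the same number of keys.
double balanceRatio(const std::vector<std::size_t>& counts) {
    assert(!counts.empty());
    std::size_t total = 0;
    for (std::size_t c : counts) {
        total += c;
    }
    assert(total > 0);
    const double mean = static_cast<double>(total) / static_cast<double>(counts.size());
    const std::size_t maxLoad = *std::max_element(counts.begin(), counts.end());
    return static_cast<double>(maxLoad) / mean;
}
```

Passing any lookup callable, for instance a lambda wrapping an engine's lookup method, to `keysPerNode` and feeding the result to `balanceRatio` gives a single number that can be compared across algorithms.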

Configuration

The configuration file format is described in detail in configs/template.yaml. If no configuration file is provided, the tool uses configs/default.yaml, which holds the default configuration.

Control Flow

Figure 1 is a UML sequence diagram showing how the benchmark routine works.

Figure 1: Exploring the control flow of the benchmark routine.

Instructions

  1. Clone the repository and navigate to the cloned repository:

    git clone https://github.com/robertovicario/cpp-consistent-hashing-algorithms.git
    cd cpp-consistent-hashing-algorithms
  2. Run repository setup:

    • vcpkg:

      # Ensure the script has executable permissions:
      # chmod +x repo.sh
      ./repo.sh
    • CMake:

      # Ensure the script has executable permissions:
      # chmod +x cmake.sh
      ./cmake.sh
  3. Build the project with Ninja:

    cd build
    ninja
  4. Start the framework:

    • Default configuration:

      ./main
    • Custom configuration:

      ./main <your_config>.yaml
  5. Navigate to build/tmp/ and check the results.csv file.

Contributing

Adding New Algorithms

  1. Insert the algorithm name into any configuration file located in configs/.

  2. Implement your algorithm in Algorithms/your_algo/. Keep in mind that the system employs C++ templates to integrate the algorithms into the loop.

  3. Integrate a new execution routine into Main.cpp. Append a new else if branch and incorporate your engine using:

    /*
     * NEW_ALGORITHM
     */
    execute<YourEngine>(handler, yaml, "your_algo");

    If your engine requires additional parameters, include them as follows:

    /*
     * NEW_ALGORITHM
     */
    execute<YourEngine>(handler, yaml, "your_algo", param1, param2, ..., paramN);
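The template-based integration described in the steps above can be sketched in isolation as follows. This is a simplified stand-in, not the project's code: the `getBucket` method, the modulo placeholder engine, the `dispatch` function, and the reduced `execute` signature (without the `handler` and `yaml` arguments) are all assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical engine: any type with this shape could be plugged in.
// Modulo placement is only a placeholder; it is NOT a consistent
// hashing algorithm.
class YourEngine {
public:
    explicit YourEngine(std::size_t nodes) : nodes_(nodes) { assert(nodes > 0); }

    std::size_t getBucket(std::uint64_t key) const {
        return static_cast<std::size_t>(key % nodes_);
    }

private:
    std::size_t nodes_;
};

// Simplified stand-in for execute<Engine>(...): builds the engine from
// the forwarded parameters and performs a single lookup.
template <typename Engine, typename... Args>
std::size_t execute(std::uint64_t key, Args&&... args) {
    Engine engine(std::forward<Args>(args)...);
    return engine.getBucket(key);
}

// Name-based dispatch mirroring the else-if chain described above.
std::size_t dispatch(const std::string& algorithm, std::uint64_t key, std::size_t nodes) {
    if (algorithm == "your_algo") {
        return execute<YourEngine>(key, nodes);
    }
    // else if (algorithm == "another_algo") { ... }
    throw std::invalid_argument("unknown algorithm: " + algorithm);
}
```

Because the engine type is a template parameter, adding an algorithm amounts to one new branch plus a class with the expected interface; extra constructor parameters are forwarded through the variadic pack.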

Adding New Benchmarks

  1. Insert the benchmark name into any configuration file located in configs/.

  2. Implement the benchmark in Benchmarks/. Note that the system utilizes C++ templates for benchmark integration into the loop.

  3. Integrate a new benchmark routine into Benchmarks/Routine.hpp. Append a new else if branch and incorporate your benchmark using:

    /*
     * NEW_BENCHMARK
     */
    printInfo(l, algorithm, benchmark, hashFunction, initNodes, iterationsRun);
    results[l] = computeYourBenchmark<Engine>(yaml, algorithm, initNodes, args...);
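To give a rough idea of what a templated benchmark function might look like, the sketch below measures average lookup time for any engine type. It is an illustration under assumptions: `ModuloEngine`, `LookupResult`, `computeLookupTime`, and the `getBucket` method are hypothetical names, and the signature is deliberately simpler than the `computeYourBenchmark<Engine>(yaml, algorithm, initNodes, args...)` call shown above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>

// Hypothetical placeholder engine, used only to make the sketch
// self-contained. It is not a consistent hashing algorithm.
struct ModuloEngine {
    std::size_t nodes;

    std::size_t getBucket(std::uint64_t key) const {
        return static_cast<std::size_t>(key % nodes);
    }
};

// Result of one benchmark run. The checksum keeps the lookups observable
// so the compiler cannot discard the measured loop.
struct LookupResult {
    double nanosPerLookup;
    std::uint64_t checksum;
};

// Times `lookups` consecutive lookups on keys 0..lookups-1 and returns
// the average duration of a single lookup in nanoseconds.
template <typename Engine>
LookupResult computeLookupTime(const Engine& engine, std::uint64_t lookups) {
    assert(lookups > 0);
    std::uint64_t checksum = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t key = 0; key < lookups; ++key) {
        checksum += engine.getBucket(key);
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return {elapsed.count() / static_cast<double>(lookups), checksum};
}
```

`std::chrono::steady_clock` is used because it is monotonic, which makes it a safer choice than the system clock for measuring intervals.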

License

This project is distributed under GNU General Public License version 3. You can find the complete text of the license in the project repository.
