ydb-platform / ydb-ansible

Ansible playbooks for YDB cluster deployment and maintenance

Deploying YDB cluster with Ansible

These Ansible playbooks support the deployment of YDB clusters onto virtual machines or bare-metal servers.

Currently, the playbooks support the following scenarios:

  • the initial deployment of YDB static (storage) nodes;
  • YDB database creation;
  • the initial deployment of YDB dynamic (database) nodes;
  • adding extra YDB dynamic nodes to the YDB cluster;
  • updating the cluster configuration file and TLS certificates, with an automatic rolling restart.

The following scenarios are yet to be implemented (TODO):

  • configuring extra storage devices within the existing YDB static nodes;
  • adding extra YDB static nodes to the existing cluster;
  • removing YDB dynamic nodes from the existing cluster.

Current limitations:

  • the Python interpreter on the managed hosts must be version 3.7 or later;
  • configuration file customization depends on the support of automatic actor system threads management, which requires YDB version 23.1.26.hotfix1 or later;
  • the cluster configuration file has to be manually created;
  • there are no examples for configuring the storage nodes with different disk layouts (it seems to be doable by defining different ydb_disks values for different host groups).

Ansible Collection - ydb_platform.ydb

Documentation for the collection.

Playbook configuration settings

Default configuration settings are defined in the group_vars/all file as a set of Ansible variables. An example file is provided. Different playbook executions may require different variable values; these can be supplied by writing the overrides to extra JSON files and passing those files on the command line.
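The override mechanism described above can be sketched as follows. This is a minimal illustration, not part of the playbooks: the file name overrides.json, the helper names and the variable values are hypothetical, and it assumes the overrides are passed with the standard ansible-playbook option --extra-vars and its "@file" syntax.

```python
import json
import shlex
import tempfile
from pathlib import Path
from typing import List


def write_overrides(overrides: dict, directory: str) -> Path:
    """Write variable overrides to a JSON file and return its path."""
    path = Path(directory) / "overrides.json"  # hypothetical file name
    path.write_text(json.dumps(overrides, indent=2) + "\n", encoding="utf-8")
    return path


def playbook_command(playbook: str, vars_file: Path) -> List[str]:
    """Build an ansible-playbook command line that loads extra variables from a file."""
    # The "@" prefix makes ansible-playbook read the extra variables from the file.
    return ["ansible-playbook", playbook, "--extra-vars", "@{}".format(vars_file)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Example values only; use the variables documented in this README.
        vars_file = write_overrides({"ydb_dbname": "testdb", "ydb_database_groups": 1}, tmp)
        print(shlex.join(playbook_command("setup_playbook.yaml", vars_file)))
```

Variables passed this way take precedence over the values in group_vars/all, so a per-run file only needs to list the settings that differ.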

The meaning and format of the variables used are specified in the table below.

| Variable | Meaning |
| --- | --- |
| ydb_libidn_archive | Enable the installation of custom-built libidn for RHEL, AlmaLinux or Rocky Linux. |
| ydb_libidn_archive_unpack_options | Extra flags to be passed to tar for unpacking the custom-built libidn package. Default value: ['--strip-component=1'] |
| ydb_archive | YDB server binary package in .tar.gz format. |
| ydb_archive_unpack_options | Extra flags to be passed to tar for unpacking the YDB server binaries. Default value: ['--strip-component=1'] |
| ydb_config | The name of the cluster configuration file within the files subdirectory (without the actor_system_config snippet!). |
| ydb_tls_dir | Path to the local directory with the TLS certificates and keys, either generated by the sample script or following the filename convention used by the sample script. |
| ydb_domain | The name of the root domain hosting the databases. The value Root is used in the YDB documentation. |
| ydb_disks | Disk layout of the storage nodes (the hosts listed as ydbd_static in the hosts file). Defined as a list of structures with the following fields: name: physical device name (like /dev/sdb or /dev/vdb); label: the desired YDB data partition label, as used in the cluster configuration file (like ydb_disk_1). |
| ydb_dynnodes | Set of dynamic nodes to be run on each host listed as ydbd_dynamic in the hosts file. Defined as a list of structures with the following fields: dbname: name of the YDB database handled by the corresponding dynamic node; instance: dynamic node service instance name, used to distinguish between multiple dynamic nodes for the same database running on the same host; offset: integer number 0-N, used as the offset for the standard network port numbers (0 means using the standard ports). |
| ydb_brokers | List of host names running the YDB static nodes. Exactly 3 (three) host names must be specified. |
| ydb_cores_static | Number of cores to be used by the thread pools of the static nodes. |
| ydb_cores_dynamic | Number of cores to be used by the thread pools of the dynamic nodes. |
| ydb_dbname | Database name, used for database creation, dynamic nodes deployment and dynamic nodes rolling restart. |
| ydb_pool_kind | YDB default storage pool kind, as specified in the static nodes configuration file in the storage_pool_types.kind field. |
| ydb_database_groups | Initial number of storage groups in the newly created database. |
| ydb_dynnode_restart_sleep_seconds | Number of seconds to sleep after the startup of each dynamic node during the rolling restart. |
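To make the structured variables more concrete, the sketch below writes ydb_disks, ydb_dynnodes and ydb_brokers as Python data and checks the constraints stated in the table. All device names, labels, database names and host names are hypothetical. The base port numbers are an assumption made for illustration, as is the reading that the offset is simply added to each standard port; the requirement that offsets differ on one host is inferred from that reading rather than stated by the playbooks.

```python
# Illustrative values only: the device names, labels, database name and
# host names below are hypothetical.
ydb_disks = [
    {"name": "/dev/vdb", "label": "ydb_disk_1"},
    {"name": "/dev/vdc", "label": "ydb_disk_2"},
]
ydb_dynnodes = [
    {"dbname": "testdb", "instance": "a", "offset": 0},
    {"dbname": "testdb", "instance": "b", "offset": 1},
]
ydb_brokers = ["static-node-1", "static-node-2", "static-node-3"]

# Assumed standard ports of a dynamic node; verify against your YDB setup.
ASSUMED_BASE_PORTS = {"grpc": 2136, "interconnect": 19002, "monitoring": 8766}


def validate(dynnodes, brokers):
    """Return a list of problems found in the variable values."""
    problems = []
    if len(brokers) != 3:
        problems.append("ydb_brokers must list exactly 3 host names")
    offsets = [node["offset"] for node in dynnodes]
    if any(isinstance(o, bool) or not isinstance(o, int) or o < 0 for o in offsets):
        problems.append("offset must be a non-negative integer")
    elif len(offsets) != len(set(offsets)):
        # Inferred rule: equal offsets on one host would yield equal ports.
        problems.append("offsets must differ between dynamic nodes on one host")
    return problems


def ports_for(offset, base_ports=ASSUMED_BASE_PORTS):
    """Apply the offset to each base port (assumed to be plain addition)."""
    return {name: port + offset for name, port in base_ports.items()}


if __name__ == "__main__":
    print(validate(ydb_dynnodes, ydb_brokers))
    for node in ydb_dynnodes:
        print(node["instance"], ports_for(node["offset"]))
```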

Installing the YDB cluster using Ansible

The overall installation follows the official instructions, with several steps automated with Ansible. The steps below are adapted for the Ansible-based process:

  1. Review the system requirements, and prepare the YDB hosts. Ensure that SSH access and sudo-based root privileges are available.

  2. Prepare the TLS certificates. The provided sample script may be used to automate this step.

  3. Download the YDB server distribution. It is better to use the latest binary version available.

  4. Ensure that you have Python 3.8 or later installed on all hosts of the cluster.

  5. Configure the passwordless SSH access to all hosts of the cluster.

  6. Configure privilege escalation on all hosts of the cluster, such as passwordless sudo for the user account with the SSH access.

  7. Install ansible-core version 2.11-2.15. Ansible 2.10 or older is not supported.

  8. Install the required YDB Ansible collections from GitHub:

    ansible-galaxy collection install git+https://github.com/ydb-platform/ydb-ansible.git

    Alternatively, download the current releases of Ansible collections for YDB, community.general and Prometheus, and install the collections from the archives:

    ansible-galaxy collection install community.general-X.Y.Z.tar.gz
    ansible-galaxy collection install prometheus-prometheus-X.Y.Z.tar.gz
    ansible-galaxy collection install ydb-ansible-X.Y.tar.gz

  9. In the new subdirectory, create the ansible.cfg file using the provided example.

  10. Create the files and files/certs directories, and put the TLS keys and certificates there. If the certificates were generated using the provided helper script, the CA/certs/YYYY-MM-DD_hh-mm-ss subdirectory should typically be copied as files/certs.

  11. Create the inventory/50-inventory.yaml and inventory/99-inventory-vault.yaml files. These files contain the host list, installation configuration and secrets to be used. Example files are provided: inventory.yaml, inventory-vault.yaml.

  12. Create the Ansible Vault password file as ansible_vault_password_file, containing the password that protects the sensitive secrets.

  13. Encrypt inventory/99-inventory-vault.yaml with the command ansible-vault encrypt inventory/99-inventory-vault.yaml. To edit this file later, use the command ansible-vault edit inventory/99-inventory-vault.yaml.

  14. Prepare the cluster configuration file according to the instructions in the documentation, and save as files/config.yaml. Omit the actor_system_config section - it will be added automatically.

  15. Create the setup playbook based on the provided example. Customize the required actions as needed.

  16. Deploy the YDB cluster by running the playbook with the following command:

    ansible-playbook setup_playbook.yaml
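The version requirements from steps 4 and 7 (Python 3.8 or later, ansible-core 2.11-2.15) can be checked before running the playbook. The sketch below is only an illustration with hypothetical helper names; it expects the ansible-core version as a plain string such as the one reported by ansible --version.

```python
import sys

# Supported ansible-core range, as stated in the installation steps.
SUPPORTED_MIN = (2, 11)
SUPPORTED_MAX = (2, 15)


def ansible_core_supported(version: str) -> bool:
    """Check that the major.minor part of the version is within 2.11-2.15."""
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


def python_supported(version_info=sys.version_info, minimum=(3, 8)) -> bool:
    """Check that the interpreter is at least the required version."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print("python ok:", python_supported())
    print("ansible-core 2.14.3 ok:", ansible_core_supported("2.14.3"))
```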

Updating the cluster configuration files

To update the YDB cluster configuration files (ydbd-config.yaml, TLS certificates and keys) using the Ansible playbook, the following actions are necessary:

  1. Ensure that the hosts file contains the current list of YDB cluster nodes, both static and dynamic.
  2. Ensure that the configuration variable ydb_config in the group_vars/all file points to the desired YDB server configuration file.
  3. Ensure that the configuration variable ydb_tls_dir points to the directory containing the desired TLS key and certificate files for all the nodes within the YDB cluster.
  4. Apply the updated configuration to the cluster by running the run-update-config.sh script. Ensure that the playbook has completed successfully; if execution errors occur, diagnose and fix them.

Notes:

  1. The rolling restart is performed node by node, so for a large cluster the process may take a significant amount of time.
  2. For Certificate Authority (CA) certificate rotation, at least two separate configuration updates are needed:
    • first to deploy the ca.crt file, containing both new and old CA certificates;
    • second to deploy the fresh server keys and certificates signed by the new CA certificate.
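For the first of the two CA rotation updates, the ca.crt file has to contain both CA certificates. A minimal sketch of building such a bundle from two PEM files is shown below. The function name and input file names are hypothetical, and the sketch only concatenates the files; it does not validate the certificates themselves.

```python
import tempfile
from pathlib import Path

PEM_BEGIN = "-----BEGIN CERTIFICATE-----"


def build_ca_bundle(new_ca: Path, old_ca: Path, bundle: Path) -> int:
    """Concatenate two PEM CA certificates into one bundle file.

    Returns the number of certificates written to the bundle.
    """
    parts = []
    for source in (new_ca, old_ca):
        text = source.read_text(encoding="ascii").strip()
        if PEM_BEGIN not in text:
            raise ValueError("{} does not look like a PEM certificate".format(source))
        parts.append(text)
    content = "\n".join(parts) + "\n"
    bundle.write_text(content, encoding="ascii")
    return content.count(PEM_BEGIN)


if __name__ == "__main__":
    # Placeholder PEM bodies, used only to demonstrate the file handling.
    fake = PEM_BEGIN + "\n{}\n-----END CERTIFICATE-----\n"
    with tempfile.TemporaryDirectory() as tmp:
        new_ca, old_ca = Path(tmp, "ca-new.crt"), Path(tmp, "ca-old.crt")
        new_ca.write_text(fake.format("NEW"), encoding="ascii")
        old_ca.write_text(fake.format("OLD"), encoding="ascii")
        print(build_ca_bundle(new_ca, old_ca, Path(tmp, "ca.crt")))
```

Once every node trusts both CA certificates, the second update can replace the server keys and certificates without breaking connections between nodes that have and have not yet been restarted.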

What is actually done by the playbooks?

Actions executed for installing YDB nodes

  1. libaio or libaio1 is installed, depending on the operating system
  2. chrony is installed and enabled to ensure time synchronization
  3. jq is installed to support some scripting logic used in the playbooks
  4. YDB user group and user are created
  5. YDB installation directory is created
  6. YDB server software binary package is unpacked into the YDB installation directory
  7. YDB client package automatic update checks are disabled for the YDB user, to avoid extra messages from client commands.
  8. YDB TLS certificates and keys are copied to each server
  9. YDB cluster configuration file is copied to each server
  10. Transparent huge pages (THP) are enabled on each server, which is implemented by the creation, activation and start of the corresponding systemd service.

Actions executed for the initial deployment of YDB storage cluster

  1. Installation actions are executed.
  2. Each configured disk is checked for existing YDB data. If none is found, the disk is completely re-partitioned and obliterated. Disks with existing YDB data are left unchanged.

    WARNING: the safety checks do not work for YDB disks using non-default encryption keys. DATA LOSS IS POSSIBLE if the encryption is actually used. An enhancement is probably needed to allow the encryption key to be specified as a deployment option.

  3. ydbd-storage.service is created and configured as the systemd service.
  4. ydbd-storage.service is started, and the playbook waits for static nodes to come up.
  5. YDB blobstorage configuration is applied with the ydbd admin blobstorage init command.
  6. The playbook waits for the completion of YDB storage initialization.
  7. The initial password for the root user is configured according to the contents of the files/secret file.

Actions executed for YDB dynamic nodes deployment

  1. Installation actions are executed.
  2. For each configured database, the YDB dynnode systemd services are created and configured.
  3. YDB dynnode services are started.

Actions executed for the configuration update

  1. YDB TLS certificates and keys are copied to each server.
  2. YDB cluster configuration file is copied to each server.
  3. Rolling restart is performed for YDB storage nodes, node by node, checking for the YDB storage cluster to become healthy after the restart of each node.
  4. Rolling restart is performed for YDB database nodes, server by server: all nodes on a single server are restarted together, and the playbook waits for the specified number of seconds after each server's nodes restart.
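The storage node restart in step 3 follows a common rolling restart pattern: restart one node, wait until the cluster is healthy again, then move on. The sketch below illustrates that pattern only and is not the playbook's implementation. The restart and is_healthy callables, the timeout and the polling interval are all hypothetical parameters supplied by the caller.

```python
import time
from typing import Callable, Iterable, List


def rolling_restart(
    nodes: Iterable[str],
    restart: Callable[[str], None],
    is_healthy: Callable[[], bool],
    timeout_seconds: float = 300.0,
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Restart nodes one at a time, waiting for a healthy cluster after each.

    Raises TimeoutError if the cluster does not become healthy in time, so the
    remaining nodes are left untouched.
    """
    restarted = []
    for node in nodes:
        restart(node)
        deadline = clock() + timeout_seconds
        while not is_healthy():
            if clock() >= deadline:
                raise TimeoutError("cluster not healthy after restarting " + node)
            sleep(poll_seconds)
        restarted.append(node)
    return restarted


if __name__ == "__main__":
    # Stand-in callables; a real run would restart a service and query cluster health.
    print(rolling_restart(["node-1", "node-2"], restart=print, is_healthy=lambda: True))
```

Stopping at the first node that fails the health check keeps the rest of the cluster on its previous, working state. The database node restart in step 4 differs in that it waits a fixed number of seconds instead of polling a health check.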

License: Apache License 2.0

Languages: Python 93.9%, Dockerfile 3.0%, Shell 2.1%, Makefile 1.0%