larrabee / clickhouse-backup

Tool for easy ClickHouse backup and restore with S3 support

clickhouse-backup

Features

  • Easy creation and restoration of backups of all or specific tables
  • Efficient AWS S3 uploads and downloads with streaming archiving and extraction
  • Support for incremental backups on S3

Compatibility

ClickHouse: versions above 1.1.54390

S3 providers:

  • Minio
  • AWS
  • Mail.Cloud
  • Yandex.Cloud (does not work)

Download

  • Grab the latest binary from the releases page and decompress with:
tar -zxvf clickhouse-backup.tar.gz
  • Or use the official tiny Docker image and run it like:
docker run --rm -it --network host -v "/var/lib/clickhouse:/var/lib/clickhouse" \
   -e CLICKHOUSE_PASSWORD=password -e S3_ACCESS_KEY=access_key -e S3_SECRET_KEY=secret \
   alexakulov/clickhouse-backup --help
  • Or build from source:
go get github.com/AlexAkulov/clickhouse-backup

Usage

NAME:
   clickhouse-backup - Tool for easy backup of ClickHouse with S3 support

USAGE:
   clickhouse-backup <command> [--dry-run] [-t, --tables=<db>.<table>] <backup_name>

VERSION:
   unknown

DESCRIPTION:
   Run as root or clickhouse user

COMMANDS:
     tables          Print list of tables and exit
     list            Print list of backups and exit
     freeze          Freeze all or specific tables
     create          Create new backup of all or specific tables
     upload          Upload backup to s3
     download        Download backup from s3 to backup folder
     restore-schema  Create databases and tables from backup metadata
     restore-data    Copy data to 'detached' folder and execute ATTACH
     default-config  Print default config and exit
     clean           Clean backup data from shadow folder
     help, h         Shows a list of commands or help for one command

GLOBAL OPTIONS:
   --config FILE, -c FILE  Config FILE name. (default: "/etc/clickhouse-backup/config.yml")
   --dry-run               [DEPRECATED] Only show what should be uploaded or downloaded but don't actually do it. May still perform S3 requests to get bucket listings and other information though (only for file transfer commands)
   --help, -h              show help
   --version, -v           print the version
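
As an illustration of how the commands listed above fit together, here is a minimal sketch of a backup and restore round trip. The backup name is a hypothetical placeholder, and the `run` helper and `EXECUTE` variable are inventions of this sketch, not part of the tool: by default the script only prints the commands it would run.

```shell
#!/bin/bash
# Sketch of a backup/restore round trip built from the commands listed above.
# BACKUP_NAME is a hypothetical placeholder; run and EXECUTE are helpers of this sketch.
set -eu

BACKUP_NAME="${BACKUP_NAME:-example_backup}"

# Print each command; actually run it only when EXECUTE=1 is set.
run() {
    echo "+ $*"
    if [ "${EXECUTE:-0}" = "1" ]; then
        "$@"
    fi
}

# On the source server: create a local backup, then upload it to S3.
backup() {
    run clickhouse-backup create "$BACKUP_NAME"
    run clickhouse-backup upload "$BACKUP_NAME"
}

# On the target server: fetch the backup, recreate databases and tables,
# then copy the data into 'detached' and attach it.
restore() {
    run clickhouse-backup download "$BACKUP_NAME"
    run clickhouse-backup restore-schema "$BACKUP_NAME"
    run clickhouse-backup restore-data "$BACKUP_NAME"
}

backup
restore
```

Running the schema step before the data step follows from the command descriptions: the tables must exist before data can be attached to them.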

Default Config

All options can be overridden via environment variables; the variable name for each option is shown in the comment next to it.

clickhouse:
  username: default            # CLICKHOUSE_USERNAME
  password: ""                 # CLICKHOUSE_PASSWORD
  host: localhost              # CLICKHOUSE_HOST
  port: 9000                   # CLICKHOUSE_PORT
  data_path: ""                # CLICKHOUSE_DATA_PATH
  skip_tables:                 # CLICKHOUSE_SKIP_TABLES
    - system.*
s3:
  access_key: ""               # S3_ACCESS_KEY
  secret_key: ""               # S3_SECRET_KEY
  bucket: ""                   # S3_BUCKET
  endpoint: ""                 # S3_ENDPOINT
  region: us-east-1            # S3_REGION
  acl: private                 # S3_ACL
  force_path_style: false      # S3_FORCE_PATH_STYLE
  path: ""                     # S3_PATH
  disable_ssl: false           # S3_DISABLE_SSL
  disable_progress_bar: false  # DISABLE_PROGRESS_BAR
  part_size: 5242880           # S3_PART_SIZE
  backups_to_keep_local: 0     # BACKUPS_TO_KEEP_LOCAL
  backups_to_keep_s3: 0        # BACKUPS_TO_KEEP_S3
  compression_level: 1         # S3_COMPRESSION_LEVEL
  # supported: 'tar', 'lz4', 'bzip2', 'gzip', 'sz', 'xz'
  compression_format: lz4      # S3_COMPRESSION_FORMAT
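
For instance, the settings could be supplied from the environment instead of a config file. The sketch below uses only variable names that appear in the comments above; every value is a placeholder, and the bucket name in particular is hypothetical.

```shell
#!/bin/bash
# Sketch: supply settings through the environment instead of config.yml.
# All values are placeholders; the bucket name is hypothetical.
set -eu

export CLICKHOUSE_PASSWORD="password"
export S3_ACCESS_KEY="access_key"
export S3_SECRET_KEY="secret"
export S3_BUCKET="example-clickhouse-backups"
export S3_REGION="us-east-1"
export S3_COMPRESSION_FORMAT="gzip"   # one of the supported formats listed above

# List existing backups with the settings exported above, if the tool is installed.
if command -v clickhouse-backup >/dev/null 2>&1; then
    clickhouse-backup list
else
    echo "clickhouse-backup not found in PATH" >&2
fi
```

Keeping secrets in the environment rather than in a file mirrors the Docker example in the Download section, which passes the same variables with `-e`.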

ATTENTION!

Never change file permissions in /var/lib/clickhouse/backup. This path contains hard links, and all hard links to the same data on disk always share identical permissions. If you change the permissions, owner, or attributes of a hard link in the backup path, the files that ClickHouse works with change too, which might lead to data corruption.
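
The effect is easy to reproduce safely outside ClickHouse. The following sketch works in a temporary directory with hypothetical file names and touches no ClickHouse paths; it changes the mode through one hard link and reads it back through the other name.

```shell
#!/bin/bash
# Demonstration in a scratch directory: a hard link shares permissions
# with the file it points to. No ClickHouse paths are touched.
set -eu

workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT

echo "part data" > "$workdir/original.bin"
chmod 640 "$workdir/original.bin"

# Create a second name for the same inode, the way a backup hard link would.
ln "$workdir/original.bin" "$workdir/backup_link.bin"

# Change permissions through the link only.
chmod 600 "$workdir/backup_link.bin"

# Both names report the same mode, because the mode is stored in the inode.
original_mode="$(ls -l "$workdir/original.bin" | cut -c1-10)"
link_mode="$(ls -l "$workdir/backup_link.bin" | cut -c1-10)"
echo "original: $original_mode"
echo "link:     $link_mode"
```

Both lines show `-rw-------`, even though `chmod 600` was applied only to the link.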

Examples

Simple cron script for daily backup and upload

#!/bin/bash
# Stop at the first error so a failed backup is never uploaded.
set -e
BACKUP_NAME="my_backup_$(date -u +%Y-%m-%dT%H-%M-%S)"
clickhouse-backup create "$BACKUP_NAME"
clickhouse-backup upload "$BACKUP_NAME"

Ansible playbook for backing up a sharded cluster

You can use this playbook for daily backups of a sharded cluster. On the first day of each month a full backup is uploaded; on all other days only an increment is uploaded. Use https://healthchecks.io to monitor backup creation and upload.

- hosts: clickhouse-cluster
  become: yes
  vars:
    healthchecksio_clickhouse_backup_id: "get on https://healthchecks.io"
    healthchecksio_clickhouse_upload_id: "..."
  roles:
    - clickhouse-backup
  tasks:
    - block:
        - uri: url="https://hc-ping.com/{{ healthchecksio_clickhouse_backup_id }}/start"
        - set_fact: backup_name="{{ lookup('pipe','date -u +%Y-%m-%d') }}-{{ clickhouse_shard }}"
        - set_fact: yesterday_backup_name="{{ lookup('pipe','date --date=yesterday -u +%Y-%m-%d') }}-{{ clickhouse_shard }}"
        - set_fact: current_day="{{ lookup('pipe','date -u +%d') }}"
        - name: create new backup
          shell: "clickhouse-backup create {{ backup_name }}"
          register: out
        - debug: var=out.stdout_lines
        - uri: url="https://hc-ping.com/{{ healthchecksio_clickhouse_backup_id }}"
      rescue:
        - uri: url="https://hc-ping.com/{{ healthchecksio_clickhouse_backup_id }}/fail"
    - block:
        - uri: url="https://hc-ping.com/{{ healthchecksio_clickhouse_upload_id }}/start"
        - name: upload full backup
          shell: "clickhouse-backup upload {{ backup_name }}"
          register: out
          when: current_day == '01'
        - name: upload diff backup
          shell: "clickhouse-backup upload {{ backup_name }} --diff-from {{ yesterday_backup_name }}"
          register: out
          when: current_day != '01'
        - debug: var=out.stdout_lines
        - uri: url="https://hc-ping.com/{{ healthchecksio_clickhouse_upload_id }}"
      rescue:
        - uri: url="https://hc-ping.com/{{ healthchecksio_clickhouse_upload_id }}/fail"
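
The playbook's upload rule (full backup on the first of the month, diff otherwise) can also be sketched for a single host. The `daily-` name prefix and the `upload_command` function are hypothetical; the script only prints the command it would run, and `--diff-from` is used exactly as in the playbook above.

```shell
#!/bin/bash
# Sketch of the playbook's upload rule for a single host.
# The "daily-" prefix and upload_command are hypothetical names.
set -eu

# Print the upload command for a given day of month and pair of backup names.
upload_command() {
    local day="$1" today="$2" yesterday="$3"
    if [ "$day" = "01" ]; then
        # First day of the month: upload a full backup.
        echo "clickhouse-backup upload $today"
    else
        # Any other day: upload only the difference from yesterday's backup.
        echo "clickhouse-backup upload $today --diff-from $yesterday"
    fi
}

day="$(date -u +%d)"
today="daily-$(date -u +%Y-%m-%d)"
# GNU date syntax as in the playbook, with a BSD date fallback.
yesterday="daily-$(date --date=yesterday -u +%Y-%m-%d 2>/dev/null || date -u -v-1d +%Y-%m-%d)"

upload_command "$day" "$today" "$yesterday"
```

Comparing the day as the string `01` rather than as a number matches the playbook's `current_day == '01'` check and avoids leading-zero surprises in shell arithmetic.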

License: MIT License

