clickhouse-backup restore can't handle backups created with use_embedded_backup_restore: true
frankwg opened this issue
I created a ClickHouse cluster with clickhouse-operator using the following YAML:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: clickhouse-backup-config
stringData:
  config.yml: |
    general:
      remote_storage: s3
      log_level: debug
      restore_schema_on_cluster: "{cluster}"
      allow_empty_backups: true
      backups_to_keep_remote: 3
    clickhouse:
      use_embedded_backup_restore: true
      embedded_backup_disk: backups
      timeout: 4h
    api:
      listen: "0.0.0.0:7171"
      create_integration_tables: true
    s3:
      acl: private
      endpoint: http://s3-backup-minio:9000
      bucket: clickhouse
      path: backup/shard-{shard}
      access_key: backup-access-key
      secret_key: backup-secret-key
      force_path_style: true
      disable_ssl: true
      debug: true
---
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: one-sidecar-embedded
spec:
  defaults:
    templates:
      podTemplate: clickhouse-backup
      dataVolumeClaimTemplate: data-volume
  configuration:
    profiles:
      default/distributed_ddl_task_timeout: 14400
    files:
      config.d/backup_disk.xml: |
        <clickhouse>
          <storage_configuration>
            <disks>
              <backups>
                <type>local</type>
                <path>/var/lib/clickhouse/backups/</path>
              </backups>
            </disks>
          </storage_configuration>
          <backups>
            <allowed_disk>backups</allowed_disk>
            <allowed_path>backups/</allowed_path>
          </backups>
        </clickhouse>
    settings:
      # allow scraping metrics via the embedded Prometheus protocol
      prometheus/endpoint: /metrics
      prometheus/port: 8888
      prometheus/metrics: true
      prometheus/events: true
      prometheus/asynchronous_metrics: true
    # zookeeper must be installed separately, see https://github.com/Altinity/clickhouse-operator/tree/master/deploy/zookeeper/ for details
    zookeeper:
      nodes:
        - host: zookeeper
          port: 2181
      session_timeout_ms: 5000
      operation_timeout_ms: 5000
    clusters:
      - name: default
        layout:
          # 2 shards, one replica in each
          shardsCount: 2
          replicas:
            - templates:
                podTemplate: pod-with-backup
            - templates:
                podTemplate: pod-clickhouse-only
  templates:
    volumeClaimTemplates:
      - name: data-volume
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
    podTemplates:
      - name: pod-with-backup
        metadata:
          annotations:
            prometheus.io/scrape: 'true'
            prometheus.io/port: '8888'
            prometheus.io/path: '/metrics'
            # needs a separate Prometheus scrape config, see https://github.com/prometheus/prometheus/issues/3756
            clickhouse.backup/scrape: 'true'
            clickhouse.backup/port: '7171'
            clickhouse.backup/path: '/metrics'
        spec:
          securityContext:
            runAsUser: 101
            runAsGroup: 101
            fsGroup: 101
          containers:
            - name: clickhouse-pod
              image: clickhouse/clickhouse-server
              command:
                - clickhouse-server
                - --config-file=/etc/clickhouse-server/config.xml
            - name: clickhouse-backup
              image: altinity/clickhouse-backup:master
              imagePullPolicy: IfNotPresent
              command:
                - bash
                - -xc
                - "/bin/clickhouse-backup server"
              # required to avoid double-scraping the clickhouse and clickhouse-backup containers
              ports:
                - name: backup-rest
                  containerPort: 7171
              volumeMounts:
                - name: config-volume
                  mountPath: /etc/clickhouse-backup/config.yml
                  subPath: config.yml
          volumes:
            - name: config-volume
              secret:
                secretName: clickhouse-backup-config
      - name: pod-clickhouse-only
        metadata:
          annotations:
            prometheus.io/scrape: 'true'
            prometheus.io/port: '8888'
            prometheus.io/path: '/metrics'
        spec:
          securityContext:
            runAsUser: 101
            runAsGroup: 101
            fsGroup: 101
          containers:
            - name: clickhouse-pod
              image: clickhouse/clickhouse-server
              command:
                - clickhouse-server
                - --config-file=/etc/clickhouse-server/config.xml
```
I created a replicated table and a distributed table over it with the following DDL:

```sql
CREATE TABLE test_local_048 ON CLUSTER 'default' (a UInt32)
Engine = ReplicatedMergeTree('/clickhouse/{installation}/tables/{shard}/{database}/{table}', '{replica}')
PARTITION BY tuple()
ORDER BY a;

CREATE TABLE test_distr_048 ON CLUSTER 'default' AS test_local_048
Engine = Distributed('default', default, test_local_048, a % 2);

INSERT INTO test_distr_048 SELECT * FROM numbers(100);
```
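As a sanity check of the setup above (not part of the original report): the Distributed engine routes each row by the sharding key `a % 2`, so the 100 inserted rows should split evenly, 50 per shard. A quick sketch of that arithmetic:

```go
package main

import "fmt"

func main() {
	// Count how many of the values 0..99 the sharding key a % 2
	// would route to each of the two shards.
	counts := map[uint32]int{}
	for a := uint32(0); a < 100; a++ {
		counts[a%2]++
	}
	fmt.Println(counts[0], counts[1]) // 50 50
}
```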
After I created a backup with `clickhouse-backup create <backup-name>`, it appeared under the `/var/lib/clickhouse/backups` folder.

However, restoring it with `clickhouse-backup restore <backup-name>` failed with:

```
strconv.Atoi: parsing "CREATE": invalid syntax
```

It looks like the restore command doesn't respect the `use_embedded_backup_restore: true` option.

The same backup restores fine when I run `RESTORE all FROM Disk('backups', '<backup-name>')` directly in clickhouse-client.
> After a backup was created by running `clickhouse-backup create`, it was created under the /var/lib/clickhouse/backups folder.
Could you share the output of:

```
LOG_LEVEL=debug clickhouse-backup create <backup-name>
```

I'm not sure the backup was created successfully for:

```xml
<disks>
  <backups>
    <type>local</type>
    <path>/var/lib/clickhouse/backups/</path>
  </backups>
</disks>
```

Could you also share:

```
ls -la /var/lib/clickhouse/backups/<backup_name>/
```

It could be related to #730.
```
chi-one-sidecar-embedded-default-0-0-0:/$ LOG_LEVEL=debug clickhouse-backup create debugging0
2024/04/04 13:23:42.200255 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/04/04 13:23:42.203700 info clickhouse connection open: tcp://localhost:9000 logger=clickhouse
2024/04/04 13:23:42.203838 info SELECT metadata_path FROM system.tables WHERE database = 'system' AND metadata_path!='' LIMIT 1; logger=clickhouse
2024/04/04 13:23:42.211477 info SELECT name, engine FROM system.databases WHERE NOT match(name,'^(system|INFORMATION_SCHEMA|information_schema|_temporary_and_external_tables)$') logger=clickhouse
2024/04/04 13:23:42.215615 info SHOW CREATE DATABASE `default` logger=clickhouse
2024/04/04 13:23:42.219106 info SELECT name, count(*) as is_present FROM system.settings WHERE name IN (?, ?) GROUP BY name with args [show_table_uuid_in_table_create_query_if_not_nil display_secrets_in_show_and_select] logger=clickhouse
2024/04/04 13:23:42.225177 info SELECT name FROM system.databases WHERE engine IN ('MySQL','PostgreSQL','MaterializedPostgreSQL') logger=clickhouse
2024/04/04 13:23:42.227788 info SELECT countIf(name='data_path') is_data_path_present, countIf(name='data_paths') is_data_paths_present, countIf(name='uuid') is_uuid_present, countIf(name='create_table_query') is_create_table_query_present, countIf(name='total_bytes') is_total_bytes_present FROM system.columns WHERE database='system' AND table='tables' logger=clickhouse
2024/04/04 13:23:42.233083 info SELECT database, name, engine , data_paths , uuid , create_table_query , coalesce(total_bytes, 0) AS total_bytes FROM system.tables WHERE is_temporary = 0 AND NOT has(arrayMap(x->lower(x), ['GenerateRandom']), lower(engine)) ORDER BY total_bytes DESC SETTINGS show_table_uuid_in_table_create_query_if_not_nil=1 logger=clickhouse
2024/04/04 13:23:42.280989 info SELECT metadata_path FROM system.tables WHERE database = 'system' AND metadata_path!='' LIMIT 1; logger=clickhouse
2024/04/04 13:23:42.285216 info SELECT count() as cnt FROM system.columns WHERE database='system' AND table='functions' AND name='create_query' SETTINGS empty_result_for_aggregation_by_empty_set=0 logger=clickhouse
2024/04/04 13:23:42.339508 info SELECT name, create_query FROM system.functions WHERE create_query!='' logger=clickhouse
2024/04/04 13:23:42.355040 info SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER' logger=clickhouse
2024/04/04 13:23:42.357820 info SELECT countIf(name='type') AS is_disk_type_present, countIf(name='free_space') AS is_free_space_present, countIf(name='disks') AS is_storage_policy_present FROM system.columns WHERE database='system' AND table IN ('disks','storage_policies') logger=clickhouse
2024/04/04 13:23:42.407005 info SELECT d.path, any(d.name) AS name, any(d.type) AS type, min(d.free_space) AS free_space, groupUniqArray(s.policy_name) AS storage_policies FROM system.disks AS d LEFT JOIN (SELECT policy_name, arrayJoin(disks) AS disk FROM system.storage_policies) AS s ON s.disk = d.name GROUP BY d.path logger=clickhouse
2024/04/04 13:23:42.451946 info BACKUP TABLE `default`.`test_local_048`, TABLE `default`.`test_distr_048` TO Disk(?,?) with args [backups debugging0] logger=clickhouse
2024/04/04 13:23:42.469008 info SELECT sum(total_bytes) AS backup_data_size FROM system.tables WHERE concat(database,'.',name) IN ('default.test_local_048', 'default.test_distr_048') logger=clickhouse
2024/04/04 13:23:42.474343 debug calculate parts list from embedded backup disk backup=debugging0 logger=backuper operation=create
2024/04/04 13:23:42.475401 info SELECT value FROM `system`.`build_options` where name='VERSION_DESCRIBE' logger=clickhouse
2024/04/04 13:23:42.477973 info done backup=debugging0 duration=278ms logger=backuper operation=create_embedded
2024/04/04 13:23:42.478262 info clickhouse connection closed logger=clickhouse

chi-one-sidecar-embedded-default-0-0-0:/$ ls -la /var/lib/clickhouse/backups/debugging0
total 8
drwxr-x--- 4 clickhou clickhou   70 Apr  4 13:23 .
drwxr-x--- 3 clickhou clickhou   24 Apr  4 13:23 ..
-rw-r----- 1 clickhou clickhou 1767 Apr  4 13:23 .backup
drwxr-x--- 3 clickhou clickhou   21 Apr  4 13:23 data
drwxr-x--- 3 clickhou clickhou   21 Apr  4 13:23 metadata
-rw-r----- 1 clickhou clickhou  717 Apr  4 13:23 metadata.json
```
Thanks for reporting. It looks like a modern version of ClickHouse works fine for `create`. Could you please share the output of:

```
LOG_LEVEL=debug clickhouse-backup restore debugging0
```
```
chi-one-sidecar-embedded-default-0-0-0:/$ LOG_LEVEL=debug clickhouse-backup restore debugging0
2024/04/04 14:14:00.981255 info clickhouse connection prepared: tcp://localhost:9000 run ping logger=clickhouse
2024/04/04 14:14:00.985541 info clickhouse connection open: tcp://localhost:9000 logger=clickhouse
2024/04/04 14:14:00.985644 info SELECT value FROM `system`.`build_options` where name='VERSION_INTEGER' logger=clickhouse
2024/04/04 14:14:00.991900 info SELECT countIf(name='type') AS is_disk_type_present, countIf(name='free_space') AS is_free_space_present, countIf(name='disks') AS is_storage_policy_present FROM system.columns WHERE database='system' AND table IN ('disks','storage_policies') logger=clickhouse
2024/04/04 14:14:00.998127 info SELECT d.path, any(d.name) AS name, any(d.type) AS type, min(d.free_space) AS free_space, groupUniqArray(s.policy_name) AS storage_policies FROM system.disks AS d LEFT JOIN (SELECT policy_name, arrayJoin(disks) AS disk FROM system.storage_policies) AS s ON s.disk = d.name GROUP BY d.path logger=clickhouse
2024/04/04 14:14:01.014604 info SELECT count() AS is_macros_exists FROM system.tables WHERE database='system' AND name='macros' SETTINGS empty_result_for_aggregation_by_empty_set=0 logger=clickhouse
2024/04/04 14:14:01.021993 info SELECT macro, substitution FROM system.macros logger=clickhouse
2024/04/04 14:14:01.028082 info CREATE DATABASE IF NOT EXISTS `default` ON CLUSTER 'default' ENGINE = Atomic with args [[]] logger=clickhouse
2024/04/04 14:14:01.136811 warn open /var/lib/clickhouse/backups/debugging0/data/default/test_distr_048: no such file or directory logger=getTableListByPatternLocal
2024/04/04 14:14:01.138450 info SELECT engine FROM system.databases WHERE name = 'default' logger=clickhouse
2024/04/04 14:14:01.142753 info DROP TABLE IF EXISTS `default`.`test_local_048` ON CLUSTER 'default' NO DELAY logger=clickhouse
2024/04/04 14:14:01.340092 info SELECT engine FROM system.databases WHERE name = 'default' logger=clickhouse
2024/04/04 14:14:01.345654 info DROP TABLE IF EXISTS `default`.`test_distr_048` ON CLUSTER 'default' NO DELAY logger=clickhouse
2024/04/04 14:14:01.567933 info clickhouse connection closed logger=clickhouse
2024/04/04 14:14:01.568016 error strconv.Atoi: parsing "CREATE": invalid syntax
```