Why does b.getTablesForUploadDiffRemote return the error "not found on remote storage"?
hueiyuan opened this issue
Description
Our backup suddenly failed with:

```
{"command":"upload --diff-from-remote=\"shardshard2-increment-20240404222727\" --resumable=1 shardshard2-increment-20240405005102","status":"error","start":"2024-04-05 02:16:44","finish":"2024-04-05 02:16:45","error":"b.getTablesForUploadDiffRemote return error: \"shardshard2-increment-20240404222727\" not found on remote storage"}
```
However, we have verified with the `backup list` command that the remote storage does contain the backup `shardshard2-increment-20240404222727`. The command output:

```
{"name":"shardshard2-increment-20240404222727","created":"2024-04-04 23:47:29","size":920897893739,"location":"remote","required":"shardshard2-increment-20240404210213","desc":"zstd, regular"}
```

Do you have any ideas about this problem?
By the way, we have now re-run the `watch` command to resume the backup process.
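For context on what the error means: the uploader resolves the `--diff-from-remote` name against its listing of remote backups and fails if the name is absent. A minimal Python sketch of that lookup, using the list entry shown above (the function name and the JSON-lines format are assumptions for illustration, not clickhouse-backup's actual code):

```python
import json

# The remote list entry from `backup list` above, one JSON object per line (format assumed)
remote_list = '{"name":"shardshard2-increment-20240404222727","created":"2024-04-04 23:47:29","size":920897893739,"location":"remote","required":"shardshard2-increment-20240404210213","desc":"zstd, regular"}'

# Index backups by name, as the diff-base lookup effectively does
backups = {}
for line in remote_list.splitlines():
    entry = json.loads(line)
    backups[entry["name"]] = entry

def find_diff_base(name):
    """Hypothetical mirror of the diff-from-remote lookup that raised the error."""
    if name not in backups:
        raise LookupError('"%s" not found on remote storage' % name)
    return backups[name]

base = find_diff_base("shardshard2-increment-20240404222727")
print(base["required"])  # the previous increment this backup depends on
```

If the name really is present (as `backup list` shows here), the failure usually points at the listing step itself, which is why more logs from the container are needed.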
Output of `print-config`:
```yaml
general:
  remote_storage: s3
  max_file_size: 0
  disable_progress_bar: true
  backups_to_keep_local: 0
  backups_to_keep_remote: 50
  log_level: debug
  allow_empty_backups: false
  download_concurrency: 2
  upload_concurrency: 2
  use_resumable_state: true
  restore_schema_on_cluster: ""
  upload_by_part: true
  download_by_part: true
  restore_database_mapping: {}
  retries_on_failure: 3
  retries_pause: 30s
  watch_interval: 30m
  full_interval: 24h
  watch_backup_name_template: shard{shard}-{type}-{time:20060102150405}
  sharded_operation_mode: ""
  cpu_nice_priority: 15
  io_nice_priority: idle
  retriesduration: 30s
  watchduration: 30m0s
  fullduration: 24h0m0s
clickhouse:
  username: xxxxx
  password: xxxxxx
  host: localhost
  port: 9000
  disk_mapping: {}
  skip_tables:
    - system.*
    - INFORMATION_SCHEMA.*
    - information_schema.*
    - _temporary_and_external_tables.*
  skip_table_engines: []
  timeout: 5m
  freeze_by_part: false
  freeze_by_part_where: ""
  use_embedded_backup_restore: false
  embedded_backup_disk: ""
  backup_mutations: true
  restore_as_attach: false
  check_parts_columns: true
  secure: false
  skip_verify: false
  sync_replicated_tables: false
  log_sql_queries: true
  config_dir: /etc/clickhouse-server/
  restart_command: exec:systemctl restart clickhouse-server
  ignore_not_exists_error_during_freeze: true
  check_replicas_before_attach: true
  tls_key: ""
  tls_cert: ""
  tls_ca: ""
  max_connections: 8
  debug: false
s3:
  access_key: ""
  secret_key: ""
  bucket: ipp-clickhouse-backup-prod
  endpoint: ""
  region: us-west-2
  acl: private
  assume_role_arn: arn:aws:iam::xxxx:role/backup-role
  force_path_style: true
  path: backup/chi-shard-backup
  object_disk_path: tiered-backup
  disable_ssl: false
  compression_level: 1
  compression_format: zstd
  sse: ""
  sse_kms_key_id: ""
  sse_customer_algorithm: ""
  sse_customer_key: ""
  sse_customer_key_md5: ""
  sse_kms_encryption_context: ""
  disable_cert_verification: false
  use_custom_storage_class: false
  storage_class: STANDARD
  custom_storage_class_map: {}
  concurrency: 9
  part_size: 0
  max_parts_count: 2000
  allow_multipart_download: false
  object_labels: {}
  request_payer: ""
  check_sum_algorithm: ""
  debug: true
gcs:
  credentials_file: ""
  credentials_json: ""
  credentials_json_encoded: ""
  bucket: ""
  path: ""
  object_disk_path: ""
  compression_level: 1
  compression_format: tar
  debug: false
  force_http: false
  endpoint: ""
  storage_class: STANDARD
  object_labels: {}
  custom_storage_class_map: {}
  client_pool_size: 24
cos:
  url: ""
  timeout: 2m
  secret_id: ""
  secret_key: ""
  path: ""
  compression_format: tar
  compression_level: 1
  debug: false
api:
  listen: 0.0.0.0:7171
  enable_metrics: true
  enable_pprof: false
  username: ""
  password: ""
  secure: false
  certificate_file: ""
  private_key_file: ""
  ca_cert_file: ""
  ca_key_file: ""
  create_integration_tables: true
  integration_tables_host: ""
  allow_parallel: false
  complete_resumable_after_restart: true
ftp:
  address: ""
  timeout: 2m
  username: ""
  password: ""
  tls: false
  skip_tls_verify: false
  path: ""
  object_disk_path: ""
  compression_format: tar
  compression_level: 1
  concurrency: 24
  debug: false
sftp:
  address: ""
  port: 22
  username: ""
  password: ""
  key: ""
  path: ""
  object_disk_path: ""
  compression_format: tar
  compression_level: 1
  concurrency: 24
  debug: false
azblob:
  endpoint_schema: https
  endpoint_suffix: core.windows.net
  account_name: ""
  account_key: ""
  sas: ""
  use_managed_identity: false
  container: ""
  path: ""
  object_disk_path: ""
  compression_level: 1
  compression_format: tar
  sse_key: ""
  buffer_size: 0
  buffer_count: 3
  max_parts_count: 256
  timeout: 4h
  debug: false
custom:
  upload_command: ""
  download_command: ""
  list_command: ""
  delete_command: ""
  command_timeout: 4h
  commandtimeoutduration: 4h0m0s
```
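For reference, the `watch_backup_name_template` in the config above is what produced the backup names in the log. A hedged Python sketch of how it expands, assuming the `{shard}` macro resolved to `shard2` (which matches the `shardshard2-...` names in the log) and that `{time:20060102150405}` is Go's reference-time layout for `YYYYMMDDhhmmss`:

```python
from datetime import datetime

def expand_template(template, shard, backup_type, ts):
    # {time:20060102150405} is Go's reference-time layout, i.e. YYYYMMDDhhmmss
    return (template
            .replace("{shard}", shard)
            .replace("{type}", backup_type)
            .replace("{time:20060102150405}", ts.strftime("%Y%m%d%H%M%S")))

name = expand_template("shard{shard}-{type}-{time:20060102150405}",
                       "shard2", "increment", datetime(2024, 4, 4, 22, 27, 27))
print(name)  # shardshard2-increment-20240404222727
```

This is why the failing increment `shardshard2-increment-20240405005102` refers back to `shardshard2-increment-20240404222727` as its diff base.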
Which clickhouse-backup version do you use? Could you upgrade to 2.4.35?
We need more logs from the clickhouse-backup container to understand what's wrong.
@Slach
One additional question I want to confirm: our clickhouse-backup runs as a sidecar in the clickhouse-server pod (just like this example). We found that when we try to update the clickhouse-backup config, the StatefulSet does not apply the update. Do you have any comment on this?
This is a different question; please provide more context.
Is your clickhouse-backup configuration defined as a separate ConfigMap?
Do you use clickhouse-operator, or do you install clickhouse-server some different way?
By default, Kubernetes has a delay before the kubelet updates a ConfigMap mounted inside a pod.
For details, see https://www.perplexity.ai/search/why-kubernetes-dont-u.h.fDuVT22JOO4ZqWEfug
We use Altinity/clickhouse-operator to build ClickHouse and the clickhouse-backup sidecar, so we do not define an additional ConfigMap for it.
How did you change the configuration in this case? Did you use the `env` section?
Could you share your `kind: ClickHouseInstallation` manifest, without sensitive credentials?
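For readers following along, here is a minimal, purely illustrative sketch of what such a manifest can look like when the sidecar's settings come from the `env` section. All names, image tags, and values below are assumptions for illustration, not the reporter's actual manifest:

```yaml
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: shard-backup            # illustrative name
spec:
  templates:
    podTemplates:
      - name: clickhouse-with-backup
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:latest
            - name: clickhouse-backup   # sidecar, as in the linked example
              image: altinity/clickhouse-backup:2.4.35
              env:                       # settings via env instead of a ConfigMap
                - name: REMOTE_STORAGE
                  value: s3
                - name: S3_BUCKET
                  value: ipp-clickhouse-backup-prod
```

Env-sourced values like these only take effect after the pods are recreated, which would also explain updates appearing not to apply without a restart.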