seaweedfs / seaweedfs

SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding.

Filer with leveldb3 causes data loss; leveldb2 works fine

vmihailenco opened this issue

When using multiple Filer+S3 with leveldb3, the whole bucket can suddenly disappear after a bunch of files are deleted. Simply replacing it with leveldb2 makes the issue unreproducible.

There is nothing in the logs except that the master receives notifications that a bunch of volumes are empty/removed when the bucket disappears. Sometimes the bucket is fully gone; sometimes only a folder is missing. The volume seems to be gone too, so it looks like the filer decides it is time to delete the bucket.

I was able to reproduce the issue by adding/removing a ClickHouse partition back and forth using S3.
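For a ClickHouse-free way to exercise the same pattern, a generic churn sketch against the filer's S3 endpoint may help: repeatedly upload and then delete a batch of keys, mimicking a partition being attached and dropped. The endpoint, bucket name, and credentials below are placeholders, not values from this setup.

```python
# Hedged churn sketch: put then delete batches of keys over S3, roughly
# mimicking ClickHouse adding and dropping a partition. All names here
# (bucket, key layout, env vars) are hypothetical.
import os

def churn_ops(parts, files_per_part=3):
    """Build the put/delete sequence for one attach/drop cycle per partition."""
    ops = []
    for p in range(parts):
        keys = [f"data/part-{p}/file-{i}.bin" for i in range(files_per_part)]
        ops += [("put", k) for k in keys]
        ops += [("delete", k) for k in keys]
    return ops

def run(ops, bucket):
    import boto3  # optional dependency, only needed for a live run
    s3 = boto3.client(
        "s3",
        endpoint_url=os.environ["S3_ENDPOINT"],  # e.g. http://filer:8333
        aws_access_key_id=os.environ["S3_KEY"],
        aws_secret_access_key=os.environ["S3_SECRET"],
    )
    for op, key in ops:
        if op == "put":
            s3.put_object(Bucket=bucket, Key=key, Body=b"x" * 1024)
        else:
            s3.delete_object(Bucket=bucket, Key=key)

# Live run only when the (hypothetical) env vars are set.
if __name__ == "__main__" and "S3_ENDPOINT" in os.environ:
    run(churn_ops(parts=10), bucket="test-bucket")
```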

Servers have a good network and cluster.check reports no issues.

version 8000GB 3.63 54d7748a4a54d94a31ce04d05db801faeff4f690 linux amd64

weed master -ip={{ weed_domain }} -ip.bind=0.0.0.0 -mdir=./master -defaultReplication=001 -volumePreallocate -disableHttp

weed filer -s3 -s3.config=/etc/seaweedfs/s3_config.json -master={{ masters | join(',') }}

weed volume -mserver={{ masters | join(',') }} -ip={{ weed_domain }} -ip.bind=0.0.0.0 -port={{ port }} -dir={{ dir }} -index=leveldb -max=0 -idleTimeout=60 -dataCenter=dc1 -rack=rack1

Filer config:

[filer.options]
recursive_delete = true
#max_file_name_length = 255

[leveldb3]
enabled = true
dir = "./filerldb3"
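For reference, the fallback that made the issue unreproducible amounts to this filer.toml fragment. Note the separate directory: leveldb2 and leveldb3 use different on-disk layouts, so (as far as I understand, worth verifying) the leveldb2 store should not be pointed at the existing leveldb3 directory; the metadata needs to be repopulated.

```toml
# Hedged workaround fragment: fall back to the leveldb2 store.
[leveldb3]
enabled = false

[leveldb2]
enabled = true
dir = "./filerldb2"
```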

Hard to tell if that is enough to reproduce the issue, but I'd also like to confirm a few things:

  1. Is it okay that all filer logs are full of these messages (repeating every second):
I0309 11:20:57.355525 filer_grpc_server_sub_meta.go:130 read on disk filer:XXX.XXX.5.82:8888@XXX.XXX.5.82:63030 local subscribe / from 2024-03-09 10:20:37.530843927 +0000 UTC
I0309 11:20:57.355620 filer_grpc_server_sub_meta.go:149 read in memory filer:XXX.XXX.5.82:8888@XXX.XXX.5.82:63030 local subscribe / from 2024-03-09 10:20:37.530843927 +0000 UTC
I0309 11:20:58.414201 filer_grpc_server_sub_meta.go:296 + local listener filer:XXX.XXX.187.112:8888@XXX.XXX.187.112:13910 clientId -532674855 clientEpoch 16650
I0309 11:20:58.414227 filer_grpc_server_sub_meta.go:117  + filer:XXX.XXX.187.112:8888@XXX.XXX.187.112:13910 local subscribe / from 2024-03-09 08:41:49.725816183 +0000 UTC clientId:-532674855
I0309 11:20:58.414238 filer_grpc_server_sub_meta.go:112 disconnect filer:XXX.XXX.187.112:8888@XXX.XXX.187.112:45252 local subscriber / clientId:-532674855
I0309 11:20:58.414256 filer_grpc_server_sub_meta.go:312 - local listener filer:XXX.XXX.187.112:8888@XXX.XXX.187.112:45252 clientId -532674855 clientEpoch 16647
  2. Is it okay to have orphan entries reported by volume.fsck? I have several filers, but they should all be in sync because I have not written anything to them for a few minutes.

    I get orphan entries even with leveldb2 just by adding/removing a ClickHouse partition via S3. Unlike with leveldb3, though, it does not seem to cause any issues.

  3. It looks like all the filers have all the data. Is it possible to change that? E.g. a dedicated filer for the bucket that does not subscribe to the other filers.
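On the question of a dedicated filer per bucket: newer SeaweedFS versions have a notion of filer groups, where only filers in the same group replicate metadata to each other. The `-filerGroup` flag below is an assumption to verify against `weed filer -h` for your version.

```shell
# Hedged sketch (verify -filerGroup in your version's `weed filer -h`):
# filers in different groups keep independent metadata and do not
# subscribe to each other's updates.
weed filer -filerGroup=clickhouse -s3 -s3.config=/etc/seaweedfs/s3_config.json -master={{ masters | join(',') }}
weed filer -filerGroup=general -master={{ masters | join(',') }}
```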

  4. Is there any way to check if filers are in sync?
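One crude way to check whether filers are in sync is to list the same directory through each filer's HTTP port and diff the results. The sketch below assumes the filer's HTTP listing API returns JSON with an `Entries` array of `FullPath` values when `Accept: application/json` is sent; the hosts and path are hypothetical.

```python
# Hedged consistency check: fetch one directory listing from each filer
# and report entries that any filer is missing. The JSON shape
# ("Entries"/"FullPath") is an assumption about the filer HTTP API.
import json
import os
import urllib.request

def list_names(filer, path):
    """Fetch one directory listing from a filer and return the entry names."""
    req = urllib.request.Request(
        f"http://{filer}{path}?limit=10000",
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return {e["FullPath"] for e in data.get("Entries") or []}

def diff_listings(listings):
    """Given {filer: set_of_names}, return the names each filer is missing."""
    union = set().union(*listings.values())
    return {filer: sorted(union - names) for filer, names in listings.items()}

# Live run only when FILERS is set, e.g. FILERS=host1:8888,host2:8888
if __name__ == "__main__" and "FILERS" in os.environ:
    filers = os.environ["FILERS"].split(",")
    listings = {f: list_names(f, "/buckets/mybucket/") for f in filers}
    for filer, missing in diff_listings(listings).items():
        print(filer, "missing:", missing or "none")
```

An empty "missing" list for every filer only shows the listed directory agrees; it is not a full metadata comparison.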

I have also encountered this issue, and now I have switched to leveldb2.