seaweedfs / seaweedfs

SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding.

Space used on HDD volumes but fs.configure says SSD storage was requested.

vincib opened this issue

Describe the bug

We have a test cluster with 3 SSD servers (master + volume on SSD disks, IPs .201/.202/.203) and 3 HDD servers (volume only, IPs .211/.212/.213 in the screenshots below).

The first machine also runs a filer, with S3 enabled, and the filers use leveldb.
We created an S3 bucket / collection named test1 and used fs.configure to store everything on SSD, via:
fs.configure -collection test1 -locationPrefix '/' -disk ssd -apply
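(As a sanity check, I believe running fs.configure with no arguments in weed shell prints back the stored rules, so the test1 → ssd mapping can be read back; I'm not certain the output format is the same across versions:)

weed shell
> fs.configure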
Then we uploaded a bunch of video files from a PeerTube instance using s3cmd:
s3cmd --verbose put --recursive fdbfb556-5345-4e01-91aa-e0c5c5d45049 s3://test1/
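To confirm the objects actually landed in the bucket, a plain listing is enough (standard s3cmd, nothing SeaweedFS-specific):

s3cmd ls --recursive s3://test1/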

We found our video files in newly created volumes on the SSD volume servers, but a small file was also created in HDD volumes for this collection. I'm not sure this is normal :/

Please find below our master after the upload:

Screenshot_20240417_120520

and the HDD volume server status:

Screenshot_20240417_120535
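To cross-check the screenshots without the web UI, I assume weed shell can be fed commands on stdin and that volume.list prints a collection field per volume, so something like this should show where the test1 volumes ended up (master address taken from our setup):

echo "volume.list" | weed shell -master=10.10.4.201:9333 | grep test1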

System Setup

  • command lines:

on 201/202/203 (masters):
weed master -mdir=/data/master -peers=10.10.4.201:9333,10.10.4.202:9333,10.10.4.203:9333 -defaultReplication=001 -resumeState -volumePreallocate -volumeSizeLimitMB=1000
on 201/202/203 (SSD volume servers):
weed volume -dir=/data/volume -disk=ssd -max=0 -index=memory -minFreeSpace=10GiB -mserver=10.10.4.201:9333,10.10.4.202:9333,10.10.4.203:9333
on 211/212/213 (HDD volume servers):
weed volume -dir=/data/volume -disk=hdd -max=0 -index=memory -minFreeSpace=10GiB -mserver=10.10.4.201:9333,10.10.4.202:9333,10.10.4.203:9333
on the main filer:
weed filer -master=10.10.4.201:9333,10.10.4.202:9333,10.10.4.203:9333 -maxMB=20 -s3 -s3.allowDeleteBucketNotEmpty=false -s3.config=/etc/seaweedfs/s3.json -s3.domainName=test1.octos3.fr

  • OS version: Debian 12 (Bookworm)
  • weed version: 30GB 3.64 b74e808 linux amd64
  • filer.toml:
[filer.options]
recursive_delete = false
[leveldb2]
enabled = true
dir = "./filerldb2"

Expected behavior

Since fs.configure tells SeaweedFS to store everything in collection "test1" on SSD drives, I expect it not to create any volumes or files on the HDD volume servers.

Additional context

The created file is very small compared to our video files, and it seems to contain metadata from the S3 protocol.

I ran this to see what they are:

weed export -dir /data/volume -volumeId 25 -o t3.tar -collection test1
I0417 12:10:53.031776 volume_loading.go:91 readSuperBlock volume 25 version 3
tar -xvf t3.tar 
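To just see what the export contains (names and sizes) without extracting, a plain tar listing also works:

tar -tvf t3.tar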

The files are accessible here for a while: https://benjamin.sonntag.fr/download/seaweed/

If I do an fsck, I find that they are marked as orphans:

weed shell
> volume.fsck
total 11 directories, 13 files
dataNode:10.10.4.211:8080	volume:25	entries:2	orphan:2	100.00%	105781B
dataNode:10.10.4.212:8080	volume:25	entries:2	orphan:2	100.00%	105781B
Total		entries:40	orphan:4	10.00%	211562B

Thanks for the detailed report. The data on the HDD volumes are metadata change logs.
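Follow-up note for anyone hitting the same thing: if the goal is to keep these metadata change-log files on SSD as well, my guess (untested) is that an additional catch-all rule without the -collection filter would also match them, reusing only the flags already shown above:

weed shell
> fs.configure -locationPrefix '/' -disk ssd -apply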