seaweedfs / seaweedfs

SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding.

Master panic: runtime error: index out of range [0] with length 0

abionics opened this issue

Describe the bug
The master crashed after running for about two weeks. This has happened twice this month, so the problem is real. I run SeaweedFS in Docker using docker-compose like this. Logs:

Master:

I0328 18:22:34.009207 master_grpc_server.go:203 master see deleted volume 115600 from volume:8080
I0328 18:22:34.009227 master_grpc_server.go:203 master see deleted volume 115597 from volume:8080
I0328 18:22:34.009368 volume_layout.go:405 Volume 115596 becomes writable
I0328 18:22:34.009413 volume_layout.go:405 Volume 115597 becomes writable
I0328 18:22:34.009450 volume_layout.go:405 Volume 115598 becomes writable
I0328 18:22:34.009486 volume_layout.go:405 Volume 115599 becomes writable
I0328 18:22:34.009521 volume_layout.go:405 Volume 115600 becomes writable
I0328 18:22:34.009556 volume_layout.go:405 Volume 115601 becomes writable
panic: runtime error: index out of range [0] with length 0

goroutine 510258003 [running]:
github.com/seaweedfs/seaweedfs/weed/topology.(*VolumeLocationList).Head(...)
        /go/src/github.com/seaweedfs/seaweedfs/weed/topology/volume_location_list.go:31
github.com/seaweedfs/seaweedfs/weed/server.(*MasterServer).Assign(0xc000263500, {0xc00570a420?, 0x569d26?}, 0xc00570a420)
        /go/src/github.com/seaweedfs/seaweedfs/weed/server/master_grpc_server_assign.go:99 +0xacb
github.com/seaweedfs/seaweedfs/weed/pb/master_pb._Seaweed_Assign_Handler({0x23112c0?, 0xc000263500}, {0x29f7d20, 0xc004a90d50}, 0xc009512480, 0x0)
        /go/src/github.com/seaweedfs/seaweedfs/weed/pb/master_pb/master_grpc.pb.go:534 +0x169
google.golang.org/grpc.(*Server).processUnaryRPC(0xc00063ec00, {0x29f7d20, 0xc004a90c90}, {0x2a05340, 0xc006c81d40}, 0xc006fa2480, 0xc00066b980, 0x3cb39b8, 0x0)
        /go/pkg/mod/google.golang.org/grpc@v1.60.1/server.go:1372 +0xe03
google.golang.org/grpc.(*Server).handleStream(0xc00063ec00, {0x2a05340, 0xc006c81d40}, 0xc006fa2480)
        /go/pkg/mod/google.golang.org/grpc@v1.60.1/server.go:1783 +0xfec
google.golang.org/grpc.(*Server).serveStreams.func2.1()
        /go/pkg/mod/google.golang.org/grpc@v1.60.1/server.go:1016 +0x59
created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 499162669
        /go/pkg/mod/google.golang.org/grpc@v1.60.1/server.go:1027 +0x115
panic: runtime error: index out of range [0] with length 0

goroutine 510257991 [running]:
github.com/seaweedfs/seaweedfs/weed/topology.(*VolumeLocationList).Head(...)
        /go/src/github.com/seaweedfs/seaweedfs/weed/topology/volume_location_list.go:31
github.com/seaweedfs/seaweedfs/weed/server.(*MasterServer).Assign(0xc000263500, {0xc00570aa50?, 0x569d26?}, 0xc00570aa50)
        /go/src/github.com/seaweedfs/seaweedfs/weed/server/master_grpc_server_assign.go:99 +0xacb
github.com/seaweedfs/seaweedfs/weed/pb/master_pb._Seaweed_Assign_Handler({0x23112c0?, 0xc000263500}, {0x29f7d20, 0xc004a91320}, 0xc009512480, 0x0)
        /go/src/github.com/seaweedfs/seaweedfs/weed/pb/master_pb/master_grpc.pb.go:534 +0x169
google.golang.org/grpc.(*Server).processUnaryRPC(0xc00063ec00, {0x29f7d20, 0xc004a91260}, {0x2a05340, 0xc006c81d40}, 0xc0006c3560, 0xc00066b980, 0x3cb39b8, 0x0)
        /go/pkg/mod/google.golang.org/grpc@v1.60.1/server.go:1372 +0xe03
google.golang.org/grpc.(*Server).handleStream(0xc00063ec00, {0x2a05340, 0xc006c81d40}, 0xc0006c3560)
        /go/pkg/mod/google.golang.org/grpc@v1.60.1/server.go:1783 +0xfec
google.golang.org/grpc.(*Server).serveStreams.func2.1()
        /go/pkg/mod/google.golang.org/grpc@v1.60.1/server.go:1016 +0x59
created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 499162669
        /go/pkg/mod/google.golang.org/grpc@v1.60.1/server.go:1027 +0x115
panic: runtime error: index out of range [0] with length 0

goroutine 510257974 [running]:
github.com/seaweedfs/seaweedfs/weed/topology.(*VolumeLocationList).Head(...)
        /go/src/github.com/seaweedfs/seaweedfs/weed/topology/volume_location_list.go:31
github.com/seaweedfs/seaweedfs/weed/server.(*MasterServer).Assign(0xc000263500, {0xc0049c91e0?, 0x569d26?}, 0xc0049c91e0)
        /go/src/github.com/seaweedfs/seaweedfs/weed/server/master_grpc_server_assign.go:99 +0xacb
github.com/seaweedfs/seaweedfs/weed/pb/master_pb._Seaweed_Assign_Handler({0x23112c0?, 0xc000263500}, {0x29f7d20, 0xc005703a40}, 0xc00998af80, 0x0)
        /go/src/github.com/seaweedfs/seaweedfs/weed/pb/master_pb/master_grpc.pb.go:534 +0x169
google.golang.org/grpc.(*Server).processUnaryRPC(0xc00063ec00, {0x29f7d20, 0xc0057039b0}, {0x2a05340, 0xc006c81d40}, 0xc007797c20, 0xc00066b980, 0x3cb39b8, 0x0)
        /go/pkg/mod/google.golang.org/grpc@v1.60.1/server.go:1372 +0xe03
google.golang.org/grpc.(*Server).handleStream(0xc00063ec00, {0x2a05340, 0xc006c81d40}, 0xc007797c20)
        /go/pkg/mod/google.golang.org/grpc@v1.60.1/server.go:1783 +0xfec
google.golang.org/grpc.(*Server).serveStreams.func2.1()
        /go/pkg/mod/google.golang.org/grpc@v1.60.1/server.go:1016 +0x59
created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 499162669
        /go/pkg/mod/google.golang.org/grpc@v1.60.1/server.go:1027 +0x115
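All three goroutines die at the same spot: VolumeLocationList.Head (weed/topology/volume_location_list.go:31), called from the master's Assign handler. A minimal sketch of the shape of that accessor, reconstructed from the stack trace rather than copied from the source tree (DataNode is a placeholder here):

package topology

// DataNode stands in for the real topology type.
type DataNode struct {
	Id string
}

// VolumeLocationList tracks which volume servers host a volume.
type VolumeLocationList struct {
	list []*DataNode
}

// Head returns the first known location of the volume. Indexing
// list[0] without a length check panics with "index out of range [0]
// with length 0" if the locations are removed concurrently, e.g.
// right after the "master see deleted volume" events above.
func (dnll *VolumeLocationList) Head() *DataNode {
	return dnll.list[0]
}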

Volume:

I0328 18:22:33.323433 volume_grpc_client_to_master.go:179 volume server volume:8080 adds volume 115600
I0328 18:22:33.324040 store.go:166 In dir /data adds volume:115601 collection:bucker-name replicaPlacement:000 ttl:
I0328 18:22:33.586617 volume_loading.go:142 loading memory index /data/bucker-name_115601.idx to memory
I0328 18:22:33.598831 store.go:170 add volume 115601
I0328 18:22:33.598911 volume_grpc_client_to_master.go:179 volume server volume:8080 adds volume 115601
I0328 18:22:33.599586 store.go:166 In dir /data adds volume:115602 collection:bucker-name replicaPlacement:000 ttl:
I0328 18:22:34.089188 volume_grpc_client_to_master.go:71 heartbeat to master:9333 error: rpc error: code = Unavailable desc = error reading from server: EOF
I0328 18:22:35.372315 volume_loading.go:142 loading memory index /data/bucker-name_115602.idx to memory
I0328 18:22:35.373149 store.go:170 add volume 115602
I0328 18:22:39.094108 volume_grpc_client_to_master.go:106 SendHeartbeat to master:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp: lookup master on 127.0.0.11:53: server misbehaving"
I0328 18:22:39.094248 volume_grpc_client_to_master.go:71 heartbeat to master:9333 error: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp: lookup master on 127.0.0.11:53: server misbehaving"
I0328 18:22:44.096552 volume_grpc_client_to_master.go:106 SendHeartbeat to master:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp: lookup master on 127.0.0.11:53: server misbehaving"
I0328 18:22:44.096593 volume_grpc_client_to_master.go:71 heartbeat to master:9333 error: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp: lookup master on 127.0.0.11:53: server misbehaving"

Filer:

W0311 13:24:15.997530 filer_server.go:153 skipping default store dir in ./filerldb2
I0311 13:24:15.998137 leveldb2_store.go:42 filer store leveldb2 dir: /data/filerldb2
I0311 13:24:15.998210 file_util.go:27 Folder /data/filerldb2 Permission: -rwxr-xr-x
I0311 13:24:16.499824 filer.go:169 existing filer.store.id = -46194862
I0311 13:24:16.499850 configuration.go:28 configured filer store to leveldb2
I0311 13:24:16.501191 master_client.go:20 the cluster has 1 filer
I0311 13:24:16.501232 filer.go:123 172.25.0.4:8888 aggregate from peers [172.25.0.4:8888]
I0311 13:24:16.502264 meta_aggregator.go:92 loopSubscribeToOneFiler read 172.25.0.4:8888 start from 2024-03-11 13:23:16.50121336 +0000 UTC 1710163396501213360
I0311 13:24:16.503386 meta_aggregator.go:103 subscribing remote 172.25.0.4:8888 meta change: connecting to peer filer 172.25.0.4:8888: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 172.25.0.4:18888: connect: connection refused"
I0311 13:24:16.504995 filer.go:289 Start Seaweed Filer 8000GB 3.62 59b8af99b at 172.25.0.4:8888
I0311 13:24:17.320386 filer_grpc_server_sub_meta.go:296 +  listener s3@172.25.0.5:41590 clientId 866174809 clientEpoch 2
I0311 13:24:17.320413 filer_grpc_server_sub_meta.go:36  s3@172.25.0.5:41590 starts to subscribe /etc/ from 2024-03-11 13:24:17.318311566 +0000 UTC
I0311 13:24:18.236936 meta_aggregator.go:92 loopSubscribeToOneFiler read 172.25.0.4:8888 start from 2024-03-11 13:23:16.50121336 +0000 UTC 1710163396501213360
I0311 13:24:18.239082 meta_aggregator.go:189 subscribing remote 172.25.0.4:8888 meta change: 2024-03-11 13:23:16.50121336 +0000 UTC, clientId:1402634012
I0311 13:24:18.240326 filer_grpc_server_sub_meta.go:296 + local listener filer:172.25.0.4:8888@172.25.0.4:39710 clientId -1402634012 clientEpoch 1
I0311 13:24:18.240350 filer_grpc_server_sub_meta.go:117  + filer:172.25.0.4:8888@172.25.0.4:39710 local subscribe / from 2024-03-11 13:23:16.50121336 +0000 UTC clientId:-1402634012
I0311 13:24:18.240374 filer_grpc_server_sub_meta.go:130 read on disk filer:172.25.0.4:8888@172.25.0.4:39710 local subscribe / from 2024-03-11 13:23:16.50121336 +0000 UTC
I0311 13:24:18.247311 filer_grpc_server_sub_meta.go:149 read in memory filer:172.25.0.4:8888@172.25.0.4:39710 local subscribe / from 2024-03-11 13:23:16.50121336 +0000 UTC
I0313 02:49:09.251258 common.go:75 response method:PUT URL:/buckets/bucker-name/some-file-1.jpg with httpStatus:499 and JSON:{"error":"unexpected EOF"}
I0313 02:49:56.316191 common.go:75 response method:PUT URL:/buckets/bucker-name/some-file-2.jpg with httpStatus:499 and JSON:{"error":"unexpected EOF"}
I0313 21:07:28.684044 common.go:75 response method:PUT URL:/buckets/bucker-name/some-file-3.jpg with httpStatus:499 and JSON:{"error":"unexpected EOF"}

S3:

I0311 13:24:14.301675 s3.go:206 wait to connect to filer filer:8888 grpc address filer:18888
I0311 13:24:15.304054 s3.go:206 wait to connect to filer filer:8888 grpc address filer:18888
I0311 13:24:16.306594 s3.go:206 wait to connect to filer filer:8888 grpc address filer:18888
I0311 13:24:17.311954 s3.go:202 S3 read filer buckets dir: /buckets
I0311 13:24:17.311979 s3.go:209 connected to filer filer:8888 grpc address filer:18888
I0311 13:24:17.314197 s3api_circuit_breaker.go:35 s3 circuit breaker not configured
I0311 13:24:17.318607 s3.go:354 Start Seaweed S3 API Server 8000GB 3.62 59b8af99b at http port 8333
E0313 02:49:09.252440 s3api_object_handlers.go:500 post to filer: Put "http://filer:8888/buckets/bucker-name/some-file-1.jpg": read tcp 172.25.0.5:8333->172.208.113.75:44994: i/o timeout
E0313 02:49:56.316233 s3api_object_handlers.go:500 post to filer: Put "http://filer:8888/buckets/bucker-name/some-file-2.jpg": read tcp 172.25.0.5:8333->172.208.113.75:47182: i/o timeout
E0313 21:07:28.683984 s3api_object_handlers.go:500 post to filer: Put "http://filer:8888/buckets/bucker-name/some-file-3.jpg": read tcp 172.25.0.5:8333->172.174.109.45:53622: i/o timeout

filer.toml:

[leveldb2]
enabled = true
dir = "/data/filerldb2"

WebDAV is removed from the docker-compose file.

System Setup

  • Host OS: Ubuntu 22.04.1 LTS
  • Inside docker-compose, all images (master, volume, filer, s3) are chrislusf/seaweedfs:3.62_large_disk
  • weed version: version 8000GB 3.62 59b8af9 linux amd64
  • Content of filer.toml - default, see above

Additional context
In addition, SeaweedFS fails to start after the first restart; it only comes up after a second restart. Unfortunately, I did not save the logs of these restarts.

It looks like there's a race condition here. In the code above, if the list were empty, an error should have been returned. But at some point the reference to the location list becomes empty. Quick fix: #5436
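A minimal sketch of such a guard, assuming the fix simply checks the length before indexing and lets the caller treat a nil result as "no locations" (see the referenced fix for the actual change):

// Hypothetical guarded variant, for illustration only.
func (dnll *VolumeLocationList) Head() *DataNode {
	if len(dnll.list) == 0 {
		return nil // caller must handle "no locations" instead of crashing
	}
	return dnll.list[0]
}

The Assign handler in master_grpc_server_assign.go would then need to turn a nil head into an assignment error instead of letting the master panic.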

Thank you for the fast response!