git clone into fuse mount fails with `inflate: data stream error`
satwell opened this issue
Describe the bug
A `git clone` of large remote repositories into a SeaweedFS FUSE mount reliably fails with this error:
error: inflate: data stream error (unknown compression method)
fatal: serious inflate inconsistency
Here are a few repos I've found that fail:
- https://gitlab.gnome.org/GNOME/glib.git
- https://github.com/u-boot/u-boot.git
- git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
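One way to narrow this down is to check whether plain file data survives a round trip through the mount, independent of git. The sketch below is illustrative only: `roundtrip_check` is a hypothetical helper name (not part of SeaweedFS or git), and it assumes `dd`, `cp`, and `sha256sum` are available. A matching checksum would not rule out the bug, since git may read the pack back while it is still being written, which a simple copy does not imitate.

```shell
#!/bin/sh
# Hypothetical helper, not part of SeaweedFS or git.
# Usage: roundtrip_check SRC_DIR DST_DIR [SIZE_MIB]
# Writes a pseudo-random file into SRC_DIR, copies it to DST_DIR,
# and compares the SHA-256 checksums of the two copies.
roundtrip_check() {
    src_dir=$1
    dst_dir=$2
    size_mib=${3:-64}

    # Generate the reference file on local storage.
    dd if=/dev/urandom of="$src_dir/blob.bin" bs=1048576 count="$size_mib" 2>/dev/null || return 2

    # Copy it into the directory under test (for example, the FUSE mount).
    cp "$src_dir/blob.bin" "$dst_dir/blob.bin" || return 2

    # Hash both copies; only the hex digest is kept.
    src_sum=$(sha256sum < "$src_dir/blob.bin" | cut -d' ' -f1)
    dst_sum=$(sha256sum < "$dst_dir/blob.bin" | cut -d' ' -f1)

    if [ "$src_sum" = "$dst_sum" ]; then
        echo "OK"
    else
        echo "MISMATCH"
        return 1
    fi
}
```

With the setup from this report it could be called as `roundtrip_check /tmp /home/satwell/mnt/tmp 256`.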
System Setup
- List the command line to start "weed master", "weed volume", "weed filer", "weed s3", "weed mount".
weed server -dir=/srv/data/weedvol -master.port=9333 -volume.port=8080 -master.volumeSizeLimitMB=4096 -s3 -filer=true -volume.minFreeSpace=10 -volume.max=0
weed mount -filer=weedserver:8888 -dir=/home/satwell/mnt/tmp -filer.path=/
- OS version: Debian 12 on both server and client
- Output of `weed version`: version 30GB 3.64 b74e8082bac408138be99e128b8c28fd19eca7a6 linux amd64
- If using filer, show the content of `filer.toml`:
[filer.options]
recursive_delete = false
[leveldb2]
enabled = true
dir = "/srv/data/filerldb2"
Expected behavior
Expected `git clone` to complete successfully. This works fine for the smaller git repos that I've tried cloning.
Screenshots
Full git command and output:
halo:~/mnt/tmp% git clone --bare https://gitlab.gnome.org/GNOME/glib.git
Cloning into bare repository 'glib.git'...
remote: Enumerating objects: 211140, done.
remote: Counting objects: 100% (2144/2144), done.
remote: Compressing objects: 100% (271/271), done.
remote: Total 211140 (delta 1941), reused 2064 (delta 1873), pack-reused 208996
Receiving objects: 100% (211140/211140), 92.15 MiB | 6.06 MiB/s, done.
error: inflate: data stream error (unknown compression method)
fatal: serious inflate inconsistency
error: inflate: data stream error (unknown compression method)
error: inflate: data stream error (unknown compression method)
fatal: fetch-pack: invalid index-pack output
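Because a failed clone leaves no repository behind to inspect, one possible follow-up is to clone onto local disk, copy the repository onto the mount, and then let git re-verify the packfiles from there. The sketch below assumes a git new enough to support `rev-parse --absolute-git-dir`; `verify_packs` is a hypothetical helper name, and this checks only the read path, not the write-while-reading pattern of a live clone.

```shell
#!/bin/sh
# Hypothetical helper, not part of git or SeaweedFS.
# Usage: verify_packs REPO_DIR
# Runs `git verify-pack` on every pack index in the repository and
# prints one "ok:" or "corrupt:" line per index.
verify_packs() {
    repo=$1
    # Works for both bare and non-bare repositories.
    git_dir=$(git -C "$repo" rev-parse --absolute-git-dir) || return 2
    status=0
    for idx in "$git_dir"/objects/pack/*.idx; do
        [ -e "$idx" ] || continue   # glob did not match: no packs present
        if git verify-pack "$idx"; then
            echo "ok: $idx"
        else
            echo "corrupt: $idx"
            status=1
        fi
    done
    return $status
}
```

For example, after `git clone --bare https://gitlab.gnome.org/GNOME/glib.git /tmp/glib.git` and `cp -r /tmp/glib.git ~/mnt/tmp/`, running `verify_packs ~/mnt/tmp/glib.git` would show whether the copied pack reads back intact.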
Please help to verify the fix.
Hello! I have the same issue.
@chrislusf it seems it still reproduces. I use version 3.67 in Kubernetes.
Please share reproduction steps and logs.
- Create kind cluster:
cat <<EOF | kind create cluster --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
EOF
- Install seaweedfs-operator:
git clone https://github.com/seaweedfs/seaweedfs-operator.git
helm install seaweedfs-operator ./seaweedfs-operator/deploy/helm -f - <<EOF
image:
  registry: ghcr.io
  repository: seaweedfs/seaweedfs-operator
  tag: latest
serviceMonitor:
  enabled: false
webhook:
  enabled: false
EOF
- Create seaweed resource and wait until all components up:
kubectl apply -f - <<EOF
apiVersion: seaweed.seaweedfs.com/v1
kind: Seaweed
metadata:
  name: seaweedfs-storage
  namespace: default
spec:
  image: chrislusf/seaweedfs:3.67
  volumeServerDiskCount: 1
  master:
    replicas: 1
    volumeSizeLimitMB: 1024
  volume:
    replicas: 3
    requests:
      storage: 5Gi
  filer:
    replicas: 2
    s3: true
    config: |
      [leveldb2]
      enabled = true
      dir = "/data/filerldb2"
EOF
- Install seaweed-csi-driver:
git clone https://github.com/seaweedfs/seaweedfs-csi-driver.git
helm install seaweedfs-csi-driver ./seaweedfs-csi-driver/deploy/helm/seaweedfs-csi-driver -f - <<EOF
seaweedfsFiler: seaweedfs-storage-filer:8888
EOF
- Create ReadWriteMany PVC:
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rwm-pvc
  namespace: default
spec:
  storageClassName: seaweedfs-storage
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 3Gi
EOF
- Create Deployment and mount the PVC:
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-pvc
  labels:
    app: test-pvc
spec:
  selector:
    matchLabels:
      app: test-pvc
  replicas: 1
  template:
    metadata:
      labels:
        app: test-pvc
    spec:
      containers:
      - name: test-pvc
        image: alpine/git
        command:
        - sleep
        - "99999"
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: rwm-pvc
          mountPath: /mnt/test-pvc
      volumes:
      - name: rwm-pvc
        persistentVolumeClaim:
          claimName: rwm-pvc
EOF
- Try to clone quite large git repo to the PVC:
kubectl exec -it deploy/test-pvc -- /bin/sh -c 'git clone https://github.com/u-boot/u-boot.git /mnt/test-pvc/u-boot'
You will get:
Cloning into '/mnt/test-pvc/u-boot'...
remote: Enumerating objects: 999011, done.
remote: Counting objects: 100% (8605/8605), done.
remote: Compressing objects: 100% (5412/5412), done.
remote: Total 999011 (delta 3195), reused 8229 (delta 3103), pack-reused 990406
Receiving objects: 100% (999011/999011), 294.84 MiB | 9.19 MiB/s, done.
error: inflate: data stream error (unknown compression method)
fatal: serious inflate inconsistency
fatal: fetch-pack: invalid index-pack output
command terminated with exit code 128
At the same time, cloning to ephemeral storage succeeds:
kubectl exec -it deploy/test-pvc -- /bin/sh -c 'git clone https://github.com/u-boot/u-boot.git /tmp/u-boot'
Cloning into '/tmp/u-boot'...
remote: Enumerating objects: 999011, done.
remote: Counting objects: 100% (8605/8605), done.
remote: Compressing objects: 100% (5411/5411), done.
remote: Total 999011 (delta 3195), reused 8230 (delta 3104), pack-reused 990406
Receiving objects: 100% (999011/999011), 294.83 MiB | 10.08 MiB/s, done.
Resolving deltas: 100% (790494/790494), done.
Updating files: 100% (31982/31982), done.
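The side-by-side comparison above can be scripted so that the same clone is attempted against several target directories in one run. This is only a sketch: `compare_clone` is a hypothetical helper name, it assumes each target directory already exists and is writable, and it discards git's own output, so a failing target would still need to be re-run by hand to see the error.

```shell
#!/bin/sh
# Hypothetical helper, not part of git or SeaweedFS.
# Usage: compare_clone URL DIR [DIR...]
# Clones URL into a fresh subdirectory of each DIR and prints one
# "ok:" or "failed:" line per target directory.
compare_clone() {
    url=$1
    shift
    for base in "$@"; do
        # Use the shell's PID to pick a subdirectory name unlikely to exist.
        dest="$base/clone-test.$$"
        if git clone --quiet "$url" "$dest" >/dev/null 2>&1; then
            echo "ok: $base"
        else
            echo "failed: $base"
        fi
    done
}
```

Inside the test pod this could be run as `compare_clone https://github.com/u-boot/u-boot.git /tmp /mnt/test-pvc`; based on the output above, the expectation would be success for `/tmp` and failure for the PVC path.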