GoogleCloudPlatform / gcsfuse

A user-space file system for interacting with Google Cloud Storage

Home Page:https://cloud.google.com/storage/docs/gcs-fuse

Creating files does not result in them being created in bucket

kierenj opened this issue

Describe the issue
Relevant parts of Dockerfile:

FROM node:20

RUN apt-get update && apt-get install -y \
    curl \
    gnupg \
    lsb-release \
    tini && \
    gcsFuseRepo=gcsfuse-`lsb_release -c -s` && \
    echo "deb http://packages.cloud.google.com/apt $gcsFuseRepo main" | \
    tee /etc/apt/sources.list.d/gcsfuse.list && \
    curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | \
    apt-key add - && \
    apt-get update && \
    apt-get install -y gcsfuse && \
    apt-get clean

And the gcsfuse_run.sh script:

#!/usr/bin/env bash
set -eo pipefail

# Create mount directory for service
mkdir -p "$MNT_DIR"

echo "Mounting GCS Fuse."
gcsfuse --debug_gcs --debug_fuse "$BUCKET" "$MNT_DIR"
echo "Mounting completed."

# Start the application
node main.js &

# Exit immediately when one of the background processes terminates.
wait -n

I am deploying and running via Cloud Build and Cloud Run.

The container allows websocket access to a bash instance.

Good:

  • Any files or folders in the bucket correctly show in the filesystem.
  • Deleting a folder in the filesystem deletes it in the bucket, too
  • Creating a folder in the filesystem creates it in the bucket, too

Bad:

  • Creating a file in the filesystem: it appears there, but never appears in the bucket

After running with --debug_fuse --debug_fs --debug_gcs --debug_http, using a shell to echo hello > world:
[image: screenshot of the gcsfuse debug log]

System (please complete the following information):

  • OS: node:20 base image (FROM node:20)
  • Platform: Cloud Run (Docker)
  • Version: latest, I believe; I followed the instructions in the docs

@kierenj Thanks for reaching out!

The logs provided make it clear why you are not seeing the locally created files in the bucket. We sync the local file content to the GCS bucket only when we receive a Sync/FlushFile call from the kernel, and your logs show no such call.
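
As a rough way to probe this from inside the container, the sketch below writes a file and requests an explicit fsync before closing it. It assumes a dd that supports conv=fsync (GNU coreutils or BusyBox) and that an fsync on the mounted path reaches gcsfuse as a Sync call; write_and_fsync and the probe file name are made up for illustration.

```shell
#!/usr/bin/env bash
# Write DATA to PATH and force an fsync(2) on the file before dd closes it.
# Hypothetical helper: not part of gcsfuse.
write_and_fsync() {
  path="$1"
  data="$2"
  # conv=fsync makes dd call fsync on the output file before exiting;
  # dd's transfer statistics go to stderr and are discarded here.
  printf '%s' "$data" | dd of="$path" conv=fsync 2>/dev/null
}
```

Usage would be something like write_and_fsync "$MNT_DIR/probe.txt" hello, followed by checking whether probe.txt shows up in the bucket and whether a Sync or FlushFile line appears in the debug log.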

I have tried to reproduce this issue on our side. On a normal GCP VM, this works fine.

I have also tried to reproduce this issue in a node:20 container, and the behavior is the same as on the VM: I get a Sync/FlushFile call at the end.

This is the bash version I tried with (output of bash --version):

GNU bash, version 5.2.15(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Below are the gcsfuse logs I get while executing echo test > world6:

time="27/09/2023 04:25:18.188337" severity=TRACE msg="fuse_debug: Op 0x00000004        connection.go:500] -> Error: \"function not implemented\""
time="27/09/2023 04:25:18.188740" severity=TRACE msg="fuse_debug: Op 0x00000006        connection.go:416] <- GetInodeAttributes (inode 1, PID 22)"
time="27/09/2023 04:25:18.188868" severity=TRACE msg="fuse_debug: Op 0x00000006        connection.go:498] -> OK ()"
time="27/09/2023 04:25:45.705396" severity=TRACE msg="fuse_debug: Op 0x00000008        connection.go:416] <- LookUpInode (parent 1, name \"world6\", PID 22)"
time="27/09/2023 04:25:45.705594" severity=TRACE msg="gcs: Req              0x1: <- StatObject(\"world6/\")"
time="27/09/2023 04:25:45.705642" severity=TRACE msg="gcs: Req              0x2: <- StatObject(\"world6\")"
time="27/09/2023 04:25:45.863562" severity=TRACE msg="gcs: Req              0x1: -> StatObject(\"world6/\") (157.978837ms): gcs.NotFoundError: storage: object doesn't exist"
time="27/09/2023 04:25:46.439932" severity=TRACE msg="gcs: Req              0x2: -> StatObject(\"world6\") (734.298566ms): gcs.NotFoundError: storage: object doesn't exist"
time="27/09/2023 04:25:46.440038" severity=TRACE msg="fuse_debug: Op 0x00000008        connection.go:500] -> Error: \"no such file or directory\""
time="27/09/2023 04:25:46.440179" severity=TRACE msg="fuse_debug: Op 0x0000000a        connection.go:416] <- CreateFile (parent 1, name \"world6\", PID 22)"
time="27/09/2023 04:25:46.440550" severity=TRACE msg="fuse_debug: Op 0x0000000a        connection.go:498] -> OK (inode 2)"
time="27/09/2023 04:25:46.440713" severity=TRACE msg="fuse_debug: Op 0x0000000c        connection.go:416] <- FlushFile (inode 2, PID 22)"
time="27/09/2023 04:25:46.440802" severity=TRACE msg="gcs: Req              0x3: <- StatObject(\"world6\")"
time="27/09/2023 04:25:46.583685" severity=TRACE msg="gcs: Req              0x3: -> StatObject(\"world6\") (142.879291ms): gcs.NotFoundError: storage: object doesn't exist"
time="27/09/2023 04:25:46.587154" severity=TRACE msg="gcs: Req              0x4: <- CreateObject(\"world6\")"
time="27/09/2023 04:25:47.191094" severity=TRACE msg="gcs: Req              0x4: -> CreateObject(\"world6\") (603.923384ms): OK"
time="27/09/2023 04:25:47.191305" severity=TRACE msg="fuse_debug: Op 0x0000000c        connection.go:498] -> OK ()"
time="27/09/2023 04:25:47.191569" severity=TRACE msg="fuse_debug: Op 0x0000000e        connection.go:416] <- WriteFile (inode 2, PID 0, handle 0, offset 0, 5 bytes)"
time="27/09/2023 04:25:47.191646" severity=TRACE msg="fuse_debug: Op 0x00000010        connection.go:416] <- SetInodeAttributes (inode 2, PID 22, mtime 2023-09-27 04:25:47.184831699 +0000 UTC)"
time="27/09/2023 04:25:47.191766" severity=TRACE msg="gcs: Req              0x5: <- UpdateObject(\"world6\")"
time="27/09/2023 04:25:47.399903" severity=TRACE msg="gcs: Req              0x5: -> UpdateObject(\"world6\") (208.127097ms): OK"
time="27/09/2023 04:25:47.400000" severity=TRACE msg="fuse_debug: Op 0x00000010        connection.go:498] -> OK ()"
time="27/09/2023 04:25:47.400092" severity=TRACE msg="gcs: Req              0x6: <- Read(\"world6\", <nil>)"
time="27/09/2023 04:25:47.897447" severity=TRACE msg="gcs: Req              0x6: -> Read(\"world6\", <nil>) (497.380853ms): OK"
time="27/09/2023 04:25:47.897612" severity=TRACE msg="fuse_debug: Op 0x0000000e        connection.go:498] -> OK ()"
time="27/09/2023 04:25:47.898307" severity=TRACE msg="fuse_debug: Op 0x00000012        connection.go:416] <- FlushFile (inode 2, PID 22)"
time="27/09/2023 04:25:47.898455" severity=TRACE msg="gcs: Req              0x7: <- StatObject(\"world6\")"
time="27/09/2023 04:25:48.124697" severity=TRACE msg="gcs: Req              0x7: -> StatObject(\"world6\") (226.233284ms): OK"
time="27/09/2023 04:25:48.124760" severity=TRACE msg="gcs: Req              0x8: <- CreateObject(\"world6\")"
time="27/09/2023 04:25:48.651607" severity=TRACE msg="gcs: Req              0x8: -> CreateObject(\"world6\") (526.822411ms): OK"
time="27/09/2023 04:25:48.651788" severity=TRACE msg="fuse_debug: Op 0x00000012        connection.go:498] -> OK ()"
time="27/09/2023 04:25:48.651926" severity=TRACE msg="fuse_debug: Op 0x00000014        connection.go:416] <- ReleaseFileHandle (PID 0)"
time="27/09/2023 04:25:48.652077" severity=TRACE msg="fuse_debug: Op 0x00000014        connection.go:498] -> OK ()"
time="27/09/2023 04:25:52.686854" severity=TRACE msg="fuse_debug: Op 0x00000016        connection.go:416] <- GetInodeAttributes (inode 1, PID 118)"

As you can see, there are FlushFile calls in the logs.
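
One way to check a captured debug log for this is to count operations by name. A minimal sketch, assuming the log was saved to a file and uses the same "<- OpName (" format as the lines above; count_ops and the gcsfuse.log file name are hypothetical.

```shell
#!/usr/bin/env bash
# Count how many times a FUSE operation was received, based on the
# "<- OpName (" marker that the fuse_debug lines use for incoming ops.
count_ops() {
  logfile="$1"
  op="$2"
  # -F: match a fixed string; -c: print the number of matching lines.
  # grep exits non-zero on zero matches, so "|| true" keeps the "0" output
  # from being treated as a failure.
  grep -F -c -- "<- ${op} (" "$logfile" || true
}
```

For example, count_ops gcsfuse.log CreateFile and count_ops gcsfuse.log FlushFile can be compared: a log with CreateFile entries but a FlushFile count of 0 would match the behavior reported in this issue.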

I believe the root cause is that no Sync/Flush/CloseFile call is issued at the end of file creation. To test this, you can run the following command: strace echo test > world. In my case, I see a couple of close calls. Could you please check whether you see close calls in the strace output? If not, we should file an issue with the Cloud Run platform.

newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=3052896, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 3052896, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7692200000
close(3)                                = 0
newfstatat(1, "", {st_mode=S_IFREG|0644, st_size=0, ...}, AT_EMPTY_PATH) = 0
write(1, "test\n", 5)                   = 5
close(1)                                = 0
close(2) 

Regards,
Prince Kumar.

Hi,

With this in my startup script:

echo "* STRACE START *"
strace echo test > world
echo "* STRACE END *"

Here's the output:

* STRACE START *
execve("/usr/bin/echo", ["echo", "test"], 0x3ef0b7d91d28 /* 13 vars */) = 0
brk(NULL)                               = 0x29ef50114000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3e572e750000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=23458, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 23458, PROT_READ, MAP_PRIVATE, 3, 0) = 0x3e572e74a000
close(3)                                = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220s\2\0\0\0\0\0"..., 832) = 832
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
newfstatat(3, "", {st_mode=S_IFREG|0755, st_size=1922136, ...}, AT_EMPTY_PATH) = 0
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
mmap(NULL, 1970000, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3e572e569000
mmap(0x3e572e58f000, 1396736, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x26000) = 0x3e572e58f000
mmap(0x3e572e6e4000, 339968, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x17b000) = 0x3e572e6e4000
mmap(0x3e572e737000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1ce000) = 0x3e572e737000
mmap(0x3e572e73d000, 53072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3e572e73d000
close(3)                                = 0
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3e572e566000
arch_prctl(ARCH_SET_FS, 0x3e572e566740) = 0
set_tid_address(0x3e572e566a10)         = 22
set_robust_list(0x3e572e566a20, 24)     = 0
rseq(0x3e572e567060, 0x20, 0, 0x53053053) = -1 ENOSYS (Function not implemented)
mprotect(0x3e572e737000, 16384, PROT_READ) = 0
mprotect(0x29ef50112000, 4096, PROT_READ) = 0
mprotect(0x3e572e785000, 8192, PROT_READ) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
munmap(0x3e572e74a000, 23458)           = 0
getrandom("\x64\xf0\x6c\x0c\x72\x93\x07\x4d", 8, GRND_NONBLOCK) = 8
brk(NULL)                               = 0x29ef50114000
brk(0x29ef50135000)                     = 0x29ef50135000
newfstatat(1, "", {st_mode=S_IFREG|0644, st_size=0, ...}, AT_EMPTY_PATH) = 0
write(1, "test\n", 5)                   = 5
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++
* STRACE END *

Ah - I guess that's stdout it's writing to and closing. Do I need to start/run this in a different way?
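
That reading matches how redirection works: the shell opens world and attaches it to stdout before echo starts, so strace echo ... only sees echo writing to and closing an inherited fd 1. A sketch that traces the shell as well, assuming strace is installed and ptrace is permitted in the container; trace_shell_cmd is an illustrative name.

```shell
#!/usr/bin/env bash
# Run a whole command line (redirections included) under strace so that
# the openat() performed by the shell for "> file" appears in the trace.
trace_shell_cmd() {
  cmdline="$1"
  if ! command -v strace >/dev/null 2>&1; then
    echo "strace is not installed" >&2
    return 127
  fi
  # -f follows child processes; -e trace=... limits output to the
  # file-related syscalls of interest here.
  strace -f -e trace=openat,write,fsync,close bash -c "$cmdline"
}
```

Calling trace_shell_cmd 'echo test > world' from the mounted directory should then show the openat for world followed by the write and close on that descriptor.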

Ohh, yes.

Could you please try strace touch world7?

brk(0x55c0c91a2000)                     = 0x55c0c91a2000
openat(AT_FDCWD, "world7", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 3
dup2(3, 0)                              = 0
close(3) 

In my case, it first opens the file and then calls close().

All I get is this:

execve("/usr/bin/touch", ["touch", "world7"], 0x3ef923ed9d28 /* 13 vars */) = 0
brk(NULL)                               = 0x2a967b9ba000

The file exists though, so I suspect this is something to do with running strace in an .sh file?

Ah - if I use strace -f (which apparently 'follows' forks), output includes this:

[image: screenshot of the strace -f output]

Thanks for the response @kierenj !

I am getting similar output.

root@10884398fc7f:/gcs# strace touch world9
execve("/usr/bin/touch", ["touch", "world9"], 0x7ffe7ba249c8 /* 10 vars */) = 0
brk(NULL)                               = 0x55d758a80000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe05387c000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=23578, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 23578, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fe053876000
close(3)                                = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220s\2\0\0\0\0\0"..., 832) = 832
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
newfstatat(3, "", {st_mode=S_IFREG|0755, st_size=1922136, ...}, AT_EMPTY_PATH) = 0
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
mmap(NULL, 1970000, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fe053695000
mmap(0x7fe0536bb000, 1396736, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x26000) = 0x7fe0536bb000
mmap(0x7fe053810000, 339968, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x17b000) = 0x7fe053810000
mmap(0x7fe053863000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1ce000) = 0x7fe053863000
mmap(0x7fe053869000, 53072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fe053869000
close(3)                                = 0
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe053692000
arch_prctl(ARCH_SET_FS, 0x7fe053692740) = 0
set_tid_address(0x7fe053692a10)         = 286
set_robust_list(0x7fe053692a20, 24)     = 0
rseq(0x7fe053693060, 0x20, 0, 0x53053053) = 0
mprotect(0x7fe053863000, 16384, PROT_READ) = 0
mprotect(0x55d758879000, 4096, PROT_READ) = 0
mprotect(0x7fe0538ae000, 8192, PROT_READ) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
munmap(0x7fe053876000, 23578)           = 0
getrandom("\x91\xde\x73\xc3\x16\x58\x30\xbf", 8, GRND_NONBLOCK) = 8
brk(NULL)                               = 0x55d758a80000
brk(0x55d758aa1000)                     = 0x55d758aa1000
openat(AT_FDCWD, "world9", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 3
dup2(3, 0)                              = 0
close(3)                                = 0
utimensat(0, NULL, NULL, 0)             = 0
close(0)                                = 0
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++

Also fusermount --version in my case:

root@10884398fc7f:/gcs# fusermount --version
fusermount version: 2.9.9

I am forwarding this to the Cloud Run team for further investigation: why is GCSFuse not getting a FlushFile call in your case?

Thanks for your patience and understanding!

@kierenj Do you see the file in the GCS bucket when you create it with the touch command?

@kierenj Thanks for the patience!

Did you follow this doc to deploy the container?

gcloud run deploy filesystem-app --source . \
    --execution-environment gen2 \
    --allow-unauthenticated \
    --service-account fs-identity \
    --update-env-vars BUCKET=BUCKET_NAME

What's the --execution-environment you used while deploying the container?

If you are using gen1, could you please check with gen2?

Thanks,
Prince.

Hi @kierenj,

We are not able to reproduce the issue with --execution-environment gen2, but gen1 definitely reproduces it. Is this issue reproducible every time? If yes, can you please provide us with the exact steps to reproduce?

Note: If you are using the default value, the service runs on gen1.
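
For an existing service, the environment can also be set with an update instead of a new deploy. A small sketch using gcloud run services update with the same --execution-environment flag as the deploy command above; set_gen2 and the DRY_RUN variable are illustrative helpers, not part of gcloud.

```shell
#!/usr/bin/env bash
# Switch a Cloud Run service to the second generation execution environment.
# With DRY_RUN=1 the command is only printed, not executed.
set_gen2() {
  service="$1"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "gcloud run services update $service --execution-environment gen2"
  else
    gcloud run services update "$service" --execution-environment gen2
  fi
}
```

For instance, DRY_RUN=1 set_gen2 filesystem-app prints the command for the service name used in the deploy example, so it can be reviewed before running it for real.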

Thanks,
Ashmeen

Ah, you are right.

I saw this in the docs:

Important: Cloud Run jobs automatically use the second generation execution environment

(https://cloud.google.com/run/docs/about-execution-environments)

...and read this as gen2 being default. This is in the docs, under "Configure Services" --> "Environment" --> "Configure Environment".

This is my first time using Cloud Run, so I'm now guessing that Jobs are something else?

Shouldn't that note/docs be under the "Execute background jobs" section of the docs?

To me, "jobs" just seemed like a generic term for "workload".

Either way, specifying gen2 works perfectly, thank you. Personally I think the docs could be made clearer in this area: I saw that I needed gen2 when I started, then saw that note, and made an assumption. It's my error for sure, but maybe this..

[image: screenshot of the docs note]

...could be replaced with a table:

Workload type   | gen1        | gen2
Services        | Default     | via override
Background jobs | Unavailable | Default

?

Hi @kierenj,

Thank you for the suggestion. I will forward it to the Cloud Run team. As the primary issue has been resolved, I will close this bug. Please do not hesitate to reopen it if you encounter any further issues.

Thanks,
Ashmeen