C2FO / vfs

Pluggable, extensible virtual file system for Go


Update s3 backend to NOT write to memory before saving.

funkyshu opened this issue

Is your feature request related to a problem? Please describe.
Currently, when an s3.File invokes Write(), it writes to an in-memory byte slice. That slice is only written to the s3manager.UploadInput.Body (io.Reader) when Close() is called. This may have been done to facilitate writes after seeks; however, Seek() currently only seems to apply in a read context.
This has two consequences (illustrated in the sketch below):

  1. We risk unbounded memory growth on large files.
  2. io.Copy won't stream directly, so the actual upload to S3 is delayed until Close().
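
For context, a minimal sketch of the current buffered pattern; `bufferedFile` and its fields are illustrative stand-ins, not the actual s3.File internals:

```go
package sketch

import (
	"bytes"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

// bufferedFile mirrors the behavior described above: Write only grows
// an in-memory buffer, and nothing reaches S3 until Close.
type bufferedFile struct {
	uploader *s3manager.Uploader
	buf      bytes.Buffer // entire file contents accumulate here
}

func (f *bufferedFile) Write(p []byte) (int, error) {
	return f.buf.Write(p) // unbounded memory growth for large files
}

func (f *bufferedFile) Close() error {
	// Only now is the whole buffer handed to the uploader, so an
	// io.Copy into this file stalls the real S3 transfer until Close.
	_, err := f.uploader.Upload(&s3manager.UploadInput{
		Bucket: aws.String("my-bucket"), // hypothetical bucket/key
		Key:    aws.String("my-key"),
		Body:   bytes.NewReader(f.buf.Bytes()),
	})
	return err
}
```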

Describe the solution you'd like
When Write() is called, we should set up an io.Pipe() to provide the writer-to-reader plumbing for the s3manager.UploadInput.Body.
We should initialize a temp file.
We should use io.MultiWriter to create a writer that writes to both the PipeWriter and the temp file.
Close() should close the PipeWriter, sending an EOF and finalizing the s3manager.Uploader Upload(), as well as deleting the temp file (except in the case below). A sketch of this design follows.
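
Here's a rough, hedged sketch of that design, assuming aws-sdk-go v1's s3manager; `streamingFile` and all its field names are hypothetical, not existing vfs identifiers:

```go
package sketch

import (
	"context"
	"io"
	"os"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

// streamingFile is purely illustrative of the proposal above.
type streamingFile struct {
	pw     *io.PipeWriter
	tmp    *os.File
	w      io.Writer          // io.MultiWriter(pw, tmp)
	cancel context.CancelFunc // aborts the streaming upload after a Seek
	dirty  bool               // set when Write follows Seek
	done   chan error         // result of the streaming Upload
}

func newStreamingFile(uploader *s3manager.Uploader, bucket, key string) (*streamingFile, error) {
	pr, pw := io.Pipe()
	tmp, err := os.CreateTemp("", "vfs-s3-*")
	if err != nil {
		return nil, err
	}
	ctx, cancel := context.WithCancel(context.Background())
	f := &streamingFile{
		pw:     pw,
		tmp:    tmp,
		w:      io.MultiWriter(pw, tmp),
		cancel: cancel,
		done:   make(chan error, 1),
	}
	// The upload consumes the read side of the pipe concurrently, so
	// bytes stream to S3 as Write produces them instead of buffering.
	go func() {
		_, err := uploader.UploadWithContext(ctx, &s3manager.UploadInput{
			Bucket: aws.String(bucket),
			Key:    aws.String(key),
			Body:   pr,
		})
		f.done <- err
	}()
	return f, nil
}

// Write hits both the in-flight upload and the temp file.
func (f *streamingFile) Write(p []byte) (int, error) {
	return f.w.Write(p)
}

// Close sends EOF down the pipe, finalizing the Upload, then removes
// the temp file (the seek-after-write case is sketched separately below).
func (f *streamingFile) Close() error {
	f.pw.Close()
	err := <-f.done
	name := f.tmp.Name()
	f.tmp.Close()
	os.Remove(name)
	return err
}
```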

// Seek() is called after some Write()
If Write() is called after Seek(), we only write to the temp file. The in-flight s3manager.Uploader Upload() should be cancelled (via context.Context cancellation) so that it doesn't commit its upload to S3 (yet).
When Close() is called, the temp file should be uploaded to S3 via a new s3manager.Uploader Upload().
The temp file should then be removed. This path is sketched below.
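
Continuing the hypothetical streamingFile above, the seek-after-write path might look something like this (again an assumption-laden sketch, not vfs's actual API):

```go
// Seek aborts the in-flight streaming upload via context cancellation
// so the partial object is never committed, then seeks the temp file.
func (f *streamingFile) Seek(offset int64, whence int) (int64, error) {
	f.cancel()                            // cancel the streaming UploadWithContext
	f.pw.CloseWithError(context.Canceled) // unblock the pipe's reader
	<-f.done                              // wait for the aborted upload goroutine
	f.dirty = true                        // Close must re-upload from disk
	f.w = f.tmp                           // subsequent Writes hit only the temp file
	return f.tmp.Seek(offset, whence)
}

// closeDirty replaces Close's happy path when dirty is set: rewind the
// temp file, upload its full contents in a fresh Upload, then remove it.
func (f *streamingFile) closeDirty(uploader *s3manager.Uploader, bucket, key string) error {
	if _, err := f.tmp.Seek(0, io.SeekStart); err != nil {
		return err
	}
	_, err := uploader.Upload(&s3manager.UploadInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
		Body:   f.tmp,
	})
	name := f.tmp.Name()
	f.tmp.Close()
	os.Remove(name)
	return err
}
```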

Describe alternatives you've considered
N/A