Update s3 backend to NOT write to memory before saving.
funkyshu opened this issue
Is your feature request related to a problem? Please describe.
Currently, when an s3.File invokes Write(), it writes to a byte array. This array is only written to the s3manager.UploadInput.Body (an io.Reader) when Close() is called. This may have been done to facilitate writes after seeks; however, Seek() currently only seems to apply in a read context.
This has two consequences:
- We risk unbounded memory inflation on large files.
- io.Copy won't stream directly, causing a delay before the actual upload to S3 occurs.
Describe the solution you'd like
When Write() is called, we should set up an io.Pipe() to provide writer-to-reader plumbing for the s3manager.UploadInput.Body.
We should initialize a temp file.
We should use io.MultiWriter to create a new writer that writes to both the PipeWriter and the temp file.
Close() should close the PipeWriter, sending an EOF and finalizing the s3manager.Uploader Upload(), as well as deleting the temp file (except in the case below).
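The happy path above can be sketched with the stdlib only. This is a minimal, self-contained illustration, not the proposed implementation: `uploadViaPipe` and the goroutine draining the read side are stand-ins for handing the PipeReader to s3manager.UploadInput.Body and letting s3manager.Uploader Upload() consume it until EOF (an assumption about how the SDK reads Body).

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// uploadViaPipe mimics the proposed Write()/Close() flow: bytes stream
// through an io.Pipe to a stub "uploader" while being mirrored to a temp
// file for possible replay after a later Seek()+Write().
func uploadViaPipe(data []byte) (uploaded, mirrored []byte, err error) {
	pr, pw := io.Pipe()

	// Stub for s3manager.Uploader.Upload(): drain the read side until EOF.
	done := make(chan []byte)
	go func() {
		b, _ := io.ReadAll(pr)
		done <- b
	}()

	// Temp file mirror, kept in case a Seek()+Write() forces a re-upload.
	tmp, err := os.CreateTemp("", "s3-backend-*")
	if err != nil {
		return nil, nil, err
	}
	defer os.Remove(tmp.Name())
	defer tmp.Close()

	// Write() goes through a MultiWriter: pipe (streaming) + temp file.
	w := io.MultiWriter(pw, tmp)
	if _, err := w.Write(data); err != nil {
		return nil, nil, err
	}

	// Close() closes the PipeWriter, delivering EOF and finalizing Upload().
	pw.Close()
	uploaded = <-done

	if _, err := tmp.Seek(0, io.SeekStart); err != nil {
		return nil, nil, err
	}
	mirrored, err = io.ReadAll(tmp)
	return uploaded, mirrored, err
}

func main() {
	up, mir, err := uploadViaPipe([]byte("hello s3"))
	fmt.Println(err == nil, bytes.Equal(up, mir), string(up))
}
```

Note that io.Pipe writes block until the other side reads, which is exactly what makes the upload start streaming immediately instead of waiting for Close().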
// Seek() is called after some Write()
If Write() is called after Seek(), we only write to the temp file. The in-flight s3manager.Uploader Upload() should be cancelled (via context.Context cancellation) so that it doesn't commit its upload to S3 (yet).
When Close() is called, the temp file should be uploaded to S3 via a new s3manager.Uploader Upload().
The temp file should then be removed.
Describe alternatives you've considered
N/A