C2FO / vfs

Pluggable, extensible virtual file system for Go

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Using gs.File.CopyToFile after reading the file results in 0 byte file at the destination

aucampia opened this issue · comments

This is because the same reader is used for copying and for reading, so by the time you get to CopyToFile the reader is already at the end of the file.

Probably the simplest solution to this fix will be to just Seek(0,0) before doing the copy.

As a workaround the seek can be done in user code.

Any inputs on possible resolutions is welcome, I think it would be good to just call Seek(0, 0) and I will prepare a patch for that.

Is the behavior different for S3? OS?

I think the same problem will occur if the copy happens from S3 to non S3, and OS to non OS. But I can confirm copying from OS to OS works fine regardless of whether the file is read first.

The difference when it comes to google storage is that copying from gs to gs somehow does use reader. Which maybe should not happen.

I will submit unit test shortly for this, which may make it a bit easier to see what is up.

I think the safest resolution to this, which won't not affect anyone relying on state, is to create a secondary reader for the temp file.

But this will have to be fixed in other back-ends also for cases where where the file is copied to another back-end.

So this is a consequence of the fact that we chose vfs.File to implement io.Reader, io.Writer, I.Closer, io.Seeker. For whose we must keep state of cursor position.

However, I think it's fair to say that our intention for the MoveToFile, MoveToLocation, CopyToFile, CopyToLocation functions was that they were meant to always copy/move ALL of the contents of a give file such that Seek(0, 0) was assumed.

The question we're facing here is whether calling one of the move/copy functions should 1.) alter the state of the cursor position (Close is always called at the end of each so the position would return to 0 afterwards), or 2.) that the state of the cursor position used by io.XXX functions should remain independent of the Move/Copy functions.

Internal to C2FO code, I'd say we've never intentionally used both io.XXX and Move/Copy functions at the same time. I'm struggling a bit understand what desirable scenario there might be to do that. Should we instead simply prohibit (throw error) Move/Copy functions when cursor is not already 0,0? @Roosh513

You solution is of using an independent handle seems logical but what happens after a move operation. It's likely the original io.xxx handle will fail.

Should we instead simply prohibit (throw error) Move/Copy functions when cursor is not already 0,0? @Roosh513

This may be a good interim option, at the very least it would make people aware that something is happening which they likely don't expect. I think even in such a case subsequent io.XXX operations will have surprising results to users.

Will wait for more input, at the moment I am also working around this by setting cursor to 0,0.

Maybe the best option in long term is to have separate File and FileHandle concepts, and File being moveable, and FileHandle provider io.XXX operations. This will break compatibility though.

Probably gs.File.MoveToFile() and/or gs.File.Delete() should reset gs.File.tempFile?

@aucampia Sorry for taking so long. After meeting with the other devs here, we decide for now to return an error after Move/Copy functions.
We found the following to be true:

  • It was our original intent that one would use the io.Read, Seek, Write, Close functions independently of the Move/Copy functions.
  • Close() is an integral function in most implementations to "commit" a write to the file system. There are several reasons for this but partially to accommodate Seek. Because of this, it makes little sense to have an open handle before a copy/move.
  • In cases where we've done this in our own internal codebase, we've always follow the rule that we should first call Close(). See example:
    file, err := vfssimple.NewFile("s3://somebucket/path/to/file.text")
    _, err :+ file.Write([]byte("write some text")
    //commit the writes to s3
    err := file.Close()

    //only now should I call MoveToFile()
    err := file.MoveToFile(otherfile)

Adding an error if cursor is not at 0,0 should actually add safety to our own code, besides making it unambiguous to a new consumer of the library as what is happening.

Thanks for the input, I will look into this tonight to see the best way to implement this in code.