C2FO / vfs

Pluggable, extensible virtual file system for Go

Add support for listing all files in a location, including files in "subdirectories"

aucampia opened this issue

Currently List() only returns files directly under a specific location. For example, with Google Cloud Storage (gs), if I have f0.txt, f1.txt, d0/f0.txt, and d0/f1.txt in location loc, calling loc.List() returns only f0.txt and f1.txt, not the files under d0/. It would be nice to have a way to find files at arbitrary depth.
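
For concreteness, a minimal sketch of the current behavior on the example above, assuming the vfssimple helper and a placeholder bucket name (the /v6 import path may differ for other versions):

    package main

    import (
        "fmt"
        "log"

        "github.com/c2fo/vfs/v6/vfssimple" // import path assumes v6
    )

    func main() {
        // "mybucket" is a placeholder; loc contains f0.txt, f1.txt,
        // d0/f0.txt, and d0/f1.txt as described above.
        loc, err := vfssimple.NewLocation("gs://mybucket/loc/")
        if err != nil {
            log.Fatal(err)
        }

        names, err := loc.List()
        if err != nil {
            log.Fatal(err)
        }

        // Prints [f0.txt f1.txt]; the files under d0/ are not included.
        fmt.Println(names)
    }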

A possible name for the method is Location.ListAll(); any other suggestions would be appreciated.

This is something C2FO hasn't really had a use case for, but I'm definitely open to improving it. Personally, I've never liked the trio of List functions we provide. It might make more sense to have a generic ListFunc() function that accepts preset constant functions as well as user-provided ones. An API might look like:

    dirFiles := myLoc.ListFunc(utils.FileFilter)
    subdirs := myLoc.ListFunc(utils.DirFilter)
    recursiveFiles := myLoc.ListFunc(utils.AllRecursiveFiles)
    etc...
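
For what it's worth, a rough sketch of the shape those filters could take; every name below (ListFilter, FileFilter, DirFilter, AllRecursiveFiles) is hypothetical and not part of the current API:

    // All names here are hypothetical; nothing like this exists in vfs yet.
    package vfs

    // ListFilter reports whether an entry should be included in the
    // results and whether the listing should descend into it.
    type ListFilter func(path string, isDir bool) (include, descend bool)

    // FileFilter keeps only files directly under the location.
    var FileFilter ListFilter = func(path string, isDir bool) (include, descend bool) {
        return !isDir, false
    }

    // DirFilter keeps only immediate subdirectories.
    var DirFilter ListFilter = func(path string, isDir bool) (include, descend bool) {
        return isDir, false
    }

    // AllRecursiveFiles keeps every file and descends into every
    // subdirectory it encounters.
    var AllRecursiveFiles ListFilter = func(path string, isDir bool) (include, descend bool) {
        return !isDir, isDir
    }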

The trick is trying to remain efficient when you don't really intend to do recursive calls. Also, os is obviously going to recurse differently than S3 (which doesn't need to recurse at all).
Our rigid rule that a URI ending in / is a directory (note that we have terrible support for alternate dir delimiters like Windows' \) is actually helpful in determining type (Location vs File). This could help with recursion (without having to stat in os, for instance, to determine type).
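
As a sketch of how that rule could drive recursion, here is one way a generic walk might use the trailing-slash convention; the list callback stands in for a hypothetical per-location listing that returns subdirectory entries (marked with a trailing /) alongside full file paths:

    package walk

    import "strings"

    // listAll recursively collects file paths, treating any entry that
    // ends in "/" as a Location to descend into; no stat is required.
    // The list callback is hypothetical: it must accept a full path
    // prefix and return fully-prefixed files and subdirectories.
    func listAll(list func(prefix string) ([]string, error), prefix string) ([]string, error) {
        entries, err := list(prefix)
        if err != nil {
            return nil, err
        }
        var files []string
        for _, entry := range entries {
            if strings.HasSuffix(entry, "/") {
                // trailing slash marks a directory by convention
                sub, err := listAll(list, entry)
                if err != nil {
                    return nil, err
                }
                files = append(files, sub...)
                continue
            }
            files = append(files, entry)
        }
        return files, nil
    }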

Thanks for the input. I am not sure when or if I will work on this; we also do not have a use case for it right now, but we may in the future and will work on it if we need it.

You could potentially also add pagination to avoid bogging down the system. Recursive calls are very handy, but if a call returns 50,000 objects it becomes a bottleneck.
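
One possible shape for that is an optional paging interface a backend could satisfy; all names below are hypothetical, though object stores like GCS and S3 already expose continuation tokens that map onto this:

    // All names here are hypothetical.
    package vfs

    // ListPager could be an optional interface for backends that can
    // page through large listings instead of returning everything at once.
    type ListPager interface {
        // ListPage returns up to pageSize entries starting at token.
        // A returned next of "" means the listing is complete.
        ListPage(token string, pageSize int) (names []string, next string, err error)
    }

    // AllNames drains a pager page by page, so memory use per call is
    // bounded by pageSize even when the full listing is huge.
    func AllNames(p ListPager, pageSize int) ([]string, error) {
        var out []string
        token := ""
        for {
            names, next, err := p.ListPage(token, pageSize)
            if err != nil {
                return nil, err
            }
            out = append(out, names...)
            if next == "" {
                return out, nil
            }
            token = next
        }
    }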