guard / listen

The Listen gem listens to file modifications and notifies you about the changes.

Home Page: https://rubygems.org/gems/listen

Listener does not follow symlinks

akerbos opened this issue · comments

Consider the following structure:

main
|- sub1
|  |- file1
|- sub2 --> somewhere
   |- file2

where sub2 is a symlink to some other (read- and writable) directory. Assume we watch main and ignore nothing.

While changes to file1 are dutifully reported, changes to file2 are not.

With this scenario you aren't watching one path any more, so it's not going to be sane to make consistent across backends.

...which does bring up a good point that it should probably be documented somewhere at the least.

Well, if I follow the symlink via cd, pwd will suggest that I am in a subfolder of main.

In any case, afaik listen can do multiple watch directories, so maybe it can automatically detect and treat symlinks this way? (Beware the cycles!)

Yeah, we should be able to detect symlinks and automatically watch the symlinked directories. Three questions come to mind:

  1. What about symlinked files?
  2. Should the reported paths be the symlinked ones or the originals?
  3. What about directories with a lot of symlinks?

@Maher4Ever does this look feasible to you? Is it a good idea?

ad 1: Watch them, of course!

ad 2: I think the original ones would cause confusion. On GNU/Linux, symlinks are transparent for the user most of the time; this should extend to our situation here.

ad 3: Is there a reason to treat them differently from directories with a lot of files?

So if I'm watching a remote SSHFS mount... Obviously I don't want to do a recursive search for symlinks, never mind multiple levels down.

Mac fseventd calls realpath() on anything submitted to it so you don't have much choice in the matter when it comes to what to report. It's a bit frustrating actually, as the mac specific and unix APIs treat /private differently (among other things).

Perhaps following symlinks, if you want to provide that functionality, can be an optional feature?


This is an interesting bug/feature request, mainly because there are differences between what a "symbolic link" means across operating systems. Besides the notes that @ttilley mentioned above, I'm also concerned with how this feature should work on Windows. I initially thought Windows had no notion of a symlink, but it seems I was wrong.

@thibaudgg To answer your questions:

  • Symlinked files do get picked up by adapters (I just tested on Linux), so I don't think we should do anything more about this.
  • Returning the original paths would be easier to implement, but I agree with @akerbos that it would be very confusing to users. How about we add an option to choose which one to return?
  • In this situation the Ruby process would use a huge amount of memory, which might be confusing to some users if they only have a few symlinks in the watched directory. This is why I agree with @ttilley that this should be an opt-in feature (disabled by default).

Nevertheless, I think it is a good feature to have in Listen. So props for @akerbos for reporting this limitation. I'll investigate the feasibility more this weekend before deciding how to implement this. @thibaudgg Seems good to you?

@ttilley As you mention it, most GNU tools allow you to switch following symlinks off. I think this is a very reasonable thing to do.

@Maher4Ever I have never used any Windows after XP, so I'd have to trust Wikipedia: "Symbolic links are designed to aid in migration and application compatibility with POSIX operating systems. Microsoft aimed for Vista's symbolic links to 'function just like UNIX links'". So hope is not lost?

As for the reported paths, switching between both is certainly the most defensive way of approaching things, especially because operating systems may have different conventions in the first place.

Is following symlinks more expensive than looking into a "local" folder, assuming both the current folder and the link's target reside on the same file system (or at least machine)? I'd say users are responsible for what they set out to watch; it is not as if symlinks just pop up, you add them specifically to include something in that place (that said, make sure to ignore ..! :>). So my vote is for opt-out, but I can live with opt-in, too.

Good luck in implementing this, I am confident you will find a good solution!

@Maher4Ever Yeah sounds good to me, I would be conservative (but I stay curious ;-)) and choose the opt-in too. Thanks!

After I looked at how the OS-monitors work when it comes to symbolic directories, here is what I've learned:

  • Linux: Inotify does follow symbolic directories and reports changes with paths inside the symbolic directory (not the realpath where the event actually happened).
  • Mac: fsevent does not follow symbolic directories.
  • Windows: I'm still not sure if we would keep supporting FChange for Windows (because of the performance issues), so I didn't check it there yet.
  • All systems: polling could support symbolic directories if a list of all files inside that symbolic directory is stored for comparison later on.
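The polling idea in the last bullet can be sketched roughly as follows. This is only an illustration, not Listen's actual directory record: `snapshot` and `diff_snapshots` are hypothetical helper names, and using the real path of each directory to break symlink cycles is an assumption of this sketch.

```ruby
require 'set'

# Build a { path => mtime } snapshot of every file reachable from +root+.
# File.stat follows symlinks, so symlinked files and directories are included.
# +seen+ holds the real paths of directories already visited, which stops
# the walk from looping forever on a symlink cycle.
def snapshot(root, seen = Set.new, result = {})
  real = File.realpath(root)
  return result if seen.include?(real)
  seen << real

  Dir.children(root).sort.each do |name|
    path = File.join(root, name)
    begin
      stat = File.stat(path) # fails for dangling symlinks
    rescue SystemCallError
      next
    end
    if stat.directory?
      snapshot(path, seen, result)
    else
      result[path] = stat.mtime
    end
  end
  result
end

# Compare two snapshots and return [modified, added, removed] path lists.
def diff_snapshots(before, after)
  modified = after.keys.select { |p| before.key?(p) && before[p] != after[p] }
  added    = after.keys - before.keys
  removed  = before.keys - after.keys
  [modified, added, removed]
end
```

Paths are reported as seen through the symlink (main/sub2/file2), not as real paths. One side effect of the cycle guard: if the same physical directory is reachable through two symlinks, only the first one visited shows up in the snapshot.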

Given these facts, we would need to implement this feature differently in almost every adapter, which is not how the adapters work right now. That's because on Mac a watcher would need to be assigned to each symbolic directory, whereas inotify on Linux doesn't need that. Polling would need some major changes to support symbolic directories.

As @ttilley mentioned above, Mac returns the realpaths of the changes; inotify, on the other hand, doesn't. This means some sort of path conversion would be necessary, depending on which type of paths we want to report. Even then, I wonder whether we could keep the results (paths) consistent across the adapters.

What we could actually do is add a :follow_symlinks option to the multi-listener. I like this idea for two reasons:

  1. Using the multi-listener would suggest that we consider the symlinks as separate directories. The user could expect that we would return the realpaths of the changes, not paths inside the symlink (like inotify does).
  2. The :relative_paths option is already disabled on the multi-listener, and we wouldn't want it for symlinks anyway.

The multi-listener would accept a single normal directory. Then we could simply scan the watched directory for any symbolic directory and add it to the watched directories. This approach would also keep the results consistent, because no magic would be needed to ensure the consistency across the adapters.
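For illustration, the scanning step could look something like this. It is a sketch under assumptions, not the gem's code: `watch_roots` and `covered?` are made-up names, and skipping targets that already sit below a collected root is a choice made here to avoid watching the same physical directory twice.

```ruby
# True if +path+ is +dir+ itself or lies somewhere below it.
def covered?(path, dir)
  path == dir || path.start_with?(dir + File::SEPARATOR)
end

# Return the real path of +root+ plus the real path of every symlinked
# directory found below it (breadth-first, following links inside targets).
def watch_roots(root)
  roots = [File.realpath(root)]
  queue = [root]
  until queue.empty?
    dir = queue.shift
    Dir.children(dir).each do |name|
      path = File.join(dir, name)
      if File.symlink?(path)
        next unless File.directory?(path) # skips symlinked files and dangling links
        target = File.realpath(path)
        next if roots.any? { |r| covered?(target, r) } # already watched, also breaks cycles
        roots << target
        queue << target # the target may contain symlinks of its own
      elsif File.directory?(path)
        queue << path
      end
    end
  end
  roots
end
```

The resulting list is what would be handed to the multi-listener. Note that it only reflects symlinks that exist at scan time; links created later would need a rescan.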

I would love to know your thoughts about the second approach, as it seems the least problematic one to me.

Huh, I guess Inotify is not used on my (Linux) machines, then?

I think your (second) proposal breaks consistency a bit. "Listening to a single directory works with A, but not if it contains symlinks; use B then. B is also used to listen to multiple directories." This does not make much sense from a user's point of view. Also, I have to know beforehand whether my directory contains symlinks or not; I might end up using the multi-listener just in case. Which prompts the question: why are there even two components? As far as I can see, the multi-listener is (conceptually) only a wrapper around multiple single-listeners; if that is correct, hide away the single one. This would make for a narrower interface, which is always good.

That said, handling symlinked (sub)directories as if they were independently watched directories feels awkward, but I can't see any hard disadvantages. Tools using listen might be confused because users provide three paths but seven different paths are reported; if the behaviour is well-documented (which I assume) this is no problem, I guess. Rewriting the paths to match the originally provided hierarchy is probably the most intuitive to use, but causes a runtime penalty. Trade-off time!

@ttilley I think all points made by @akerbos are valid. It would be great to have the same behavior on both sides, even if we need to remove the :relative_paths option from the SingleListener for that.

I did some more digging lately to see how to get following symlinks working on Linux and Mac in a consistent way, and it seems possible without the weirdness of the MultiListener approach. As noted by @akerbos, using the MultiListener approach to implement this feature would require the user to know beforehand whether the watched directory has symlinks.

@akerbos The only difference between the Listener and the MultiListener is the ability to get relative paths in the callback when using the Listener. Since Listen is still a relatively new gem, I can't say for sure whether the relative-paths feature will be used, so I think it would be wise to wait a while before removing one of the listeners.

@thibaudgg I'll implement this feature in the adapters and the directory-record, so the listeners won't have to change at all.

@Maher4Ever that's perfect. Thanks!

@Maher4Ever My current LaTeX project is packed with symlinks, so I thought I'd ping this one. Are there plans for working on this in the near future?

I just tried it with Ruby 1.9.3, listen 0.5.3 and rb-inotify 0.8.8. It does seem to look into symlinks, but it does not detect cycles/loops and consequently does not terminate. (rb-inotify is apparently not a dependency of listen, I had to install it manually. Bug or feature?)

Without rb-inotify, symlinks are not followed (same as before).

I posted an issue at rb-inotify, too.

@akerbos Yeah, it's a feature: you need to install rb-inotify manually. If you don't, you get a warning message.

@thibaudgg Right, I saw the warning only later.

Huh, now I can't reproduce listening in symlinked folders, even with rb-inotify. O.o (It's there, I checked with an explicit require 'rb-inotify'.)

I'd like to add another usecase as to the value of following symlinks. We have a product that is customized to many customers, and we switch customers by linking in some configuration files via symlinks from a 'customer' folder.

Now I want to trigger a listen action whenever one of those files changes - but I'd like to watch only the symlinks as to not trigger the action when any of the customers (many of course) are touched.

So some way of getting listen to watch symlinks would be greatly appreciated.

Correcting my earlier comment: watching files in symlinked folders works (at least with rb-inotify), but changes in symlinked files are not detected.

Closed because no activities, feel free to reopen if interest is back.

@thibaudgg I'm still interested in watching changes in symlinked files; or did you mean interest from contributors?

Yeah contributors, I personally don't need and don't have the time to implement that. But a pull request with that feature would be very welcome!

In that case, you should maybe leave the issue open so contributors can see it as, well, open issue. :)

@akerbos that's fair enough. :)

Ugh, I just wasted 2 hours tracking this down before finding out it was a known issue :-( It's a pretty bad bug, so IMHO it should be listed in the README ...

@aspiers sorry to hear that, I just added a pending features list in the README. 5a1dfe1

Thanks a lot, really appreciate that. Worth adding to the guard README too?

We already say File system changes handled by our awesome Listen gem. in the Guard README, it should be enough no?

No I don't think it's quite enough, because people install guard via bundler without paying much attention to what dependencies it pulls in. I'll submit a PR ...

Ok, thanks in advance for the PR.

Filed as guard/guard#440. BTW this issue should be labelled as a bug, not a feature request, since normal users will certainly expect symlinked directories to work. Thanks!

Right, set to bug.

+1 for watching symlinks. Potentially useful for shared SASS libraries to include into your codebase, especially in large companies with multiple websites who want a consistent look across them.

compass watch --poll works for me

Is there any workaround, like using polling instead? I don't mind about performance, since we only have a dozen files.

Creating a special guard directory where you hardlink all the files you want to watch might work. (Haven't tried this yet though I considered it before...)
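A rough sketch of that hardlink idea, untested against guard itself: `hardlink_into` is a hypothetical helper, and it assumes the files and the watch directory live on the same filesystem (hard links cannot cross filesystems or point at directories).

```ruby
require 'fileutils'

# Hard-link each file in +files+ into +watch_dir+ (created if missing) and
# return the new paths. Symlinks in +files+ are resolved to the real file first.
def hardlink_into(watch_dir, files)
  FileUtils.mkdir_p(watch_dir)
  files.map do |file|
    source = File.realpath(file)
    link   = File.join(watch_dir, File.basename(file))
    File.unlink(link) if File.exist?(link) || File.symlink?(link)
    File.link(source, link) # raises if source is on another filesystem
    link
  end
end
```

Because both names refer to the same inode, content written to the real file is visible through the link in the watched directory. Editors that save by writing a new file and renaming it over the old one break the link, so the helper would have to be re-run after such a save.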

@phuongnd08 at the moment there's no workaround, but once the Listen 2.0 rewrite is done I'll work on that feature.

I ended up using rsync. It requires one extra window to run rsync, but it works.

@thibaudgg We are at 2.2 now so I thought I'd ping this. Have you been able to cover any ground?

@akerbos I have thought about it and I see a lot of implementation issues with symlinks. Maybe we should reduce the scope of symlink support at the beginning (i.e. only follow symlinks that already exist on start). What do you think?

But maybe Celluloid::IO with nio4r will support symlink properly (#159).

@thibaudgg I have to admit that I am not immersed deeply enough in the technicalities to see "a lot of issues". As far as I know, symlinks are transparent (on Linux), so one should be able to treat them as regular files. All you'd have to do is detect cycles.

I have no knowledge about the situation on other platforms, however.

If new symlinks (created while listening for changes) are a problem (why?), sure, ignoring such additions while resolving symlinks that have been there since startup is more than we have now.

Yes, it's mostly due to how symlinks are handled on other platforms and how the adapters work, i.e. on OS X, if a new symlink is detected, the rb-fsevent adapter needs to be completely restarted for each directory. Listen hasn't been implemented for that, so it would need a significant rewrite.

However, have you tried the latest Listen version on Linux? The rb-inotify event flags don't include the dont_follow flag, so symlinks should be supported (on Linux only).

I am on Linux 3.8.0-33-generic #48-Ubuntu SMP x86_64 GNU/Linux with gems

celluloid (0.15.2)
ffi (1.9.0)
listen (2.2.0, 1.2.2)
parallel (0.9.0)
rb-fsevent (0.9.3)
rb-inotify (0.9.2, 0.9.0)
rb-kqueue (0.2.0)
timers (1.1.0)

and changes in symlinked (ln -s) files (or in real files in symlinked directories) do not register. I do

$jobfilelistener = 
      Listen.to('.',
                latency: 0.5,
                ignore: [ <some stuff> ],
               ) \
      do |modified, added, removed|
        $changetime = Time.now
      end
$jobfilelistener.start

Do I have to do something differently?

Could you try rb-inotify directly to see if it works in that case? Thanks!

If I do something like

gem "rb-inotify"
require 'rb-inotify'
gem "listen"
require "listen"

notifier = INotify::Notifier.new
notifier.watch("symlinked_dir/file", :modify) do |event|
  puts "rb-n: #{event.name} modified!"
end

listener = 
      Listen.to('symlinked_dir/file',
                latency: 0.1) \
      do |modified, added, removed|
        puts "listen: modified!"
      end

listener.start
notifier.run

and change the file, I only get:

rb-n:  modified!

Note that name is empty; can that be a problem for you?

Mmm, listen only watches directories, and only uses the [:recursive, :attrib, :create, :delete, :move, :close_write] event flags at the moment. Could you try with:

gem "rb-inotify"
require 'rb-inotify'
gem "listen"
require "listen"

events = [:recursive, :attrib, :create, :delete, :move, :close_write]
notifier = INotify::Notifier.new
notifier.watch("dir_with_symlinked_content", *events) do |event|
  puts "rb-n: #{event.name} modified!"
end

listener = Listen.to('dir_with_symlinked_content') do |modified, added, removed|
  puts "listen: modified!"
end

listener.start
notifier.run

Curious, both ways work for files in symlinked directories.

But: one instance of my program detects that a file changes in a symlinked directory, but another instance does not. That's probably why I did not realise it works (sometimes); I work far more often in the latter case. With your minimal example, however, watching both directories works. The only difference I can spot is that the failing directory has an _ in its name; can that screw with your code that manages the list of watched directories?

Changes in symlinked files are not detected by either variant in either directory. Adding :modify does not help.

Maybe it's related with #163, running multiple Listen instances is broken since 2.0.

As in globally (bad) or as in per program (tractable)?

I'll test later with only one instance running at any time. It's certainly the case that I usually have multiple sessions running.

I skimmed #163; I actually use two listeners per program instance but on disjoint file sets and both seem to work fine (will check later, too).

ok thanks, I'll try to fix #163 this week

@akerbos supervisor branch should fix #163, could you give it a try please? Thanks!

This might still be an issue that I am experiencing with SASS 3.3.4.

TL;DR: there are plans ahead for new listen API that could help here

I'm planning a new API for listen and some features, which would include multiple adapters listening on different directories and running with different options.

Then, for some, the quick-and-dirty solution may then be to just setup a TCP listener - and then run broadcasting listen instances in the directories.

The effect? Same as symlinks giving relative paths. It's not as convenient to set up, but in a worst case scenario at least polling is done by a separate process ;)

From a user's perspective, the key is watching directories (and not files - as many intuitively want to) - and then the adapter (!) should always translate absolute paths relative to the directory it was told to listen to. That's so apps would work.

There are two topics to separate: changes and notifications.

E.g. you could have one file changed, but due to symlinks, you may need many notifications.

And that's why you can't "resolve" symlinks.

Watching symlinked content conceptually means:

  1. watching the directory(s) with the symlink(s)
  2. likely watching the directory containing the link target(s)
  3. watching the linked target itself (usually - only if it's a directory)

Because:

  1. If the symlink changes, the path to the same content may no longer be accessible
  2. If the symlink target is renamed or deleted, same thing
  3. And of course, if you move a subdirectory within the linked target directory, the files there are no longer accessible.

This means watching at least 3 objects for one symlink - and then the adapter would have to decide whether all the "different" notifications from each are really different (not as absolute paths, but whether they're on the same path that was given for watching, which could contain symlinks).

Just as if you'd watch the same directory 3 times, it makes sense to get 3 identical notifications. But if you're watching 1 and the adapter is setting up many - the adapter has to remove "duplicates" (which depends not on absolute paths, but on symlinks along the way).
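To make the "one change, many notifications" point concrete, here is a small sketch. Everything in it is hypothetical: `notifications_for` is not part of Listen, and `links` stands for a symlink-to-real-directory map that an adapter would have had to collect beforehand.

```ruby
# True if +path+ is +dir+ itself or lies somewhere below it.
def inside?(path, dir)
  path == dir || path.start_with?(dir + File::SEPARATOR)
end

# +real_paths+  raw events as real paths (several watchers may report the same one)
# +watched_root+ the directory the user asked to watch
# +links+        { symlink path inside the watched tree => real directory it points at }
# Returns each path under the watched tree through which a change is visible,
# with duplicates removed.
def notifications_for(real_paths, watched_root, links)
  real_paths.flat_map do |real|
    visible = []
    visible << real if inside?(real, watched_root)
    links.each do |link, target|
      next unless inside?(real, target)
      suffix = real.delete_prefix(target)
      visible << (suffix.empty? ? link : File.join(link, suffix))
    end
    visible
  end.uniq
end
```

With two symlinks pointing at the same directory, a single changed file yields two notifications, while the same event reported twice by different watchers collapses into one. The duplicate check works on the translated paths, not on the absolute ones, which is the distinction made above.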

The ideal solution: letting the user decide which directories to watch and for what purpose - and maybe even how to "route" or "map" paths. That means a lot less notifications, a lot less confusion, more flexibility, etc.

"Mapping/routing" may not seem to make sense unless you realize that notifications don't have to produce accurate paths, as long as you know which tools to run (in response), and as long as those tools know where things are.

Case in point - TCP listener, where none of the notifications represent accessible paths if they're on another machine - symlinked or not.

I think the inotify adapter is a great reference, because it implements its own (separate from inotify) recursion handling (but recommends using absolute paths anyway) and when it "doesn't work as expected" it's usually for good design reasons.

Anyway, I'm interested in feedback about this.

P.S. To balance things out, the current available event types will be "dumbed down" and simplified - but for good reasons.

reproduced here with guard-sass and guard 1.8.3, very annoying bug...
a fragile workaround was to run:

sass_locations.map!{ |l| l = "#{File.dirname l}/#{File.readlink l}" while File.symlink? l; l }

After lots of thinking, resolving symlinks should be done at the application level (if possible) or the directories should be reorganized so that watched files are "physically" in the watched directory.

If someone has an issue that cannot be worked around without changes in listen (or a specific adapter), it's best to have a valid real-life example to work with, because otherwise there are just too many edge cases to deal with (and everyone will have a different opinion about the "most intuitive" solution).

Since adapters can't work consistently (by definition), I think it's fine to have more adapter-specific options to let users tweak the behavior to whatever they need.

@e2 somebody told me it is possible to watch symlinks with the inotify adapter, but I could not find a way to select this adapter with guard. How do I do it?

@e2, I'm not sure I follow. On Linux, symlinks are essentially transparent; it's not at all clear to me where problems (beyond cycles, which are easy to detect) should arise. From an algorithmic point of view, breadth-first search (up to a certain depth, as an option) on the file system graph, starting in all specified roots, should yield a well-defined set of inodes that are listened to. Can you please detail what issues you see?

As for a real-life example, I create exercise sheets for several lectures across several years. All of them include parts of the same (LaTeX) preamble, those of the same lecture share other parts and get their problems from a central folder per lecture (which survives the years). Of course, I set up all of this with symlinks.

@brauliobo - inotify is the native adapter on Linux, and it's used unless you force polling

@akerbos - to avoid a long discussion, do you personally have a specific issue with symlinks on Linux that doesn't currently work as you'd want it to?

Regarding loops, there's: #259 (although I haven't investigated where the loop is traversed exactly - there's also an issue in rb-inotify - guard/rb-inotify#21).

In short, I don't want Listen to list "symlink support", because:

  1. means different things to different people (!)
  2. people will make false assumptions (at the very least getting already resolved paths can be surprising - and can break apps relying on path substitution)
  3. every adapter gives different results on plain files and directories anyway
  4. writing acceptance tests for such cases is not worth the effort (as of now)
  5. since results are adapter specific, it's better for people to use specialized libraries for their scenario (especially if their needs are uncommon)
  6. listen is too complex as it is to take on responsibilities easily handled by other tools or workarounds (e.g. hardlinks, mount/bind, TCP, multiple listen instances, using backend libs directly, resolving symlinks yourself, etc.)
  7. listen's responsibility isn't well defined for files and directories to begin with (it's defined by use cases - which I believe is more than sufficient)
  8. changing the directory structure while listen is running may not provide predictable results
  9. symlinks are mostly relevant only on Linux and OSX - so any "special" symlink handling seems to go against the idea of cross-platform functionality
  10. people should be careful with symlinks and specific about what they expect about handling them - so the less assumptions listen makes about them, the better
  11. performance reasons in some cases (notably - Record building, and relative path building)

Also, only hardlinks are technically transparent (handled by the OS) - symlinks are transparent only if you're not dealing with path processing (which is what Listen does).

I've changed my thoughts about symlinks in Listen...

Basically, we can close this issue if this patch gets pulled in: #273

Why? Because it would guarantee that there's max one reference (symlink or not) to a real subdirectory within a watched directory. This makes it almost trivial to resolve back and forth (symlink <-> realpath) if ever needed.

Overall, it's the user's responsibility to avoid "filesystem loops" or even avoid listening to the same physical directories (within a single adapter thread at least) - there's no way Listen can know a user modified something through a symlink or not, while reporting real paths is counter-intuitive.

And feedback is appreciated before I break the world with this patch...

@e2 Cool, thanks! This sounds like what I was thinking about (let the user care about loops). I'll head over to the other issue for the specifics.

I'm closing this since any issue with symlinks will be related to: #274 which needs to be implemented first.

Basically, recursion in Listen prevents implementing smart symlink handling (smart, meaning avoiding loops, avoiding watching the same physical directory multiple times and reporting symlink paths as changed and not the physical ones).

Fixed code to resolve links:

item = "#{File.dirname item}/#{File.readlink(item).gsub /\/$/, ''}" while File.symlink? item
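The one-liner above joins the link target onto the link's directory, which breaks when the target is an absolute path, and it loops forever on a link cycle. A slightly more defensive variant, offered only as a sketch (`resolve_link` and its hop limit of 32 are made up for this example):

```ruby
# Resolve +path+ until it is no longer a symlink, handling both relative and
# absolute link targets. +limit+ bounds the number of hops so that a link
# cycle raises Errno::ELOOP instead of looping forever.
def resolve_link(path, limit = 32)
  limit.times do
    return path unless File.symlink?(path)
    # expand_path returns an absolute target unchanged and resolves a
    # relative one against the directory that contains the link.
    path = File.expand_path(File.readlink(path), File.dirname(path))
  end
  raise Errno::ELOOP, path
end
```

This only resolves the final component, like the original snippet. If symlinked parent directories should be resolved as well, Ruby's built-in File.realpath(path) does that in one call.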

After still a few more attempts to wrap my head around this, I believe these 2 feature requests would help make Listen "intuitive" with regards to symlinks: #280 and #279

Hey, is it fixed or not? I want to listen to a symlink to my JSON file with localization.
I do:

guard :shell do
  watch(%r{(?<path>^.+?)/localization.json}) do |m|
    n "I AM HERE!" #
    if system("cd #{m[1]} && ruby ios_localization_manager.rb")
      n "#{m[0]} is correct", 'JSON Syntax', :success
    else
      n "#{m[0]} is incorrect", 'JSON Syntax', :failed
    end
  end
end

and nothing happens when I change original file

@lolgear - this is a complex issue. And it depends what platform you're on. Or the adapter you're using - if you use polling, it should probably work.

Otherwise, it's a problem, because if you watch symlinked_foo/localization.json, the directory foo has to be watched, which means the event you'll get will be changed: foo/localization.json (not symlinked_foo/localization.json, so the watch pattern won't match).

Just read my above comment - these need to be implemented: #280 and #279

It's a bit of work (it's like creating another abstraction layer above the filesystem) and few people need this.

The solution: do it the other way around. Watch the real file, and symlink it where other tools need it.

Modifying a symlink does not work with the listen gem in Rails when deploying the application with Capistrano. Every deployment generates new files in a release directory and re-points the current symlink at it, but listen keeps looking at the older directory.
Creating the symlink for the first time works fine, but after the symlink is modified a second time listen raises an error.

In the browser, when hitting the server, we get this error:
"Errno::ENOENT No such file or directory @ realpath_rec - /var/www/releases/20170306..."

listen (3.0.8) lib/listen/adapter/config.rb:17:in `realpath'
listen (3.0.8) lib/listen/adapter/config.rb:17:in `realpath'
listen (3.0.8) lib/listen/adapter/config.rb:17:in `block in initialize'
listen (3.0.8) lib/listen/adapter/config.rb:16:in `map'
listen (3.0.8) lib/listen/adapter/config.rb:16:in `initialize'

Note: cap deploy command for symlink -
execute "cd /var/www/retailrecycle && ln -s ./releases/#{File.basename release_path} current"

Thanks for help