xopxe / lumen

Lua Multitasking Environment.

Lua Libev Support?

gschmottlach opened this issue · comments

Have you considered the possibility of integrating a libev-based main loop using something like the Lua libev binding (https://github.com/brimworks/lua-ev)? Your scheduler looks really nice, but it remains unclear (to me) how I could integrate it with Lua libev (which I use extensively) and a Lua D-Bus binding I've written that uses it. I see you have Lua socket and nixio back-ends, and I would expect there might be a way forward to integrate a more general-purpose main loop like libev. Can you share your thoughts, provide pointers on how this could be done, etc.? I'd really be interested in an implementation that would support it.

Thanks,

Glenn

Interesting question.

The problem I see is that Lumen is a "main loop" type scheduler, just like libev. So any integration will be either ugly or invasive.

The ugly alternative is to write a selector-like backend, just like the ones for luasocket or nixio. That would imply finding a way to do a poll()-like call on lua-ev: you have to provide an implementation for the step() function as found in selector-nixio.lua. That is ugly because you are effectively undoing all the work libev has to offer :) Lumen would remain unchanged; it would just have an additional (convoluted) backend for interfacing with the OS.

The invasive method is to make the Lumen scheduler itself lua-ev-aware. That means adding code to the go() and step() functions in sched.lua. I think rewriting the main loop found in go(), so that it depends not on the M.Idle() call but on a lua-ev callback, is enough to have the test.lua app running. Some more work would be needed for nice support of sockets, etc. This would transform Lumen into a lua-ev-based framework, providing some methods for inter-task communication (signals, pipes...).

I think I like your second (invasive) option best, since it actually leverages the libev main loop. The Lua socket support would be nice but not a necessity for my applications. I guess you could also use libev to poll/select on the descriptors returned from the socket operations. I haven't had much experience with the Lua socket binding, so it's not clear what I'd be missing. Definitely needs more investigation. Any other thoughts you might have would be appreciated.

Ok, I've never written a libev app, so details may be off, but here are some ideas:

The sched.lua go() function just calls step() repeatedly, and step() returns how much time is free until the next timeout, so the main loop knows how long to idle. This would be replaced with a single timer callback that would call step() and re-schedule itself with the returned idle time. (Notice that even if the returned idle_time is 0, it must be possible for other callbacks to be triggered (see next)). Pseudocode:

function go()
  libevcallback timer() 
    idle=M.step()
    timer.schedule(now+idle)
  end
  timer.schedule(now)
  libev.go()
end
  1. Any interruption of the idle time would come from another libev event, presumably file or socket I/O. A very simple callback for a socket would say something like sched.signal(skt_data_event, data), which would pass control to Lumen and allow it to walk through all tasks waiting for a socket signal. Ideally, these I/O callbacks would be created from a library with the API from selector.lua, which would allow it to be used by all provided Lumen modules (http-server, shell, etc.)
  2. Such a selector backend would actually be a simplification of any of the provided ones. First, it would not need a task of its own, as the others create in their init() function (that task is the one that does a select/poll call, which libev takes care of). Then, the method for creating, for example, a UDP socket would initialize the socket and create a callback for it. The code of the callback itself would do something equivalent to what selector-luasocket does inside the step() function once select/poll unblocks with a socket. Pseudocode:
M.new_udp = function (address, port, locaddr, locport, pattern, handler)
  local skt = createudpsocket(...)
  libevcallback activity(data)
    if closing then libev.unschedule(skt); return end
    do_something_with_the_specified_pattern() -- :)
    handle_incoming(skt, data) -- this is real, can be taken from existing backends
  end
  libev.schedule(skt, activity)
end

This would be it.
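Filling in the two pseudocode fragments above with the actual lua-ev binding might look roughly like this. This is only a sketch: the lua-ev calls (ev.Timer.new, ev.IO.new, timer:again) are from memory of its README and may be off, and skt_data_event / the watch helper are placeholders, not real Lumen names.

```lua
local ev    = require 'ev'     -- brimworks/lua-ev
local sched = require 'sched'  -- Lumen's scheduler (M above)
local loop  = ev.Loop.default

-- 1. The timer-driven main loop: run one scheduler step, then
--    re-arm the timer with whatever idle time step() returns.
local step_timer = ev.Timer.new(function(l, timer)
  local idle = sched.step()
  -- a repeat of 0 is not valid for a timer, so use a small floor
  timer:again(l, (idle and idle > 0) and idle or 0.001)
end, 0.001)
step_timer:start(loop)

-- 2. An I/O watcher replacing the backends' select/poll: socket
--    activity is injected into Lumen as a signal.
local function watch(skt, fd)
  local w = ev.IO.new(function()
    sched.signal(skt, 'skt_data_event')  -- wake tasks waiting on skt
  end, fd, ev.READ)
  w:start(loop)
  return w
end

loop:loop()  -- libev owns the main loop from here on
```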

Thanks for your help. I managed to hack together sched.lua and added another selector back-end based on a modified luasocket back-end (called, appropriately, selector-ev). I couldn't completely eliminate the task in selector-ev. It works, but I suspect there must be a better way. It appears that trying to emit signals etc. from a pure libev handler (without being synchronized so that it runs within the context of a task) does not work correctly. This likely reduces its usefulness for my application. I suspect I'd have to go to the effort of synchronizing all the libev signal handlers to only execute on a scheduled task. That sounds like a tall order at this point without a better understanding of your entire architecture. Regardless, thanks for the insights, and I'd be interested if you ever contemplate a true libev-based scheduler. My hack probably isn't good enough ;-)

Yes, thinking a little more about it, I believe what I outlined is not enough: it does not handle some basic cases, like a shorter timeout getting set within the handler of a socket... Probably every time a waitd with a timeout is activated, a libev event should be set up. And indeed, it did not occur to me that a signal being emitted outside a Lumen task might have consequences. I do not actually know, without looking at my own code...
If anything, the correct solution is to be more "libev-powered". This would probably lead to an important simplification of the scheduler. It would not be Lumen anymore, though. It's not that your hack isn't good enough; it is not invasive enough :)

One of the motivations for Lumen was being able to write concurrent applications that would deploy unchanged from plain PCs to consumer-grade wireless routers with OpenWRT, without ever having to touch a line of C or a Makefile (cross-compiling makes my head hurt :) ). So if/when lua-ev can give me that just like the nixio or luasocket combo, I would probably ditch the current implementation and do a libev-based rewrite from scratch.

Agreed, my changes probably were not "invasive" enough to really lead to a good solution. In fact, I wonder if it's really possible. It seems the only reason Lumen works well with luasocket or nixio is that they don't require you to use a particular "main loop" . . . in which case you can use the one implemented by the "go" method. Introducing a foreign main loop (like libev) leads to issues with the architecture, as I experienced. Callbacks (e.g. handlers) invoked outside the scope of the scheduler coroutine lead to problems (sort of like an ISR).

Just to let you know, a colleague of mine ported Lua to a SHARC DSP and successfully leveraged Lumen in that context as a simple scheduler. Seems to work for him since there is no need for libev in an environment where there is no traditional "OS" . . . we're talking bare metal. Anyways, interesting project. Thanks for your feedback.

A thought occurred to me. Is there a way for a "foreign" coroutine (e.g. one the scheduler doesn't know about) to either signal a task or enqueue a message to a task (either directed or broadcast)? The "foreign" coroutine would be a libev callback (e.g. one of its event handlers) which (when viewed as a pseudo interrupt service routine (ISR)) might be able to wake up a regular task and perhaps feed it data. I tried this before (inadvertently) from a libev IO event on a file descriptor, but the signal that was emitted wasn't picked up by any of the waiting tasks (the shell task in this case). If there were a way to do this, it might serve as a vehicle to inject events into the scheduler from an arbitrary libev event (or even another standard (but unregistered) coroutine). Have you ever tried this? Is there a mechanism that might support this?

Ok, I created a branch and will do a few tests.

One approach is as in tests/demo-ev.lua (libev branch).

Basically create the ev watchers from a lumen task, and then let it loop as shown (the "while true" loop in ev_task, plus the step function). Notice that the code replaces lumen's idle and get_time calls.

This code is incompatible with the current selector.lua. It could easily be transformed into a selector backend, but the trouble is libev does not provide methods for creating sockets/opening files (right, no?), so it would have to be paired with luasocket/nixio for that. Easy, but messy.

(The code still has some edge cases to handle, like only ev_task running with no events, etc.)
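For readers without the branch at hand, the ev_task shape described above might be sketched like this. This is a guess at the structure, not the actual demo-ev.lua code; it assumes the brimworks lua-ev binding, and that sched.yield() returns the idle time the scheduler grants the task.

```lua
-- Guessed sketch of the ev_task idea: a Lumen task owns the ev
-- loop and blocks in it for exactly the idle time it is granted.
local ev    = require 'ev'
local sched = require 'sched'
local loop  = ev.Loop.default

sched.run(function()
  while true do
    local t = sched.yield()  -- how long we may block
    -- a one-shot timer makes loop:loop() return after t seconds
    local wake = ev.Timer.new(function(l) l:unloop() end, t)
    wake:start(loop)
    loop:loop()              -- ev watchers (and the wake timer) fire here
    wake:stop(loop)
  end
end)
```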

Interesting approach . . . and similar to one that I initially considered. As you noted, it doesn't (appear to) handle the case where you have "foreign" libev watchers that should be running in the background. A solution I utilized had libev running all the time with a scheduling "tick" of some pre-defined granularity. If the granularity is too small, CPU utilization increases while idling, since it appears as a busy wait. A tick increment that is too long means you can't switch between "ready" tasks very quickly (e.g. no faster than the tick period).

I've attached both my modified sched.lua and selector-ev.lua to an e-mail I've sent you (apparently GitHub doesn't allow attachments in these comments). The selector uses luasocket to read/write from sockets but uses libev to detect activity on the file descriptors. The problem becomes notifying the task when there is activity on the file descriptor because it's detected (and handled) by the libev io callback. Right now it sets a global variable that the task periodically tests to see if there has been a change. Ideally, I'd like to send it a signal or message and have it scheduled immediately. It appears that this does not work from the libev callback. Perhaps you have some ideas why?

Having a tick is very suboptimal. It turns the system into a poll-based one, undoing all the effort the system makes to be event-based... As you say, it will always have a CPU overhead and/or latency higher than strictly necessary.

You have actually two options:

  • The ev loop is driven from outside Lumen. This can be achieved either through polling (I don't like it), or through a deep rewrite, where the Lumen scheduler itself would be driven from libev.
  • The ev loop is handled from inside a Lumen task. This means an unchanged scheduler, behaving as a selector backend. This is probably the method that sticks closest to Lumen's design.

Having the ev loop inside a Lumen task does not depend on a tick, because the scheduler provides the amount of free time the task has to run (this is the t=sched.yield() line in the selector). Thus the task can block for as much time as necessary in a single step. To integrate it cleanly, one would need to write a selector-ev.lua backend with a task and step() as outlined in demo-ev.lua, plus replace the select/poll call with ev callbacks. The ev callbacks should be put in place in the new_udp / new_fd / etc. calls.
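A possible shape for such a selector-ev.lua entry point, pairing luasocket (which creates the socket) with an ev.IO watcher (which detects readiness). This is a sketch under assumptions: handle_incoming mirrors the earlier pseudocode and is a placeholder, and the lua-ev/luasocket calls are from memory of their docs.

```lua
local socket = require 'socket'  -- luasocket creates the sockets
local ev     = require 'ev'      -- libev watches their descriptors
local loop   = ev.Loop.default
local M = {}

M.new_udp = function(address, port, locaddr, locport, pattern, handler)
  local skt = socket.udp()
  skt:setsockname(locaddr or '*', locport or 0)
  if address then skt:setpeername(address, port) end
  skt:settimeout(0)  -- never block inside an ev callback
  local watcher = ev.IO.new(function()
    local data = skt:receive()
    handle_incoming(skt, data, pattern, handler)  -- placeholder logic
  end, skt:getfd(), ev.READ)
  watcher:start(loop)
  return skt
end
```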

Just to add some detail in the "driven by libev" case. You can easily trigger events in lumen from outside, using a modified sched.signal. (see the libev branch)

This call is explicitly provided the task that is supposed to do the emitting. Using this call would correctly work waking up other tasks, etc. This can be seen working in tests/demo-ev2.lua.
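For illustration only (signal_from is a guessed name; the actual call in the libev branch may differ), usage from an ev callback could look like:

```lua
-- Hypothetical: an ev I/O callback emitting a Lumen signal on
-- behalf of an explicitly named task, instead of the running one.
-- sched, emitter_task and fd are assumed from surrounding context.
local ev = require 'ev'
local watcher = ev.IO.new(function()
  sched.signal_from(emitter_task, 'skt_data_event', fd)
end, fd, ev.READ)
watcher:start(ev.Loop.default)
```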

Trouble is, this will not work correctly if a task does a sleep() or wait() with a timeout. Now it is the ev loop's responsibility to wake the Lumen scheduler with a step() to process timed-out tasks. This can only be done by keeping track of impending timeouts. This would be the rewrite needed. It doesn't look that hard: just intercept the waketime computation in wait(), and store it in an ordered list so it can be used to program a libev timer.
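That timeout bookkeeping could be sketched like this (illustrative names only; register_waketime is a hypothetical hook to be called from wait(), and the lua-ev calls are from memory):

```lua
-- Sketch: keep waketimes sorted and keep a single ev timer armed
-- for the earliest deadline, so step() runs when tasks time out.
local ev    = require 'ev'
local sched = require 'sched'
local loop  = ev.Loop.default

local timeouts = {}  -- waketimes, sorted ascending
local timeout_timer  -- one ev.Timer for the earliest deadline

local function register_waketime(waketime)
  table.insert(timeouts, waketime)
  table.sort(timeouts)
  if timeout_timer then timeout_timer:stop(loop) end
  local delay = timeouts[1] - sched.get_time()
  timeout_timer = ev.Timer.new(function()
    table.remove(timeouts, 1)
    sched.step()  -- process timed-out tasks
  end, math.max(delay, 0.0001))
  timeout_timer:start(loop)
end
```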

When I get a bit of free time I'll try it.

I'd be interested if you make any progress along these lines. I think libev integration could open up a lot of interesting capabilities, since it supports different event types (in addition to socket descriptors). Thanks for your efforts trying to support this integration.

I'd very respectfully like to put a note of caution on this. There are other libraries that support libev, and all of them have their place. However, libev is a big dependency, and it conflicts and overlaps with the simplicity of Lumen, IMHO.

Concurrency is hard hard hard hard. Chances are, if you're doing it, you're better off with a message-passing library or a straight socket library + some kind of message passing. Then you can use Lumen as the loop / concurrency engine. To my way of thinking, this is cleaner, lighter weight, and with much less rope to hang oneself with.

Socket descriptors are actually pretty easy to mash up. It takes just about nothing to wedge them into luasocket (it's built so that you can add other BSD-like sockets to it). I did so with nanomsg and I'm an idiot. :)

If anything, I'd support looking at llthreads [https://github.com/Neopallium/lua-llthreads] plus nanomsg [https://github.com/nanomsg/nanomsg], to satisfy what I imagine to be the goal of adding something like libev. nanomsg sports an MIT license and compiles without issue on Windows, Linux, Mac, etc. libev's license is different and requires M4 to compile (good luck with MS/Visual Studio).

In short, it doesn't seem like a project that has the same spirit as lumen seems to have. I believe that it would inevitably pull the project in a much different direction.

(ps: I wrote a functional nanomsg binding, but there are others out there, too.)

FWIW. libev is awesome for what it does, of course.