toverainc / willow

Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative

Home Page:https://heywillow.io/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Screen times out even with a request in flight

skorokithakis opened this issue · comments

When the REST endpoint takes too long to respond, Willow times out the screen at the specified interval while the request is still active. The correct behavior would be for the interval to apply after the end of any request (so the screen would time out 2-3 seconds after the request has finished).

Presumably Willow keeps track of its state and knows when it's idle, right? What's tricky about this, as mentioned in #347?

Presumably Willow keeps track of its state and knows when it's idle, right? What's tricky about this, as mentioned in #347?

Willow doesn't really have a state, other than two booleans (recording and restarting), but even if it did, it would still be tricky. If WAS command endpoint mode is enabled, the request is sent to WAS via WebSocket. At this time, the request is done. We cannot guarantee there will ever be a response (MQTT endpoint doesn't support it right now, there might be a connection problem, etc). If we were to keep the display on until a response arrives, we might leave the display on "forever".

What I propose is to reset the display timer when Willow receives a command endpoint response from WAS. The screen could still timeout with a request in flight, depending on the configured display timeout and the time it takes to process the response, but it will reactivate when the response comes in. This would already be an improvement, yet simple to implement.

@kristiankielhofner thoughts?

Ah, I see, it's a fully asynchronous design. I think that's going to lead to issues down the line (as we can already see with this bug), because there's a mismatch between what the user expects (a confirmation to their request) and the behavior of the system (no guaranteed confirmation).

In my opinion, there should be two flows with Willow in general, a synchronous one (request-response) and an asynchronous one (notifications). I don't think there's a case where a user would be satisfied with Willow just ignoring them, the user will always want some sort of confirmation, even if just an "ok". This is also how all assistants work now (they show a LED or otherwise indicate that they're thinking about your request).

I know that this is hard to change now, but my proposal would be a design that is fundamentally a queue of asynchronous requests coming in, and synchronous requests that can override those. If I want to receive a notification asynchronously, I'm probably fine with it sitting in a queue for a few seconds while I'm done talking to my assistant about whatever it is I'm trying to do with them.

Telling the assistant "please add an appointment for tomorrow", having it say "your dinner is ready" and then "I've added your appointment" is surprising. Instead, I'd expect the appointment (the thing I'm actually engaging with the assistant with right now) to be handled first, and the asynchronous notifications to wait until idle time. This would require state, but will be better UX overall.

This is just a recommendation, though.

Hm, what do you mean, @hamishcunningham? Synchronous dialog wouldn't require anything more than keeping a connection open, would it? I understand that it's hard to retrofit reliability over an async connection, but that's an implementation issue, rather than a design issue.

Ah, hm, true, I forgot about that. In that case, your alternative with the reactivation of the screen sounds good to me, thanks!

Generally speaking we heavily emphasize "voice interface". Support for the display is implemented because it's there and it's useful for testing, demos, etc.

That said, I like to say "If the screen matters you're doing it wrong". Display support for a voice interface is kind of a crutch; a good voice interface should provide quality UX via audio alone and that's what we emphasize.

True, but it's not about the screen, it's about feedback that it's still processing your request vs that it's somehow failed without returning anything.

Synchronous dialog wouldn't require anything more than keeping a connection open, would it?

That's the thing, Willow sends a WebSocket message to WAS. To my knowledge the WebSocket protocol doesn't have synchronous messages, so it's not that simple. It would be possible to implement some kind of tracking, but this is definitely not trivial, and currently not a priority for us.

As for MQTT not responding at all, we added support for MQTT because some users expressed interest in sending commands via MQTT. We're not handling responses there because users of the MQTT command endpoint could have any kind of custom thing handling message from Willow, with any kind of custom response, so we can't implement a generic response handler. Responding with OK just when sending the message on MQTT succeeded would be weird, if MQTT endpoint wasn't able to understand the request.

For responses arriving out of order, I would personally prefer an alarm set for 13:00 to arrive as close to 13:00 as possible, and it not being delayed by a potential other request still in flight.

I'll look into adding some code to reactivate the screen / reset the screen timer when a response from the WAS command endpoint arrives.

Good point, but we're also a little thrown off with response times that exceed even our default screen timeout (we measure performance in milliseconds). Whatever works for you, I guess, but going back to the Alexa issue that UX is unacceptable in terms of the project overall.

That said, a potential approach here could be some kind of audio feedback while waiting > X seconds for response but that's another development effort that will be low priority because as I said that's a generally terrible user experience.