emilk / egui

Is your feature request related to a problem? Please describe.
Not all GUI app users are sighted or can see well. This project does not indicate any effort to make apps accessible to the visually impaired.

Describe the solution you'd like
For this project to make a conscious and focused effort to support apps that can be used by the visually impaired.

Describe alternatives you've considered
None.

Additional context
I (and many others) will not use a GUI library in production unless it is designed with accessibility in mind.

From the top of my head, here are some tasks to improving accessibility:

A high-contrast, large text visual theme
Output data necessary for screen readers, braille displays and similar tools
Hook up such data in egui_web (and maybe egui_glium)

The accessibility data needs structure ("there is a window here with a scroll area") and semantics ("this is a heading, this is a checkbox"). What are the good, open standards for such accessibility data?

Is there an accessibility API for the web that doesn't require equi_web to create a fake DOM that mirrors the GUI?

Ideally I'd like egui to just output the GUI structure as e.g. JSON and pipe that through some accessibility API.

I'd appreciate some help with this as I don't have any experience here!

@emilk There's some potentially helpful content in this Bevy issue:

Starting from ~ bevyengine/bevy#254 (comment)
But particularly: bevyengine/bevy#254 (comment) (from developer of https://github.com/lightsoutgames/godot-accessibility).

Thanks for those links @follower !

I'd hoped I might've been able to find some Free/Open Source Software developer specific accessibility tools/guidelines[0] but didn't manage to find much from a brief search.

[0] I wonder if there's one or more companies that might want to support such an effort? *cough* :D

There's a reasonable amount of coverage of WWW/HTML/DOM aspects but less so for Canvas/desktop applications. (There's some older somewhat "enterprisey" content but it's a little difficult to separate the wheat from the seo content marketing...)

A couple of links that might still have some applicability:

Additional thoughts

There definitely seems to be opportunity for egui to have accessibility as a strong part of its story which has the potential to be compelling for both legally mandated & philosophical reasons--and seems consistent with the values of the Rust ecosystem itself.

Of course, if it were easy everyone would be doing it, I guess...

Based on the thread I linked above perhaps the recent Bevy+egui integration might be a good place to explore accessibility-related development further.

As I mentioned in one of the comments in that thread (while quoting the godot-accessibility dev):

Examples like "Working alongside sighted developers who would prefer a visual editor" really highlight to me the importance that the tools we create not exclude people from being part of a development team by being inaccessible.

Hope some of the linked resources might be useful.

Existing information, experience & resources

In light of (1) above I decided to revisit the work done in https://github.com/lightsoutgames/godot-accessibility to see what I could learn; and, (re-)discovered that the included Godot-specific plugin https://github.com/lightsoutgames/godot-tts (see) is actually built on top of a Rust crate https://crates.io/crates/tts (see & see) (all developed by @ndarilek).

`egui` & `tts` proof-of-concept

Having discovered tts I thought I'd see what it would take to get a proof-of-concept running with egui & tts running together.

The initial result is available here: https://gitlab.com/RancidBacon/egui-and-tts

I was intending to document the functionality & prerequisites/build process better (as I ran into a couple issues along the way--still need to write them up) but was losing forward momentum so have just made it public as-is.

The proof-of-concept features:

Two egui buttons.
When the mouse is hovered over a button its tooltip is spoken via TTS.
Keyboard navigation via TAB / Shift-TAB key.
When keyboard navigation is used navigating to a button speaks its label.
Speech when a button is pressed (via Enter) or clicked.
Some attempt to not have a cacophony of sound through use of "cool down" etc. (I assume there's probably some standard set of edge-cases to handle with regard to this.)
Code written by someone still learning both Rust & idiomatic Rust. (That would be me. :) )
An, on reflection, probably not very TTS inclusive pun-based app name of "WhaTTSApp" (i.e. "What TTS App").

I think ideally if similar functionality was to be integrated with egui it would be best to handle it without requiring additional setup code such as that needed in the proof-of-concept.

But the main conclusion is that, yes, it's possible to get egui & tts working together (at least on Linux) without too much trouble but it has highlighted some areas within the API that might benefit from further development.

(Also, it's entirely possible the way I've implemented things is terrible from the perspective of someone who relies on TTS which would also be helpful feedback. It also doesn't deal with other accessibility support which standard OS-level UI toolkits provide but is hopefully at least a useful starting point.)

Feedback

With regard to (2) above, I was aware that @ndarilek had a pre-existing interest in Bevy, so was going to ping him about the fact that there was a egui+bevy integration and that this issue existed, if he was open to providing feedback.

However, in the process of researching egui keyboard control I discovered he was way ahead of me :) and was already active in the project issue tracker at #31 (comment).

[So, hi @ndarilek! Also, I have just noticed that your bio mentions you're in Texas, so I imagine given the current power/weather situation there, GitHub issues are unlikely to be a priority but I'd welcome your feedback on the proof-of-concept at some point if you're open to doing so. I appreciate all the existing effort that you've put into sharing your experiences & motivations and developing tts & related crates. Thanks! (And...wow, the weather/power situation sounds crappy, hope you're doing okay.)]

If anyone else with experience with TTS would like to try out the proof-of-concept & provide feedback that would also be appreciated--particularly with regard to aspects that would benefit from being designed in from the start.

Next steps

Unfortunately at this point in time I can't make any commitment to further development on this (thanks to some combination of ADHD & finances :) ) but at a minimum hopefully this work may serve as a useful building block or step in the right direction.

BTW @nvzqz do you have a specific evaluation criteria/check-list in mind that could serve as a framework/target during design/development?

Also, a couple of relevant recent items discovered re: GTK4 & Accessibility:

https://blog.gtk.org/2020/10/21/accessibility-in-gtk-4/ -- "This is a much more direct approach, and matches what Qt and web browsers already do."
https://developer.gnome.org/gtk4/4.0/Accessibility.html
https://developer.gnome.org/accessibility-devel-guide/stable/gad-ui-guidelines.html.en -- "User Interface Guidelines for Supporting Accessibility" (Although I think this is for an older GTK release.)

Hey, this is exciting work. Thanks for pinging me, and for putting together this PoC. I'd earmarked this for when I needed to investigate UI for my game, but covering the same accessibility ground is a lot less enjoyable than making games. :) So thanks for the start. Re: TTS chattiness. I haven't looked at your code yet, because as you've noted we're having a once-in-a-generation winter storm here and I'm stuck on a 12-year-old laptop with no modern graphics hardware. :) But one quick mitigation is to use the second `interrupt` argument to `speak` and set it to `true` at the start of each utterance. So whenever focus moves to a new widget, speech is interrupted and TTS is less chatty. Also do that for typing characters into text fields. Screen readers have additional hooks, but those are hard to make generic. Maybe you've done that already--opening any Rust codebase in VSCode grinds this poor thing to a halt. I'll look into this and help out more when I'm back on my main workstation in an apartment with reliable power. :) Thanks again!

Amazing work @follower !

It seems to me that #31 is a high priority issue for this. Once that is fixed, then we could add a mode where egui emits events in its Output whenever a widget is selected and clicked. That could then easily be plugged into the tts backend.

Finally getting some time to play with this. I'm prototyping in Bevy because it's a bit clearer to me how I might begin separating out the screen reader functionality from the UI itself. I notice that some members of `interaction`, particularly `kb_focus_id`, seem to have changed from `pub(crate)` to `pub` in 0.10. Is there some way to access `kb_focus_id` from memory? I've never worked with immediate-mode GUIs before. I suspect the way to monitor focus for changes is to have a function/system check for changes in my game loop, and act accordingly? If so, is there some way I can access the focus ID? I thought about PRing a getter on `Memory`, but wanted to check whether I'm missing something else first. Also wanted to make sure such a PR would be welcome. Also, is there some way to get a widget based on its ID? Once I'm successfully tracking changes, I'm going to want to figure out which widget is associated with them and act accordingly. The prototype does this a bit roughly, but now I'm wondering what it'd take to make this pattern more broadly applicable. Thanks.

@ndarilek There is ctx.memory().has_kb_focus(widget_id), and I just added gained_kb_focus as well.

So one design would be that widgets emit an event to egui::Output when they gain keyboard focus so that the integration can e.g. read their contents with TTS. For instance, egui/src/widgets/text_edit.rs would need something like:

ui.memory().interested_in_kb_focus(id);
if ui.memory().gained_id_focus(id) {
    ui.output().events.push(OutputEvent::WidgetGainedFocus(WidgetType::TextEdit, text.clone()));
}

Also, is there some way to get a widget based on its ID?

Widgets are pieces of of code that is run each frame. egui stores nothing about a widget, so there is nothing to get. See https://docs.rs/egui/0.10.0/egui/#understanding-immediate-mode

#31 has been closed - you can now move keyboard focus to any clickable widget with tab/shift tab.

Next up: I'm gonna add some outgoing events from egui every time a new widget is given focus. That should then be fairly easy to hook up to to TTS system.

egui now outputs events when widgets gain focus: https://github.com/emilk/egui/blob/master/egui/src/data/output.rs#L56

This should be enough to start experimenting with a screen reader, and should provide a framework for building more features around. There's still a lot more to do!

Background: I built one of the first open source Android screen readers, so I have some sense for how accessibility APIs should work. I recognize the challenges immediate-mode GUIs pose. I think, though, that there does need to be some sort of central registry of widgets independent from the GUI code, queryable by ID. I need a single entrypoint that'd let me, say, capture every exposed event on every created widget and perform some sort of accessibility handling, possibly querying for other widgets via ID as well. Without that, we'll either need lots of code duplication, or special handling of every widget at time of creation. The harder we make accessibility, the less likely it is to just be done. I wonder if some sort of HashMap<ID, _> would be sufficient? Widgets are automatically added at time of creation. Create a `Drop` impl that cleans up the registry whenever the widget goes out of scope. Seems doable, with the possible caveat that whatever gets stashed in the registry needs to be a weak reference that knows to clean itself up whenever it's the only reference to a given widget. The events would then ship along a widget ID, because whatever presents them is going to need to access very specific properties on each widget type. Not sure how doable that is under the current system. You can see an example of what I mean at https://github.com/lightsoutgames/godot-accessibility/blob/master/Accessible.gd#L588. Maybe there's a cleaner way to do that, but essentially I spawn a Godot `Node` under every control, and make it do a whole bunch of introspection on its parent to present it sensibly. After a few tree scrapes to guess labels for things like text fields, it then performs a giant conditional check and runs a whole bunch of subordinate functions for widget-specific logic. And those functions each need deep knowledge of whatever widget they're operating on. Thanks for being willing to experiment with this! :)

@ndarilek thanks for helping out!

The problem with storing references to widgets is that in immediate mode, widgets is not data, but code that is run once per frame. See for instance the toggle_switch.rs example widget or the Understanding immediate mode section of the docs.

As for events: I've just added so that egui outputs events when a widget is selected (given keyboard focus). This can then be hooked up to a screen-reader. The selected widget is controlled with space/return (buttons etc), arrow keys (sliders) or keyboard (text edit). You advance to the next widget with tab (or shift-tab to go backwards).

You can checkout latest master and cargo run --release and use TAB to select widgets. You should be able to see this in the Backend panel:

I'm gonna try hooking this up to a simple TTS system in egui_glium to close the loop so we can start playing with it for real.

I recognize the challenges immediate-mode GUIs pose. I think, though, that there does need to be some sort of central registry of widgets independent from the GUI code, queryable by ID.

I almost posted my 2 cents here yesterday on exactly this topic. I believe that immediate mode GUIs can build an in-memory representation of the UI as a DAG, just as easily as a retained mode GUI can. This DAG can be queryable, individual elements can provide additional accessibility context, and user code can go the extra mile to provide application-specific context as needed.

egui already retains some state for UI elements between each frame and identifies those elements by a unique ID:

egui/egui_demo_lib/src/apps/demo/tests.rs

Lines 21 to 33 in 44cd304

    
                   ui.label("\ 
        
                       Widgets that store state require unique and persisting identifiers so we can track their state between frames.\n\ 
        
                       For instance, collapsable headers needs to store wether or not they are open. \ 
        
                       Their Id:s are derived from their names. \ 
        
                       If you fail to give them unique names then clicking one will open both. \ 
        
                       To help you debug this, an error message is printed on screen:"); 
        
                   ui.collapsing("Collapsing header", |ui| { 
        
                       ui.label("Contents of first foldable ui"); 
        
                   }); 
        
                   ui.collapsing("Collapsing header", |ui| { 
        
                       ui.label("Contents of second foldable ui"); 
        
                   });

This is necessary for several reasons, but the same trick can be used on each frame to create the structures necessary for interacting with accessibility APIs.

This is a design I have been kicking around in my head for quite a long time. I like the flexibility and no-nonsense approach to immediate mode GUIs, but I am also aware of (some of) the needs of accessibility software in relation to GUIs. Something of a hybrid immediate/retained approach is the best-of-both-worlds; The API remains immediate, and some state is retained for ease of use and doubles as a source of truth for screen readers.

If you checkout main and edit build_demo_web.sh to enable the screen_reader feature and run the script, you should now hear a voice describing the focused widget as you move the selection with tab.

This is still very early, and more events needs to be hooked up (e.g. you probably want to hear a voice when editing a label and not just when first focusing it).

There is one more thing to consider: how should an egui app know whether or not to read things out loud? I can't find a javascript function for "does the user want a screen reader". The same problem exists on native, but I'm sure with some digging one can find platform-specific ways of doing so.

@parasyte Having egui keep a shadow-DOM behind the scenes is a big change though, and requires a lot of work and planning. Before doing that I'd like to be absolutely sure we need it. In my proof-of-concept screen reader I get away with describing just the newly focused widget so there is no need to store and manage a bunch of state behind the scenes.

Nifty! A bit brainfried right now but will check this out tomorrow. As for how to know whether a screen reader is wanted, I'd say that's game/platform-specific. My games will almost certainly enable it by default. Some games may query platform-specific APIs to determine if a screen reader is running, or announce/document a system for pressing a certain key combination or making a certain touchscreen gesture to activate it. Not sure how you're planning on this working, but if it's an EGui feature, then I'd say make it something someone can activate/toggle in your API wherever it makes sense--the UI context, maybe?

@ndarilek Absolutely there will be a programmatic option, but I was thinking more in terms of eframe (egui_web/egui_glium). For instance: if I visit https://emilk.github.io/egui/index.html and and move focus with tab, I don't want to head a voice reading out what I am selecting, but someone else may want that. How will the code running know the preference of the user? Sure the web app could add a big button "turn on screen reader", but ideally the browser would instead communicate that to the egui app somehow.

Browser-based screen reader detection is not, and should not, be available as it'd be a huge fingerprinting/privacy liability. In this situation, a button as you've described is your best option.

Ah, fingerprinting. This is why we can't have nice things :)

Sorry, I just now checked this out. Does this have to be implemented on the platform layer? Or is there somewhere in an existing UI where I can hook into whatever events the screen reader consumes and handle them myself? I'd like to build out the event support, but I think it'd be more helpful if I could do so in, say, an existing Bevy app vs. having to recompile my app and egui for every change. I see that the support is implemented on the platform layer--I just don't know if the library exposes enough for me to do a temporary implementation on a higher level, then push it down to the platform when I'm more satisfied. Maybe this is also antithetical to immediate-mode GUIs--there just has to be some way to make these more accessible, even if it means bending the conventions a bit. Thanks.

@ndarilek The events generated by egui can be handled in whatever way you want. They are avilable in output.events: https://docs.rs/egui/0.11.0/egui/struct.Output.html#structfield.events

Right now the only event is when a widget gains focus. We should probably add an event for when a value changes, and probably other things too. This should be fairly easy to add once we know what to add. Someone with more experience than me can help with the requirements here maybe?

Gotcha, thanks, now that I have an entrypoint, I'll definitely look into it. Going to need UI soon.

OK, finally digging into this and having some issues. If I'm not doing something obviously wrong, I'll put together a minimal repro. Here's relevant metadata from Cargo.toml:

bevy_egui = "0.4"

[patch.crates-io]
egui = { path = "crates/egui/egui" }

IOW, using the bevy_egui plugin to integrate with Bevy, but patching the egui dependency to a local checkout of my fork at https://github.com/ndarilek/egui. When I use VS Code's Go to definition on egui symbols, it takes me to the files in my fork, so that seems to work.

Next I try creating a start menu for my game:

fn start_menu(context: Res<EguiContext>) {
    egui::Window::new("Rampage").show(context.ctx(), |ui| {
        let start = ui.button("Start");
        start.request_focus();
        if start.clicked() {
            println!("Start clicked");
        }
        if ui.button("Quit").clicked() {
            println!("Quit clicked");
        }
    });
}

Launching my game and pressing Enter immediately prints what I'd expect. Pressing tab, shift-tab, etc. and enter doesn't print "Quit clicked" as I'd expect. Am I doing something obviously wrong?

Next I tried creating a screen reader system to consume the output events so I can start building them out with more metadata.

fn screen_reader(context: Res<EguiContext>, mut last_seen_event: Local<usize>) {
    let events = &context.ctx().output().events;
    if events.len() > 0 {
        println!("{:?}", events);
    }
    let events = &events[0..events.len()];
    for event in events {
        println!("{:?}", event);
    }
    *last_seen_event = events.len();
}

This never prints anything--not an event, not a list of previously-received events, etc. So either my initial focus request isn't being recorded as an event, or I'm doing something wrong.

Help appreciated. Essentially my plan is to do what I always do when I parachute into these "no accessibility" situations--build out my UI, adding accessibility as I need it, and push PRs upstream. But I've never worked with immediate mode GUIs before, so maybe I shouldn't be treating contexts/memories like I am here.

Thanks.

FWIW, also plunked down a:

context.ctx().memory().options.screen_reader = true;

at the beginning of start_menu. Looks like that might only make labels focusable for now? In any case, I still can't tab between my buttons.

Thanks.

Sorry to spam this issue so much. No rush, I'm just keeping it updated with a running commentary on what I've done.

I did find the issue where I create an extra quit button and don't check it, but unfortunately that's not my problem. :(

I created a minimal reproduction at https://github.com/ndarilek/bevy_egui_a11y. I did eventually get focus events out of it somehow, but I don't know how, and they didn't seem consistent nor did they update what button I seemed to click. I'm not sure if this is an issue with the Bevy plugin or something else, but I'm out of ideas, and since I can't see the GUI window to determine if anything is updating, this is probably where I have to leave it.

Thanks.

Ok, can't put this down and made some progress.

Realized my issue was that I was locking focus to my start button by setting it every frame. I added a getter to retrieve the current focus ID, only requested focus if none was set, and updated my example to use my fork. Now I can successfully click Start and Quit.

Now, pressing tab on the last button doesn't wrap around to the start. I'm having a hard time determining where Egui sets what ID should be focused next, particularly given that widget details aren't saved anywhere. I see:

            if self.pressed_tab && !self.is_focus_locked {
                self.id = None;
                self.give_to_next = true;
                self.pressed_tab = false;

I then see the various interested_in_focus calls, but it isn't immediately obvious to me where give_to_next might possibly determine that it is on the last ID and cycle around to the first. Shift-tab seems to wrap correctly, but tab doesn't.

I also just got events working and speaking. I feel like request_focus should send a focus event, but I'm not immediately clear on how/where the best place to do that might be.

Thanks.

I fixed tab sticking on the last widget--turns out my initial focus-setting code was at fault again. Thinking in immediate mode is challenging. :) Now, instead of tracking focus, I track whether my UI function ran via a Local resource, being careful to clear it on state transition. bevy_egui_a11y updated accordingly.

My current uncertainty is that request_focus still doesn't generate a focus event, so even though I focus on my start button, the screen reader system doesn't get an event indicating so. There doesn't seem to be a way for arbitrary widgets to push an event, E.g.:

        let start = ui.button("Start");
...
        if !*ran {
            start.request_focus();
            *ran = true;
        }

Seems like the only way to generate this event would be for users to call widget_info(...) with a widget-specific closure to generate the WidgetInfo. Am I missing something? Is there any way this API might be cleaned up such that request_focus() can do this automatically? My head is spinning a bit with all these data structures. :)

You found a bug @ndarilek !

The call to widget_info checks if the widget has gained focus (by e.g. pressing Tab), but this misses the case where someone calls request_focus after the call to widget_info. The fix should be that the call to widget_info detect the focus change next frame ("did I gain focus last frame after the call to widget_info?") and emit the event with one frame of delay.

Yes, immediate mode is tricky to get right :)

Got it, glad it sounds like an easy fix. I went ahead and abandoned setting an initial focus for now because things aren't interacting well with Bevy's state system.

So far I've built out start/pause menus, and have started on a settings menu. Hoping to add and emit events for checkboxes, sliders, and textedits.

One thing I'm not clear on--is pressing enter on a checkbox supposed to change its value? Checkboxes are buttons, I've found the code that clicks buttons on enter/space, and I've confirmed that my checkbox is getting clicked. But when I tab back to the value, it isn't reported as changed.

Obviously the change won't be reported immediately since there isn't an event for it, but since the value is mutable, my expectation is that clicking/pressing enter on the checkbox should toggle it, and that it should be reported as changed when focus lands back on it.

Thanks.

Making decent progress on my fork, and will submit an initia PR soon. Notably, TextEdit now reports value changes back to the caller, so my screen reader is able to read text added/removed from fields.

Now I'm trying to implement notification of selection change so I can read characters that are arrowed past, and need advice to get the implementation right. At the moment I'm checking response.changed for value changes, but this doesn't seem to get set when selection moves around in a TextEdit. So, questions:

Is there some field somewhere I should be checking to determine if the cursor has moved in a TextEdit? Presumably a field indicating that something needs to be redrawn would give me what I need, though that likely doesn't exist here because of immediate mode. Still though, thought I'd ask. :)
What horrible things would happen if I broadened changed on TextEdit to indicate that the cursor/selection has moved in the field? I know I'd have to branch at the callsite to distinguish between OutputEvent::TextSelectionChanged and OutputEvent::TextValueChanged, but that's a big enough break with the idea of something changing that I figured I should sanity-check it before going down that path. :)
Any other solution I haven't considered for passing cursor/selection changes back to the response?

Response::changed is for reporting that user data has changed, i.e. the String of the TextEdit, or the f32 of a Slider, or that the bool of a checkbox has been toggled (this frame). This is so that users can conveniently check if their data have been changed by egui without having to store the before-state and compare. This flag should not be set when moving a cursor, scrolling, etc.

So if you in fn widget_info add a if self.changed() { /* emit a WidgetEvent::Changed event */ } it will emit it in response to the user dragging a slider, toggling a checkbox or entering/deleting some text.

If you need to report cursor change, do you also need to report how it changed, or just that it has changed? Can you perhaps explain the motivation for this a bit from a users perspective? If a user presses the left arrow key, should the screen reader read "Text cursor moved" or "Text cursor moved left", or what?

Sorry, I did a bad job of communicating.

Say I type "Nilan" into a text field, realize my mistake, and want to fix it.

I type "N", "i", "l", "a", "n". The screen reader reads each character as I type.
I press left-arrow three times. The screen reader speaks "n", "a", "l" as the cursor moves over the characters.
I press backspace. The screen reader speaks "i" as I delete the mistake.

Theoretically I'm getting enough back for use cases 1 and 3--I'm currently sending text and previous text with WidgetInfo and need to do some client-side diffing to be absolutely certain. Now I'm trying to address use case 2. The issue is that there are no existing mechanisms for a Response to report back that something has changed, but not anything high level enough for a user-facing application to care.

If you look at https://github.com/ndarilek/egui/blob/master/egui/src/response.rs#L430 you'll see I've added a few additional handlers--namely for clicks and widget changes. And at https://github.com/ndarilek/egui/blob/master/egui/src/widgets/text_edit.rs#L668 you can see where I'm sending back a WidgetInfo for text changes. These are surfaced by the .changed handler in the previous link.

The problem here is that .changed isn't true if the only change is that the cursor moves in the text field, as it probably shouldn't be. But I need some condition to check for in the response that indicates a widget wants to push a WidgetInfo to the Output, distinct from "this widget has changed and likely needs application-level processing." Then, whenever the selection points move in a TextEdit, they'll set this to true. The response handler can then do something like:

...
        } else if self.changed {
            Some(OutputEvent::ValueChanged(make_info()))
        } else if self.has_widget_info {
            let info = make_info();
            if info.selection_start.is_some() && info.selection_end.is_some() {
                Some(OutputEvent::SelectionChanged(info))
             } else if // Check for other `WidgetInfo` fields as appropriate and dispatch other events {
                ...
        }

But before doing this, I wanted to check if a concept like this already existed--if there's some "dirty" flag or similar it'd likely be enough. I suspect it doesn't, but I'm only recently getting my sea legs with this codebase. :)

Thanks, hope that clarifies things a bit. I'll submit an initial PR once I have text edits and a few other things working on my end. Thanks also for the initial work on this. I'm fairly close to having an egui UI that's accessible enough to support logging into my game's server. As annoying as immediate mode is for accessibility, I love that I can throw up a GUI without having to slap together markup, or draw a bunch of assets and render over them with code.

BTW, #412 is where I'm working on this. Feedback welcome.

While I'm not a user of assistive technology, I want to bring up a point: I think "implementing TTS" is the wrong way of looking at the problem.

Someone using a screen reader already is using a screen reader, whether it be Microsoft Narrator, macOS VoiceOver, or whatever. Support for screen readers thus isn't implementing TTS readouts for your own window, but letting existing screen readers effectively read your window.

I don't know what that entails, to be completely honest. I just want to make sure that we're not accidentally going the wrong direction (which I kinda got vibes from "how do we tell if the browser wants TTS").

Agreed, but that problem is a lot more complicated than a single, or even a small handful, of developers can manage, and is further complicated by immediate mode.

AccessKit aims to solve it in a cross-platform way, and when that's available, an egui integration might be practical. But until that happens, implementing "real" accessibility is likely beyond us.

AccessKit, which @ndarilek mentioned in the previous comment, is now far enough along that we can start working on integration into egui. I have a very basic integration of AccessKit into egui on this branch.

Here's a quick summary of how AccessKit works, and how it fits into egui. AccessKit takes a push-based approach to accessibility. That is, for each frame where something in the UI has changed, the application creates a TreeUpdate, which can be either a complete tree snapshot or an incremental update, and pushes it to an AccessKit platform adapter. That platform adapter can then handle requests from assistive technologies (e.g. screen readers) without having to call back into the application, except when the user requests an action such as changing the keyboard focus or doing the equivalent of a mouse click. So in principle, this model is a good fit for an immediate-mode GUI. (In practice, the implementation could probably be made more efficient, e.g. by eliminating repeated heap allocations.) My integration creates a complete AccessKit tree for every egui frame, and AccessKit does comparisons to figure out what actually changed and fire the appropriate events.

AccessKit itself is still far from complete, and so is the integration. Most notably, I still need to work on support for text edit controls, as well as reading the value of a slider, and lots of smaller stuff. Also, AccessKit is only implemented for Windows so far. Still, at this point, you can run the eframe hello_world example on Windows, start up any Windows screen reader (Narrator, NVDA, JAWS...), tab around and get feedback, or navigate with the screen reader's commands. AccessKit and egui support one screen-reader-initiated action so far: setting the keyboard focus. It won't be hard to implement more.

I've modified egui-winit to use a proof-of-concept integration of AccessKit into winit which I've posted in my fork of that project. That direct integration into winit isn't likely to be accepted upstream, so I'll ultimately have to come up with another solution for that part.

It's also worth discussing how this work should relate to the existing "widget info" support. My AccessKit integration into egui currently uses the widget info. But another option would be to have all of the widgets manipulate AccessKit nodes directly, implement a generic, egui-independent screen reader library that uses the AccessKit tree, and ultimately drop widget info from the output struct. We're going to need direct text-to-speech output for a while yet, until AccessKit is implemented on all of the other platforms. (And even then, self-voicing would be useful for devices with no built-in screen reader, like game consoles.) But perhaps egui itself shouldn't have two ways of doing accessibility.

Wow @mwcampbell that's sounds great!

A quick update: AccessKit is still Windows-only, and there are still serious limitations in the Windows implementation, most notably lack of support for text editing. But one major blocker has just been resolved: the newly published accesskit_winit crate makes it straightforward to use AccessKit with winit, without requiring changes to winit itself.

I'm aware that my fork of egui with prototype AccessKit integration is way out of date. My next task is to update it and use the new winit adapter rather than my forked winit.

@emilk On #1844, I mentioned the possibility of replacing egui's current WidgetInfo with AccessKit, and you seemed to be in favor of it. Do you want me to replace WidgetInfo with AccessKit in one big leap, including the implementation of a new TTS output module based on AccessKit (for platforms that don't yet have a native AccessKit implementation), or would you prefer that I implement AccessKit support alongside the current WidgetInfo and work toward eventual replacement?

The accesskit branch in my egui fork has a rough but basically working AccessKit integration. It's based on the egui master branch as of earlier today. The other key difference between this branch and the work I did last December (which is now in the accesskit-old branch) is that I'm no longer using a fork of winit. (In fact, all dependencies are published crates.)

Currently, Response::widget_info fills in fields on the AccessKit node as well. I can, of course, change the widgets to fill in AccessKit node fields directly, in addition to or instead of providing WidgetInfo. I'm just waiting on @emilk 's answer to my previous comment before I decide how to approach that.

The big missing feature, still, is text editing. I'm starting on that in AccessKit next week. Aside from that, egui still needs to expose some more things that are already supported by AccessKit, such as the value and range of a slider.

And, for now, AccessKit is still only natively implemented on Windows. That's changing later this year. In the meantime, a platform-independent embedded screen reader, which is what accessible egui-based applications currently have to use, can be written based on AccessKit, using the accesskit_consumer crate to process the tree updates, traverse the tree, and find out what changed from one frame to the next.

@mwcampbell thank you so much for your work on AccessKit, and on working on the egui integration!

I like your current approach of having WidgetInfo fill in AccessKit data; it allows for a gradual migration to AccessKit, and is potentially a smaller PR (which is always good!). In particular, I like that existing widgets don't need to be re-written (always nice to avoid breaking changes for egui users).

The egui screen reader is mostly a proof-of-concept, and I don't believe it has many users right now, so breaking that end of things is less worrying to me. Still, if given a choice I would keep PlatformOutput::events etc until a replacement is merged (i.e. AcessKit + accesskit_consumer is in place and works with screenreaders on various platforms, including web).

I took a look at your egui fork, and it looks great so far. But, I would prefer having #[cfg(feature = accesskit)] around all the AccessKit code.

I just rebased my accesskit branch on the head of the egui repo (as of yesterday).

@emilk Do you want AccessKit integration to be an optional feature at all layers, including eframe, or just in the core egui crate? I'm also wondering if the accesskit_winit adapter should be integrated in egui-winit, as it is now in my branch, or only in eframe. The latter would reduce the total PR size, but would mean that anyone using egui-winit but not eframe would have to do more work to get AccessKit support.

As a developer user of egui,

eframe is framework enough that it makes sense IMHO to always have AccessKit enabled. At most, it could be a default feature; people should be pushed towards accessibility by default.

For egui and egui-wgpu, though, it's a major use case to use egui for development debug UI on top of your own rendering, and in that case it's likely the case that a graphics focused application wouldn't want the default integration as it'd likely need a fully custom solution to be accessible via screen reader.

As such, my vote goes to having an opt-in feature for AccessKit in egui/egui-wgpu with a separate entry point to turn on the integration, but that it should be as simple as using the ak enabled initialization.

I don't know how practical that is, but it's what I'd personally like using.

AccessKit is now an optional dependency in the core egui crate. I won't do anything more with egui_winit, egui_glium, egui_glow, and eframe until I get more input from @emilk.

I'd also like input on what milestones I should reach before I submit my AccessKit integration as a PR. AccessKit still only has a Windows adapter, though adapters for other platforms are now being developed (by others), and the accesskit_winit crate uses a no-op implementation on non-Windows platforms. Meanwhile, the core AccessKit crate hasn't yet reached 1.0, and I'm not sure when it will. The biggest missing functionality at the moment is text editing support. I'm hoping the API for that will be close to its final form by sometime next week.

On the one hand, if I wait until everything is done and perfect before I submit a PR, then I'll need to keep maintaining my own branch, and rebasing it on new versions of egui, for a while. On the other hand, I don't want to impose on the egui team the burden of keeping up with changes to AccessKit too soon.

You can open a draft PR right away @mwcampbell - it will make it easier for me to review your work!

OK. I'm currently in the middle of working on text editing, both in AccessKit and in my egui branch. Once I finish that and get sliders working, I'll open a draft PR.

If anyone wants to play with my work-in-progress text editing support with a Windows screen reader, here's the egui branch. Note that the AccessKit side of this is still a work in progress and isn't in the published crates yet; here's the AccessKit branch.

At this point, the major missing feature in text editing support is that the bounding rectangles of text ranges aren't yet exposed. This is why, if you use Narrator, the highlight cursor isn't where it should be when you're in a text edit widget. I suspect it's also why the JAWS cursor isn't working. I plan to implement this today. Also, Narrator isn't providing the expected feedback when deleting text; I'm guessing that's because AccessKit is returning an inappropriate error when Narrator tries to work with the old text range. I'll also look at this today.

Once I resolve those two issues, I'll open a PR on the AccessKit repo. Once that's merged, I can merge the egui work back into my main AccessKit branch.

I just released text editing support in AccessKit (still Windows only), and the matching support in egui is now on my main accesskit branch. I'm going to rebase that branch to the head of the upstream master branch, then I think I'm ready to open a draft PR.

FWIW, @DataTriny is working on a Linux platform adapter for AccessKit, implementing AT-SPI in pure Rust. He thinks that might be usable by the end of the year. Once that feature is merged in AccessKit, and AccessKit support is merged in egui, I think we will no longer need speech-dispatcher on Linux. That dependency seems to be a recurring source of frustration for egui developers.

If anyone wants to try out the work-in-progress AccessKit macOS adapter, check out this temporary egui branch. The major missing features that you're likely to encounter in simple example apps are:

Hit-testing (e.g. for moving the VoiceOver cursor to the mouse pointer)
Adjusting sliders and steppers with VoiceOver commands, as opposed to normal keyboard input to the application
Text editing

I plan to address the first two early next week. Text editing will likely take several days. I hope to have the macOS adapter on par with the Windows one by early December.

I'm posting a status update on this here because I know that macOS is popular among developers, and I figure that when AccessKit's macOS adapter is reasonably complete, interest in AccessKit in general will increase.

Quick update for anyone watching this issue but not #2294 (AccessKit integration PR): I've marked that PR as ready for review. That means I've frozen the initial implementation, except to address review feedback. On the macOS side, that adapter is now published and is used by the published AccessKit winit adapter. So my egui AccessKit integration now supports Windows and macOS using only published AccessKit crates. I resolved the hit-testing and slider/stepper issues mentioned in the previous comment. Now the big feature I need to work on for macOS is text editing support.

I just want to say: Thank you @mwcampbell for your work here! Your efforts are greatly appreciated. 😄

Thanks to @emilk for merging AccessKit support. To be clear, this doesn't mean the end of all work on accessibility in egui. AccessKit itself is still incomplete, and not all widgets are fully accessible yet. But I think this is an appropriate time to close this original issue and open new ones as they come up. @emilk Feel free to reopen if you disagree.

I agree with closing this, and I also agree with @CAD97 - thanks for working on this @mwcampbell ❤️

emilk / egui

Accessibility (A11y)

Additional thoughts

More links

Existing information, experience & resources

`egui` & `tts` proof-of-concept

Feedback

Next steps

	ui.label("\
	Widgets that store state require unique and persisting identifiers so we can track their state between frames.\n\
	For instance, collapsable headers needs to store wether or not they are open. \
	Their Id:s are derived from their names. \
	If you fail to give them unique names then clicking one will open both. \
	To help you debug this, an error message is printed on screen:");

	ui.collapsing("Collapsing header", \|ui\| {
	ui.label("Contents of first foldable ui");
	});
	ui.collapsing("Collapsing header", \|ui\| {
	ui.label("Contents of second foldable ui");
	});

Accessibility (A11y)

Additional thoughts

More links

Existing information, experience & resources

egui & tts proof-of-concept

Feedback

Next steps

`egui` & `tts` proof-of-concept