Allow bypassing diff and writing all to screen

Question

itsjunetime opened this issue 2 months ago · comments

Problem

I'm writing a TUI app with ratatui that makes pretty heavy usage of ratatui_image to display images. Because of this, the diff process (specifically, the unicode_width::width function) takes a lot of the CPU time - much more than it would take to just render the images.

Solution

I would like to add a field to Terminal (or perhaps Frame, but I think that would be harder to do while maintaining a simple API) called bypass_diff (or something similar) that will bypass the diff call/process when flushing the terminal to the screen and instead just render everything.

Alternatives

An alternative to this would be simply improving the performance of the diff function. I expect it's possible somehow (I know there are at least some small improvements to be made easily, such as removing the unnecessary access-by-index and double call to current.symbol().width()), but it would take a lot more work than just adding an option for developers to bypass the diffing process.

Additional context

I would be happy to file a PR whenever, as I've already implemented this and verified that it does improve performance in my use case.

Josh McKinney · Answer 1 · Sun May 19 2024 11:50:59 GMT+0800 (China Standard Time)

I'd love to see some benchmarks on the diff function to measure the current performance as one easy-ish part of this.

It's actually possible to skip the entire rendering stack already if you need it:

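use std::io::Write;
terminal.set_cursor(x, y)?;
let writer = terminal.backend_mut();
// Write the raw escape sequences straight to the backend (e.g. CrosstermBackend implements Write).
write!(writer, "{}", your_image_data_as_ansi)?;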

That's not super helpful to the ratatui-image widget approach I know, but I'm also not 100% sure what an ergonomic API for this might look like on terminal. It might be useful to think about an approach that doesn't break if we add other similar functionality. Some possible ideas (feel free to add more):

terminal.draw_direct(...);
terminal.draw_with_options(Options::new().direct(), ...);
terminal.draw(|frame| { frame.skip_diff(true); frame.render_widget(...); });

June · Answer 2 · Mon May 20 2024 06:27:15 GMT+0800 (China Standard Time)

Alright, I was able to run some pseudo-benchmarks and get some numbers on the kind of performance hit I'm looking at. To do this, I basically just added the following code around my main render loop:

let mut end_update = false;
let start_time = std::time::Instant::now();
term.draw(|f| {
	tui.render(f, &main_area, &mut end_update);
	f.bypass_diff = true; // <- I changed this line based on the test
})?;
if end_update {
	execute!(stdout(), EndSynchronizedUpdate)?;
}
let end_time = std::time::Instant::now();
let total_time = end_time.duration_since(start_time);
if redraw_caused_by_crossterm_event {
	render_times.push(total_time.as_millis());
}

And then I just print out the contents and average of render_times when the app ends. I also made sure that, when doing these tests, each crossterm event was a keystroke from the keyboard that caused the images on screen to change, so this isn't just re-rendering the same thing.
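For anyone who wants to reproduce this kind of pseudo-benchmark, the summary step could look roughly like the sketch below. The `RenderStats` type and `summarize` function are made up for illustration and are not part of ratatui. The mean is a truncated integer mean, which is an assumption about how the averages in this thread were rounded.

```rust
use std::time::Duration;

/// Summary of a set of frame render times, in whole milliseconds.
/// (Hypothetical helper type, not part of ratatui.)
#[derive(Debug, PartialEq)]
struct RenderStats {
    count: usize,
    min_ms: u128,
    max_ms: u128,
    /// Integer mean, truncated toward zero.
    mean_ms: u128,
}

/// Summarizes the recorded samples; returns `None` when there are none.
fn summarize(samples: &[Duration]) -> Option<RenderStats> {
    if samples.is_empty() {
        return None;
    }
    let millis: Vec<u128> = samples.iter().map(Duration::as_millis).collect();
    let total: u128 = millis.iter().sum();
    Some(RenderStats {
        count: millis.len(),
        min_ms: *millis.iter().min()?,
        max_ms: *millis.iter().max()?,
        mean_ms: total / millis.len() as u128,
    })
}
```

Reporting the minimum and maximum next to the mean makes outliers (such as a single 1ms frame) easier to spot.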

The results with bypass_diff = true:

render times: [78, 59, 28, 1, 31, 68, 87, 69, 79, 73, 80, 78, 97, 96, 92, 78, 84, 78, 78, 78, 78, 78, 79, 80, 76, 68, 86, 83, 75]
average: 72ms

The results without bypass_diff = true (or rather, with bypass_diff = false):

render times: [109, 164, 147, 155, 149, 156, 141, 152, 124, 160, 132, 165, 118, 142, 161, 161, 164, 115, 132, 92, 154, 162, 150, 127, 123, 127, 127, 122, 114]
average: 139ms

This difference is very noticeable if the user is trying to do something that causes images to change quickly.

I am aware that I can just write the images directly, but I do really like ratatui's widget system and would like to stay within that if possible. It just takes off a lot of the mental overhead associated with TUI apps.

I'm not quite certain what the best API would look like either. The Options one may be best for long-term stability and customization, but I think it would also be possible to just add an extra field to Terminal so one can call terminal.set_skip_diff(true) once. If we added this option to Frame instead of Terminal, then we'd also have to add a way to pass it into the Terminal::flush function, since that's where the diffing is currently called from (and could be bypassed from).

I was also thinking that it might be useful to make this a non-exhaustive enum instead of just a bool, e.g.

#[non_exhaustive]
enum DiffMode {
    Full,
    Skip
}

so that, for example, if faster but less accurate diff methods were built into ratatui (e.g. if you made one that didn't account for multi-width characters, or something like that), developers could opt into the diff mode that made sense for their use case.
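To make the idea concrete, here is a minimal sketch of how a mode enum could drive the choice of cells to write. It is a toy model, not ratatui code: cells are plain `char`s, `cells_to_write` is a made-up function, and both slices are assumed to have the same length.

```rust
/// How the (hypothetical) terminal decides which cells to write.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DiffMode {
    /// Compare against the previous buffer and write only changed cells.
    Full,
    /// Skip the comparison and write every cell.
    Skip,
}

/// Returns the indices of the cells that would be written to the screen.
/// Cells are modelled as plain `char`s to keep the sketch small, and
/// `previous` and `next` are assumed to have the same length.
pub fn cells_to_write(previous: &[char], next: &[char], mode: DiffMode) -> Vec<usize> {
    match mode {
        DiffMode::Full => next
            .iter()
            .zip(previous)
            .enumerate()
            .filter(|(_, (n, p))| n != p)
            .map(|(i, _)| i)
            .collect(),
        // No comparison at all: every cell is emitted.
        DiffMode::Skip => (0..next.len()).collect(),
    }
}
```

Because the enum is `#[non_exhaustive]`, code in other crates has to include a wildcard arm when matching on it, which is what would let new modes be added later without a breaking change.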

June · Answer 3 · Tue May 21 2024 10:35:29 GMT+0800 (China Standard Time)

I've been thinking about this issue a bit more, and I think a better option might be not to allow skipping the diff (sorry, this might be turning into a bit of an XY problem), but instead to allow cells to carry a 'trusted width', i.e. a pre-computed width that they should display as when written to the screen. This would let them avoid calling unicode_width::width when diffing.

This would keep the API changes minimal, and would also improve performance even for people who don't opt into whatever changes we decide on here. It could be a simple field that library authors set (I'm thinking specifically of ratatui-image here), and then everyone who uses that library would get the performance boost.

I've also implemented this, and it gets rendering performance similar to bypassing the diff, so it would at least fix my problem.
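As a rough illustration of the trusted-width idea, the sketch below uses a toy `Cell` with an optional pre-computed width. The field name `trusted_width`, the `with_trusted_width` setter, and the `slow_width` stand-in are all assumptions made for this example and are not ratatui's actual `Cell` API.

```rust
/// Stand-in for an expensive display-width computation such as the one in the
/// `unicode-width` crate. Counting `char`s is NOT a correct display width; it
/// only keeps this sketch dependency-free.
fn slow_width(symbol: &str) -> usize {
    symbol.chars().count()
}

/// Toy cell; `trusted_width` is the hypothetical field discussed above.
struct Cell {
    symbol: String,
    trusted_width: Option<u8>,
}

impl Cell {
    fn new(symbol: &str) -> Self {
        Cell {
            symbol: symbol.to_string(),
            trusted_width: None,
        }
    }

    /// Builder-style setter a widget library could call when it already
    /// knows how many columns the symbol occupies.
    fn with_trusted_width(mut self, width: u8) -> Self {
        self.trusted_width = Some(width);
        self
    }

    /// Uses the trusted width when present, otherwise falls back to the
    /// slow computation.
    fn width(&self) -> usize {
        match self.trusted_width {
            Some(w) => usize::from(w),
            None => slow_width(&self.symbol),
        }
    }
}
```

The diff loop would then call `width()` on each cell, so cells that carry a long escape sequence but occupy a single column never reach the slow path.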

Josh McKinney · Answer 4 · Tue May 21 2024 11:06:00 GMT+0800 (China Standard Time)

Something that would also be helpful here would be some flamegraphs (install cargo-flamegraph if you haven't already).
That will show where the time is being spent (i.e. whether the width call is the hotspot in this particular code).

June · Answer 5 · Wed May 22 2024 02:52:52 GMT+0800 (China Standard Time)

Yes, using flamegraphs is how I already determined that the unicode_width call was the hotspot here. They're not the most useful without context, as other things are also going on in background threads, but you should be able to see it pretty clearly.

Here's the flamegraph with ratatui as it is right now:

Here's the flamegraph with my bypass_diff/skip_diff suggestion implemented:

And here's the flamegraph with my trusted_width suggestion implemented:

Both of my solutions, which basically only target the unicode_width call, would fix the performance issue in a very noticeable way. I can share the code with you if you'd like to try it out yourself.

三咲雅 · Misaki Masa · Answer 6 · Thu Jun 06 2024 13:53:44 GMT+0800 (China Standard Time)

I need this too, but my purpose is a bit different from OP's. I'm maintaining a terminal file manager that provides image support. Unlike ratatui_image, it writes escape codes directly to the terminal, so it and ratatui are on two different rendering pipelines.

This means ratatui can't know whether a particular Cell holds image data. As a result, the diff in the image preview area breaks, and popup components fail to render there: to ratatui, the area overlapping the image never appears to have been updated, so it always skips rendering it.

What I'm looking for is a way to "skip the diff and force render". Coincidentally, ratatui already has a set_skip() method, which is meant to "skip diff and never render". So, the implementation I can think of is adding another parallel function, set_force(bool) — which is similar to what OP mentioned as bypass_diff, but it's controlled at the Cell level instead of the entire Terminal/Buffer.

The goal, like with set_skip, is to get out of the diff process, but to control "whether the Cell is rendered or not" after skipping the diff. A simple patch of Buffer::diff() looks like:

-            if !current.skip && (current != previous || invalidated > 0) && to_skip == 0 {
+            if current.force || (!current.skip && (current != previous || invalidated > 0)) && to_skip == 0 {
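The effect of the extra flag can be sketched with a simplified model of the diff. The version below is not ratatui's `Buffer::diff()`: it assumes single-width cells only, so the `invalidated` and `to_skip` bookkeeping is left out, and the `Cell` type, the `cell` helper, and the `force` field are hypothetical.

```rust
/// Toy cell with the existing `skip` flag and the proposed `force` flag.
#[derive(Clone)]
struct Cell {
    symbol: char,
    /// Never emit this cell, even if it changed.
    skip: bool,
    /// Always emit this cell, even if it did not change (hypothetical).
    force: bool,
}

/// Convenience constructor with both flags off.
fn cell(symbol: char) -> Cell {
    Cell {
        symbol,
        skip: false,
        force: false,
    }
}

/// Simplified diff: returns the index and contents of each cell to write.
/// Assumes every symbol is one column wide and both slices have equal length.
fn diff<'a>(previous: &[Cell], next: &'a [Cell]) -> Vec<(usize, &'a Cell)> {
    next.iter()
        .zip(previous)
        .enumerate()
        .filter(|(_, (current, previous))| {
            // `force` wins over everything; otherwise the usual rule applies.
            current.force || (!current.skip && current.symbol != previous.symbol)
        })
        .map(|(i, (current, _))| (i, current))
        .collect()
}
```

With this rule, a popup drawn over an image area could mark its cells as forced, so they are written even though the buffer contents look unchanged to the diff.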

Another feasible approach is to allow users to customize the diff algorithm, such as adding a Terminal::set_differ(differ) method where the differ implements the Differ trait:

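pub trait Differ {
  // The explicit lifetime ties the returned cells to `next`; with two
  // reference parameters the compiler cannot infer it.
  fn diff<'a>(&self, previous: &Buffer, next: &'a Buffer) -> Vec<(u16, u16, &'a Cell)>;
}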

This would allow users to define their needs more finely and let ratatui not worry about the specific implementation.
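A self-contained sketch of how such a trait might be implemented follows. `Buffer` and `Cell` are toy stand-ins rather than ratatui's types, `ForceAll` and `ChangedOnly` are made-up implementors, and the `&self` receiver and explicit lifetime are assumptions added so that the trait compiles and can be stored as a value.

```rust
/// Toy stand-ins for ratatui's types, kept minimal for the sketch.
#[derive(PartialEq)]
struct Cell {
    symbol: char,
}

struct Buffer {
    /// Number of columns; assumed to be greater than zero.
    width: u16,
    cells: Vec<Cell>,
}

/// Hypothetical pluggable diff strategy.
trait Differ {
    fn diff<'a>(&self, previous: &Buffer, next: &'a Buffer) -> Vec<(u16, u16, &'a Cell)>;
}

/// Converts a flat index into (x, y) coordinates.
fn pos_of(buffer: &Buffer, index: usize) -> (u16, u16) {
    let index = index as u16;
    (index % buffer.width, index / buffer.width)
}

/// Emits every cell of `next`, ignoring `previous` entirely.
struct ForceAll;

impl Differ for ForceAll {
    fn diff<'a>(&self, _previous: &Buffer, next: &'a Buffer) -> Vec<(u16, u16, &'a Cell)> {
        next.cells
            .iter()
            .enumerate()
            .map(|(i, cell)| {
                let (x, y) = pos_of(next, i);
                (x, y, cell)
            })
            .collect()
    }
}

/// Emits only the cells that differ from `previous`.
struct ChangedOnly;

impl Differ for ChangedOnly {
    fn diff<'a>(&self, previous: &Buffer, next: &'a Buffer) -> Vec<(u16, u16, &'a Cell)> {
        next.cells
            .iter()
            .zip(&previous.cells)
            .enumerate()
            .filter(|(_, (current, old))| current != old)
            .map(|(i, (cell, _))| {
                let (x, y) = pos_of(next, i);
                (x, y, cell)
            })
            .collect()
    }
}
```

An application could then pick `ForceAll` for image-heavy screens and `ChangedOnly` elsewhere, or write a differ that forces only the region overlapping an image.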

BTW, currently, I'm implementing the rendering of pop-up components by manually patching the image preview area. However, this implementation is quite complex and requires flashing more than once (e.g. if there are two pop-up components, it needs to flash three times). So I would love to see ratatui support this directly if it's possible :)