PerBothner / DomTerm

DOM/JavaScript-based terminal-emulator/console

Home Page:https://domterm.org

Sixel glitches

Hello! The sixel glitches I'm seeing so far are here. Attached are two files:

  • The jexer jar that produced the output. Rename it to jexer.jar, then run via java -jar jexer.jar: jexer.zip
  • A capture of program output domterm_sixel.txt

What it should look like:

domterm_sixel_xterm

vs:

domterm_sixel_domterm

The issues might touch on:

  1. CSI 14/16/18 t responses are not lining up with the cell aspect ratio. These are the vertical gaps in the color wheel.
  2. Jexer emits the images as 1 image per text row, with a few cells at a time. There is no particular ordering. Each image has a cursor position in front of it.

Thank you!

Note this report is a little weak in the "how to reproduce" department: Just running java -jar jexer.jar works fine. You have to do something more - like clicking Pixels. And then I do indeed see some nasty glitches.

I also see some glitches when selecting Colors. That might be easier to debug.

One possible problem is differing assumptions about the ratio of character cell size to sixel pixels. I haven't seen a specification for how this is defined, so DomTerm currently defaults to one sixel pixel to one CSS pixel. I read that the VT300 screen was 800x480 pixels. Assuming 80x24 characters, that works out to 10x20 pixels per character. If your library makes that assumption, and/or that is the "standard", I can easily scale the image to match.
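To make the cell-size arithmetic concrete, here is a minimal sketch. The 800x480 and 80x24 figures come from the comment above. The reply formats assumed here are the ones xterm documents for CSI 14 t (`CSI 4 ; height ; width t`) and CSI 18 t (`CSI 8 ; rows ; cols t`). The function names are hypothetical and not taken from DomTerm or Jexer.

```python
import re

def parse_window_report(reply: str) -> tuple[int, int, int]:
    """Parse a reply such as '\\x1b[4;480;800t' into (kind, first, second)."""
    m = re.fullmatch(r"\x1b\[(\d+);(\d+);(\d+)t", reply)
    if m is None:
        raise ValueError(f"not a window report: {reply!r}")
    kind, first, second = (int(g) for g in m.groups())
    return kind, first, second

def cell_size(pixels_reply: str, chars_reply: str) -> tuple[float, float]:
    """Return (cell_width, cell_height) in pixels from CSI 14 t and CSI 18 t replies."""
    kind_px, height_px, width_px = parse_window_report(pixels_reply)
    kind_ch, rows, cols = parse_window_report(chars_reply)
    if kind_px != 4 or kind_ch != 8:
        raise ValueError("expected a 'CSI 4 ; ... t' and a 'CSI 8 ; ... t' reply")
    return width_px / cols, height_px / rows

# VT300-like numbers from the comment above: 800x480 pixels, 80x24 characters.
print(cell_size("\x1b[4;480;800t", "\x1b[8;24;80t"))  # (10.0, 20.0)
```

If the text area size and the character grid do not divide evenly, the result is fractional, which is one way the reported cell size and the actual cell aspect ratio could drift apart.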

More complicated is dealing with mixing text and graphics, such as writing text on part of an image, or allowing an image to be visible as a background to text. Currently, a <canvas> object is treated as "inline" content in a span of text - so no overlap. More flexible would be to treat an image as a background to one or more lines of text. Text could overlay an image - and such text could have a transparent background, allowing the image to be visible.

I think I want to work on this idea - but first I need to resolve some issues on the window-management side of DomTerm.

Sorry for the ambiguity / weak report.

Jexer takes the view that text will destroy images, but images will overlay text, and that text which overwrites any part of an image can be assumed to destroy the entire image. This is a lowest-common-denominator assumption that gets the same output on screen for almost every terminal I have so far, but it is not sufficient for other applications such as notcurses. (I went this route because xterm would corrupt other image parts I had not touched, and it was the only terminal that worked at the time, so I adapted.) Notcurses expects sixel graphics to always overlay text, and (if I'm understanding it right) for text to destroy only the part of the graphic it overlaps.

I checked in two fixes that seem to improve the results.

It does look a bit better! I'm seeing less/no gaps between images now, just some stale placements.

FYI also - CVE-2022-24130 kills several sixel terminals, but DomTerm is fine. :) I've been noting the results here: https://gitlab.com/klamonte/jexer/-/issues/105

Do you have any idea what triggers the "stale placements"?

Ideal would be a file (the shorter/smaller the better) that I can cat and which DomTerm displays incorrectly.

@PerBothner

(I have killed my old account -- transition stuff. 🏳‍⚧ I continue on as @AutumnMeowMeow . :-) )

Jexer's sixel output strategy is:

  1. Emit all sixel images, in whatever order the threadpool got to it.
  2. Each sixel image is preceded by a CUP.
  3. Each image is exactly 1 text row high, and 1 to 10-ish columns wide. (Width is based on # colors in palette and cell aspect ratio: it tries to hit a spot where there is likely to be little visible dithering required.)
  4. Scrolling is enabled (DECSDM is disabled), unless on the bottom row. Images on the bottom row are done with a DECSDM trick: https://gitlab.com/klamonte/jexer/-/issues/91 . (This works perfectly in 4 terminals, and not quite in 3. I'm not bothering anyone with it.)
  5. After all of the images are emitted -- which can overlap other previously-emitted images -- then the text is emitted.
  6. The text is top-down, each row, only emitting changed cells.
  7. For each different region, it is: CUP, then attributes, then text. I assume text is fully destructive of images.
  8. If it so happens that a row ends on black+blank cells, I use EL for that last bit of the row. <-- I'm wondering if this is part of the root cause for the four images on the girl.jpg projecting into the black region on the right on the DomTerm screenshot above.
  9. If the terminal supports Synchronized Output (https://gist.github.com/christianparpart/d8a62cc1ab659194337d73e399004036) (detected via DECRQM/DECRPM), then all of the above is within the BSU/ESU sequences. BSU/ESU does not work for xterm, so there are frames where I see xterm exposing the cell under an image if the cursor happens to be there when it is updating its screen.
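The ordering in the list above can be sketched as a frame builder. This is an illustration of the described strategy, not Jexer's actual code: the function names and tuple layouts are hypothetical, the DECSDM bottom-row trick from step 4 is left out, and the Synchronized Output mode number 2026 is taken from the linked gist.

```python
ESC = "\x1b"
CSI = ESC + "["
BSU = CSI + "?2026h"   # Begin Synchronized Update
ESU = CSI + "?2026l"   # End Synchronized Update
EL = CSI + "K"         # Erase in Line: cursor to end of row

def cup(row: int, col: int) -> str:
    """Cursor Position; rows and columns are 1-based."""
    return f"{CSI}{row};{col}H"

def sixel(data: str) -> str:
    """Wrap raw sixel data in DCS q ... ST."""
    return f"{ESC}Pq{data}{ESC}\\"

def build_frame(images, text_runs, synchronized=True) -> str:
    """images: (row, col, sixel_data) in any order.
    text_runs: (row, col, sgr_params, text, erase_rest_of_row)."""
    out = []
    if synchronized:
        out.append(BSU)
    for row, col, data in images:            # steps 1-3: every image gets its own CUP
        out.append(cup(row, col) + sixel(data))
    for row, col, sgr, text, erase_rest in sorted(text_runs):  # steps 5-6: text, top-down
        out.append(cup(row, col) + f"{CSI}{sgr}m" + text)       # step 7: CUP, attributes, text
        if erase_rest:
            out.append(EL)                                      # step 8: EL for trailing blanks
    if synchronized:
        out.append(ESU)
    return "".join(out)

frame = build_frame([(2, 5, "#0;2;0;0;0#0~~")], [(1, 1, "0", "hi", True)])
print(repr(frame))
```

Because the EL in step 8 is sent after the images, a terminal that does not clip or remove image data on erase-line-right would leave image fragments in the supposedly blank region.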

I will see if I can get something a bit smaller than the first capture.

Looking at the DomTerm code, it seems quite likely that most kinds of partial overwrite of a pre-existing image (including erase-line-right) would have problems. For the restricted case of images that are single-line-high I can probably make things work by relatively local changes. However, I think it would be better to design a more comprehensive approach. It would be nice to support whatever NotCurses needs as well. And whatever the consensus is for other image protocols.

An idea is that conceptually each line may have a "background image". From the JavaScript/DOM view that is a canvas at a lower z-index than text on the same line. Escape sequences that erase text would both erase any foreground text and replace that section of the background image with the current background color. (If the entire line is replaced, the canvas can be removed, as an optimization.) Writing sixel or other image data erases existing text and overwrites the background image. Writing text creates or modifies "foreground text" (DOM Text nodes), which hides the background image. Note that writing space(s) is different from erasing.
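A minimal sketch of that per-line model, assuming a cell-granular background for simplicity (a real canvas would be pixel-granular). The class and method names are hypothetical and do not correspond to DomTerm's code.

```python
class Line:
    """One terminal row: a background-image layer under a foreground text layer."""

    def __init__(self, cols: int):
        # background[i] is None (no canvas content) or a token standing in
        # for the pixels of that cell: an image tile or a fill color.
        self.background = [None] * cols
        # text[i] is None when the cell has no foreground text.
        self.text = [None] * cols

    def write_image(self, start: int, tiles) -> None:
        """Image data erases existing text and overwrites the background."""
        for i, tile in enumerate(tiles):
            self.text[start + i] = None
            self.background[start + i] = tile

    def write_text(self, start: int, chars: str) -> None:
        """Text becomes foreground; it hides the background but keeps it."""
        for i, ch in enumerate(chars):
            self.text[start + i] = ch

    def erase(self, start: int, end: int, bg_color: str) -> None:
        """Erasing clears text and fills the background with bg_color."""
        for i in range(start, end):
            self.text[i] = None
            self.background[i] = ("fill", bg_color)

    def visible(self, col: int):
        """What the user sees: text if present, otherwise the background."""
        return self.text[col] if self.text[col] is not None else self.background[col]

line = Line(4)
line.write_image(0, ["t0", "t1", "t2", "t3"])
line.write_text(1, " ")        # a space hides tile t1 but does not destroy it
line.erase(2, 4, "black")      # erasing replaces tiles t2 and t3
print([line.visible(c) for c in range(4)])
```

The last two calls show the distinction drawn above: the written space leaves `background[1]` intact, while the erase replaces the image data in columns 2 and 3.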

An optimization (which would also reduce edge artifacts) is to use a single canvas object when multiple adjacent lines have background images, especially when they were written by a single sixel image command.

We can extend the "set background color" CSI 48;2;R;G;Bm to allow an alpha specification - maybe CSI 48;2;R;G;B;ALPHAm. This would easily support text over background images.

> I think it would be better to design a more comprehensive approach. It would be nice to support whatever NotCurses needs as well. And whatever the consensus is for other image protocols.

Image overwrite/erase semantics for sixel have differed across a lot of terminals. My experience with xterm was that putting any text on an image could do anything to the rest of the image, but usually right-and/or-down of where the cursor is. xterm is plain weird in its sixel: it's like it just blits it once, and then all the text operations manipulate the screen surface giving the illusion of changing the image, but if you scroll at all the image is blitted back to where its original anchor was, BUT the space that might have had image data from before that still does, so regions can be repeated. I gave up trying to figure it out.

But here is notcurses' thread on alacritty: dankamongmen/notcurses#2142 . @dankamongmen summarized it elsewhere as:

> essentially, sixel erasure in alacritty is not a cell-by-cell deal, but rather all-or-nothing requiring erasures spanning the graphic in the vertical dimension (maybe horizontal also works, don't know). it's not that this is "wrong" per se (i don't really consider sixel well-defined), but it's definitely different from how other implementations of which i am aware work.

There is also that DECSDM trick for the bottom line which uses transparent sixels: both mlterm and wezterm have issues with putting transparent sixel regions over other sixel regions.

(I would expect the three terminals that support iTerm2 (iTerm2, mintty, wezterm) to apply similar logic to iTerm2 images. i.e. if they fix transparent-over-nontransparent for sixel, then it would be fixed for iTerm2.)

> We can extend the "set background color" CSI 48;2;R;G;Bm to allow an alpha specification - maybe CSI 48;2;R;G;B;ALPHAm. This would easily support text over background images.

I think if SGR 38/48 were added to, it might be better to change the ";2;" part to something else first. Some parsers expect exactly 3 parameters after the 2, and would then take the alpha as a fresh SGR code and get out of sync, when using the (I think more common?) semicolon-separated form of that. Or alternatively, require the colon-separated form. (Relevant: https://bugs.kde.org/show_bug.cgi?id=107487#c11 )
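The desync risk can be shown with a deliberately naive parser. This is a sketch of the failure mode described above, assuming a parser that consumes exactly three values after `38;2` or `48;2`; it is not modeled on any specific terminal's code.

```python
def naive_sgr(params: str) -> list[str]:
    """Interpret a semicolon-separated SGR parameter string the way a parser
    that expects exactly R, G, B after '38;2' or '48;2' would."""
    codes = [int(p) if p else 0 for p in params.split(";")]
    actions, i = [], 0
    while i < len(codes):
        code = codes[i]
        if code in (38, 48) and codes[i + 1:i + 2] == [2] and len(codes) >= i + 5:
            r, g, b = codes[i + 2:i + 5]
            layer = "fg" if code == 38 else "bg"
            actions.append(f"{layer} rgb({r},{g},{b})")
            i += 5   # consumes the selector, the '2', and exactly R, G, B
        else:
            actions.append(f"SGR {code}")
            i += 1
    return actions

# A hypothetical trailing alpha of 4 is read as a fresh SGR 4 (underline).
print(naive_sgr("48;2;10;20;30;4"))  # ['bg rgb(10,20,30)', 'SGR 4']
```

The colon-separated form avoids this because the sub-parameters stay attached to the 38 or 48 that introduced them.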

It's interesting though you bring it up. I have been noodling about alpha as a way to reach the terminal's background image, or the actual window manager background if the terminal is composited. @wez did a very nice writeup of how alpha is handled for wezterm here:
wez/wezterm#1537

I'd love for us to come to an agreement on specifying colors with alpha! I don't think adding more params to the semicolon form of SGR 38/48/58 is wise as it is already ambiguous and "technically incorrect" even though it is widely implemented in both TE and applications :-/

I think it's also worth considering more than just 8-bit color values as part of specifying that. @dankamongmen and I have tentatively been considering 10-bit color support.

Looking over at https://iterm2.com/documentation-escape-codes.html iTerm2 has a SetColors OSC (I'm not proposing that we adopt that particular sequence) that can specify srgb, rgb (presumably linear) and p3 color spaces. DCI-P3 can handle 10-bit colors.

This feels like the obvious extension for RGBA for 8bpc:

CSI 38 : 2 : : R : G : B : A m

For 10-bit color... do we add some new colorspace values? eg: id 6 for 10bpc RGB(A) values, then the R,G,B and A values are decimal numbers in the range 0-1023?

So white with explicitly 100% alpha would be:

CSI 38 : 6 : : 1023 : 1023 : 1023 : 1023 m

I agree an extra parameter to the CSI 48; 2 syntax is probably not a good idea. I would prefer to avoid the mess of colons vs semi-colons.

One idea that occurred to me is to specify an explicit denominator:

CSI 38/48 ; 6 ; RED ; GREEN ; BLUE ; ALPHA ; DENOMINATOR m

where RED, BLUE, GREEN and ALPHA are divided by DENOMINATOR to produce a fraction between 0.0 and 1.0. The default for DENOMINATOR could be 256. This generalizes to 10-bit, 12-bit, 16-bit color.
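A minimal sketch of how the denominator form might be encoded and decoded. The sequence is only a proposal from this thread, so both functions are hypothetical; `Fraction` is used to keep the channel values exact.

```python
from fractions import Fraction

def encode(layer: int, r: int, g: int, b: int, a: int, denominator: int = 256) -> str:
    """Build the proposed 'CSI 38/48 ; 6 ; R ; G ; B ; A ; DENOMINATOR m' sequence."""
    if layer not in (38, 48):
        raise ValueError("layer must be 38 (foreground) or 48 (background)")
    return f"\x1b[{layer};6;{r};{g};{b};{a};{denominator}m"

def decode(r: int, g: int, b: int, a: int, denominator: int = 256):
    """Convert integer channels to exact fractions of the denominator."""
    return tuple(Fraction(v, denominator) for v in (r, g, b, a))

print(repr(encode(48, 128, 0, 255, 64)))
print(decode(128, 0, 255, 64))        # 8-bit values with the default denominator
print(decode(512, 0, 1023, 256, 1024))  # 10-bit values
```

With a denominator of 256, the largest 8-bit value maps to 255/256 rather than exactly 1.0; whether full intensity should instead use a denominator of 255 is a detail the proposal leaves open.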

The denominator idea sounds good!

The ; vs : thing is a PITA. It's my belief that we really should avoid overloading SGR with more things that use ; separators for parameters as they add further ambiguity/special cases with batched SGR sequences like CSI 1 ; 3 m.

I feel that we should specify these extensions with :, but part of me fears that that ship has sailed and that enough software assumes one or the other that we'll need to support both forms anyway. I tried to be principled about this in wezterm but ended up having to support both because enough software assumed that ; was the right way.

Is it worth looking at a non-SGR sequence for this?

I would lean towards a non SGR myself. It's been overloaded so much, now with the underline colors and colors 88 and aixterm colors, who knows what some random terminal will do with it next?

The other thing is: alpha for which layer? Always both? Bg only? Both fg and bg? A bg alpha could be supported by an application on any terminal today via the use of RGB and blending (except against the terminal's background image). But fg alpha on any non-solid background (so generally an image) can only be done by the terminal.

It's enough different and non obvious in the corner cases that maybe we should aim for a clean break. A new sequence that sets text colors and text+image alpha for a layer, and nothing else. With very clear expectations.

re: which layer: in my mind, the 38 equivalent that we come up with specifies the RGBA for the foreground color used to paint the glyph, while 48 is for the background color used in its cell.

wezterm supports OSC 4 for setting palette entries. Changing palette index 0, which in wezterm is the default background color for the terminal, looks something like this:

OSC 4 ; 0 ; #cccccc ST

That colorspec is defined in xterm's docs as being something that XParseColor supports, which can actually specify up to 16 bits, but doesn't have a way to specify alpha. It does have a namespace though; rgb: and rgbi: prefixes are used to specify the channel values in hex and floating point values. It may not be terribly egregious for us to add rgba: and rgbai: to that namespace with the additional alpha channel.

Then jexer could query the current background color using OSC 4 ; 0 ; ? ST, then, assuming it returns the RGB value for black, modify that and emit something like OSC 4 ; 0 ; rgbai:0/0/0/0.5 ST to get a 50% transparent version of whatever the background color was in the terminal. With the ability to set the alpha for each of the fg, bg and default bg, I think applications like Jexer would have a lot of power/flexibility.
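The query-then-modify flow could be sketched as follows. The `rgb:` reply format and its scaling follow XParseColor (an n-digit hex value is scaled against the largest n-digit value). The `rgbai:` form is only the extension proposed in this thread, not an existing standard, and all function names are hypothetical.

```python
import re

ESC = "\x1b"
ST = ESC + "\\"

def osc4_query(index: int) -> str:
    """Ask the terminal for the color of a palette entry."""
    return f"{ESC}]4;{index};?{ST}"

def parse_osc4_reply(reply: str):
    """Parse 'OSC 4 ; index ; rgb:R/G/B' terminated by ST or BEL."""
    hexval = r"([0-9a-fA-F]{1,4})"
    m = re.fullmatch(
        rf"\x1b\]4;(\d+);rgb:{hexval}/{hexval}/{hexval}(?:\x1b\\|\x07)", reply)
    if m is None:
        raise ValueError(f"unexpected reply: {reply!r}")
    # Scale each n-digit hex channel to the range 0.0-1.0.
    channels = tuple(int(h, 16) / (16 ** len(h) - 1) for h in m.groups()[1:])
    return int(m.group(1)), channels

def osc4_set_rgbai(index: int, rgb, alpha: float) -> str:
    """Build the proposed (not standardized) rgbai: color spec."""
    red, green, blue = rgb
    return f"{ESC}]4;{index};rgbai:{red:g}/{green:g}/{blue:g}/{alpha:g}{ST}"

index, rgb = parse_osc4_reply("\x1b]4;0;rgb:0000/0000/0000\x1b\\")
print(repr(osc4_set_rgbai(index, rgb, 0.5)))  # '\x1b]4;0;rgbai:0/0/0/0.5\x1b\\'
```

An application would write `osc4_query(0)` to the terminal, read the reply from its input stream, and then send the modified color back.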

  • Should we extend OSC 4 for this, or define a new OSC with similar semantics that understands alpha?
  • Should we invent rgba: and rgbai: extensions to XParseColor, or standardize on something more modern, like the CSS color specification? The latter is trivially easy for me in Rust with the help of https://docs.rs/csscolorparser/latest/csscolorparser/#example-color-format and I presume it would also be trivially easy for DomTerm to pass through.
  • Should we define an OSC alternative to SGR 38/48/58 to avoid the ; vs : mess?
  • We need to consider making recommendations for eg: users that don't want their terminal to turn transparent, or when a user has configured their terminal to be partially transparent.
    • Do we report their configured alpha level in the OSC 4 query? (If we're stacking them as per the next bullet then probably no, otherwise, probably yes).
    • Do we stack their alpha with the application provided alpha? eg: if I choose to set it to 80% and app wants 50%, the effective alpha should probably end up as 0.8 * 0.5

I agree that alpha/transparency isn't very useful unless you have some kind of model of multiple layers that can blend. Assuming a traditional character cell model, when does writing a character replace/overwrite the previous contents and when does it blend? For example allowing multiple character layers to blend together is complicated to implement, complicated to specify, and isn't very useful.

A single background image layer along with a single foreground text layer seems relatively manageable and seems flexible enough to handle most use-cases. If the image is associated with a line (or a range of lines), that supports scrolling. The text layer can allow alpha to be specified to allow the background image layer to show through.

Some people want a semi-transparent terminal that allows the window background to show through. I think that is an orthogonal feature, and it is enough to have a single global transparency value.

smaller_capture.zip

@PerBothner

A smaller capture, showing the glitches without any user input:

  • smaller.txt has quite a few artifacts.
  • s1.txt is just the first bit showing a few artifacts. It's very easy to see them around the "pixel operations" box edges which overlap some of the rounded button edges (which are sixel images).
  • I resized DomTerm to 80x31 before generating the capture.

Interestingly, the DomTerm screen is not deterministic. 'cat'ing s1.txt multiple times comes up with slightly different outcomes. Here are three from mine. In all cases, I launch qtdomterm, increase height to 31 lines, and then cat s1.txt.

smaller_capture_screens.zip

@wez

I would never have thought of the rgba/rgbai schema for OSC 4, but really like that idea! For the applications that just want to have nice translucent windows, it's a perfect one-shot solution for all non-RGB colors. It could be super easy on both ends, and might cleanly coexist with xterm right now. OMG so simple yet with some powerful application. I bow to you good sir. 🙇‍♀️

Layering: Would an alpha for say ANSI color RED be applied by the terminal once at the end when going to the physical display, or is it blended every time it is printed to the logical screen? Would RED-over-RED at 50% alpha look like RED-at-50%, or RED-at-75% when the user finally sees it?

  • If the answer is "once, at the end", then the application can readily expose the terminal's background. It's literally a one-shot change and anything can look like yakuake. Cool!
  • If it's "blend every time it is received", the terminal would feel some pain, but an application willing to do its own painter's algorithm can look like notcurses. Really cool. 😎
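The RED-over-RED question has a concrete numeric answer under each model. The sketch below assumes the standard "over" compositing rule for the blend-every-time case; the function names are hypothetical.

```python
def over(src_alpha: float, dst_alpha: float) -> float:
    """Resulting coverage when a source is composited 'over' a destination."""
    return src_alpha + dst_alpha * (1.0 - src_alpha)

def once_at_end(prints: int, alpha: float) -> float:
    """The cell just records 'RED'; alpha is applied once at display time."""
    return alpha if prints > 0 else 0.0

def blend_every_time(prints: int, alpha: float) -> float:
    """Each print composites another translucent layer onto the cell."""
    coverage = 0.0
    for _ in range(prints):
        coverage = over(alpha, coverage)
    return coverage

print(once_at_end(2, 0.5))       # 0.5  -> RED-at-50%
print(blend_every_time(2, 0.5))  # 0.75 -> RED-at-75%
```

Under "once, at the end" the terminal needs no extra per-cell state, which matches the simplicity argument made for that option.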

What are you thinking for it?

I'm leaning on the former: it's "once, at the end". That scratches a lot of itch right off the bat, while keeping things very simple (a.k.a. more likely to be implemented elsewhere) for the terminal itself. And one can be a tad clever with indexed colors and still get the illusion of translucent windows, for a few layers -- and you only need about 3 layers to look rad as hell.

Don't get too hasty with the praise as I misremembered how things work! However, what I said is broadly applicable. I was thinking of OSC 11 (Change VT100 text background color), as OSC 4 to change a palette index would affect everything that uses that palette index. I do think that it would be nice to allow setting alpha via OSC 4, but for your use case, it might make more sense to use OSC 11 to set the background color with alpha.

In wezterm's implementation, alpha is something considered at render time by the GPU; rendering approximately follows the painter's algorithm, in that the window background is filled in first (which may be with transparent pixels if the user has configured it that way), then the backgrounds of cells, then the text glyph foregrounds. Each of those elements has its own alpha and those are all composited over each other as they are laid down. Today, most of those things have alpha=100%. If a cell is set to use a direct color value for its fg and/or bg then we would take that alpha value. For indexed colors we'd resolve them from the palette as you might expect.
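The layer order described above can be illustrated on a single pixel. This is a CPU sketch using the Porter-Duff "over" operator with straight (non-premultiplied) alpha; it is an assumption about the blending math, not wezterm's GPU code, and the function names are hypothetical.

```python
def over(src, dst):
    """Porter-Duff 'over' for straight-alpha (r, g, b, a) tuples in 0.0-1.0."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    out_a = sa + da * (1.0 - sa)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)

    def blend(s: float, d: float) -> float:
        return (s * sa + d * da * (1.0 - sa)) / out_a

    return (blend(sr, dr), blend(sg, dg), blend(sb, db), out_a)

def render_pixel(window_bg, cell_bg, glyph_fg):
    """Lay down the window background, then the cell background, then the glyph."""
    pixel = window_bg
    for layer in (cell_bg, glyph_fg):
        pixel = over(layer, pixel)
    return pixel

# Opaque black window, 50% red cell background, no glyph coverage at this pixel.
print(render_pixel((0.0, 0.0, 0.0, 1.0), (1.0, 0.0, 0.0, 0.5), (0.0, 0.0, 0.0, 0.0)))
# (0.5, 0.0, 0.0, 1.0)
```

If the window background itself had alpha below 1.0, the output alpha would stay below 1.0 as well, which is where the desktop would show through on a composited display.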

I spent some time reworking how Sixel images are handled. For example, ncplayer now works pretty well - even with .mov videos. Jexer also seems to be quite a bit better. There are some lesser glitches which I could work on if requested.

It could still be better in terms of canvas handling - i.e. how to handle multiple overlapping sixel and/or text requests. I don't know if any application needs better handling than the current implementation.

@PerBothner exciting! i'm just now seeing a bug you filed on notcurses some time ago. hopefully i can get you a reply later today!

(and sorry for the momentary post here -- i thought this was said bug)