Add support for Sixel images in conhost #17421

j4james · 2024-06-11T20:20:00Z

Summary of the Pull Request

This PR introduces basic support for the Sixel graphics protocol in
conhost, limited to the GDI renderer.

References and Relevant Issues

This is a first step towards supporting Sixel graphics in Windows
Terminal (#448), but that will first require us to have some form of
ConPTY passthrough (#1173).

Detailed Description of the Pull Request / Additional comments

There are three main parts to the architecture:

The SixelParser class takes care of parsing the incoming Sixel DCS
sequence.
The resulting image content is stored in the text buffer in a series
of ImageSlice objects, which represent per-row image content.
The renderer then takes care of painting those image slices for each
affected row.

The parser is designed to support multiple conformance levels so we can
one day provide strict compatibility with the original DEC hardware. But
for now the default behavior is intended to work with more modern Sixel
applications. This is essentially the equivalent of a VT340 with 256
colors, so it should still work reasonably well as a VT340 emulator too.

Validation Steps Performed

Thanks to the work of @hackerb9, who has done extensive testing on a
real VT340, we now have a fairly good understanding of how the original
Sixel hardware terminals worked, and I've tried to make sure that our
implementation matches that behavior as closely as possible.

I've also done some testing with modern Sixel libraries like notcurses
and jexer, but those typically rely on the terminal implementing certain
proprietary Xterm query sequences which I haven't included in this PR.

DHowett · 2024-06-11T20:37:57Z

holy shit

DHowett · 2024-06-11T20:38:19Z

James, I just derailed a team meeting to direct everyone's attention over here :D

j4james · 2024-06-11T20:42:03Z

src/buffer/out/ImageSlice.hpp

+ bool _eraseCells(const til::CoordType columnBegin, const til::CoordType columnEnd);
+
+ til::size _cellSize;
+ std::vector<RGBQUAD> _pixelBuffer;


Note that I'm using an RGBQUAD here just because it was convenient for the GDI renderer, and I figured the Atlas renderer might not care what format it's given. But if this turns out to be a problem, it shouldn't be that big a deal to change to something more generic.

DHowett · 2024-06-11T21:11:40Z

zadjii-msft · 2024-06-11T21:15:57Z

okay the bar for the best PR of the year is now very, very high

lhecker · 2024-06-11T21:40:06Z

src/buffer/out/ImageSlice.cpp

+ auto eraseIterator = _pixelBuffer.begin() + eraseOffset;
+ for (auto y = 0; y < _cellSize.height; y++)
+ {
+ std::fill_n(eraseIterator, eraseLength, RGBQUAD{});
+ std::advance(eraseIterator, _pixelWidth);
+ }


The for loop advances the iterator by _cellSize.height * _pixelWidth in total which is equal to the _pixelBuffer size. If the eraseOffset is greater than 0, the final loop iteration will leave the iterator past the end of the _pixelBuffer. Since the MSVC STL uses checked iterators this results in a debug assertion.

One way to fix this:

auto eraseIterator = _pixelBuffer.begin();
for (auto y = 0; y < _cellSize.height; y++)
{
    std::fill_n(eraseIterator + eraseOffset, eraseLength, RGBQUAD{});
    std::advance(eraseIterator, _pixelWidth);
}

FYI fill_n isn't super duper optimal for clearing bytes in this loop and you may be better off using memset directly. Mostly because it skips an unnecessary emptiness check. Something like this:

auto eraseIterator = _pixelBuffer.data() + eraseOffset;
for (auto y = 0; y < _cellSize.height; y++)
{
    memset(eraseIterator, 0, eraseLength * sizeof(RGBQUAD));
    eraseIterator += _pixelWidth;
}

Thanks for bringing this up. It never occurred to me that it would be a problem using std::advance past the end of the buffer, but it makes perfect sense in retrospect. I think I only started using it because I was getting complaints from the auditor about pointer arithmetic in some places, so I'll be happy to go back to +=. If there are still pointer arithmetic warnings I'll find another way to deal with that.

Turns out += isn't any better than std::advance since it still has the debug asserts. But I found that switching to raw pointers - and continuing to use the std::advance for incrementing - was enough to keep everyone happy. Debug build doesn't appear to assert anymore, and audit still passes.
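For readers following the iterator discussion above, here is a minimal sketch of a row-wise erase that never forms an out-of-range position. It is not the code from the PR: instead of advancing a running pointer, it computes each row's start from the buffer base. `Pixel` is a hypothetical stand-in for the Win32 `RGBQUAD` struct, and `eraseColumns` and its parameters are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for RGBQUAD so the sketch builds on any platform.
struct Pixel
{
    std::uint8_t blue = 0;
    std::uint8_t green = 0;
    std::uint8_t red = 0;
    std::uint8_t reserved = 0;
};

// Clears columns [eraseOffset, eraseOffset + eraseLength) on every pixel row
// of a row-major buffer that is pixelWidth wide and height rows tall.
void eraseColumns(std::vector<Pixel>& buffer, std::size_t pixelWidth, std::size_t height,
                  std::size_t eraseOffset, std::size_t eraseLength)
{
    assert(buffer.size() == pixelWidth * height);
    assert(eraseOffset + eraseLength <= pixelWidth);

    // Each row start is derived from the base pointer, so no pointer is ever
    // moved beyond the last row that is actually written.
    Pixel* const base = buffer.data();
    for (std::size_t y = 0; y < height; ++y)
    {
        std::fill_n(base + y * pixelWidth + eraseOffset, eraseLength, Pixel{});
    }
}
```

Erasing columns 1 and 2 of a 4x2 buffer leaves columns 0 and 3 untouched on both rows.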

src/terminal/adapter/SixelParser.cpp

DHowett · 2024-06-11T23:12:24Z

I went and dug up "xserver-SIXEL", which needed some help to build (it was last touched ~10 years ago), but... it uh works

lhecker · 2024-06-12T00:14:10Z

Same day delivery: d7b002f

After working with this branch for a bit, I think the most important feedback I can give is that I think it'd be preferable if we shared a single shared_ptr<ImageSlice> across multiple rows:

GPUs have a minimum allocation granularity of usually 64KiB (hence the usage of sprite atlases). The D2D for that is somewhat awkward. Popular texture allocation algorithms also usually work better for textures with an aspect ratio closer to 1. Not a big deal, but maybe an annoyance we can avoid?
More importantly, if a slice with a height of 20px gets stretched to a row height of 25px, then the first and last pixel will be aligned with the top and bottom of the row. Same for the neighboring rows and slices atop and below. This means that the "pixel density" at the edges of rows will be higher than within them. This leads to an "unevenness" in the image quality.
My understanding of the kitty image protocol is that it submits complete images (raw RGB, PNG, etc.). If we don't share a single buffer between rows, I imagine that the PNG support in particular would be awkward to implement.

I'm happy to help with any parts of this implementation as needed. 🙂

As an aside, I continue to think that the usage of std::function for DCS paired with the character-wise parsing is not quite ideal. To be clear, I don't think it's urgent. It's more like a hunch. I'm not basing this on any benchmarks, and what's not measured is not real (or whatever the actual saying was). 😅

Reason being, we never parse the same type of DCS simultaneously per VT parser instance, right? So, couldn't we call the parse method of each corresponding implementation directly and keep the state as a struct around forever? This would turn the dynamic into static dispatch. Additionally, it would be excellent for performance if we could pass entire strings of input to each parser (tighter = faster parsing loops). The parser could for instance indicate to the caller whether it's done parsing and at which character offset.
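The interface suggested above, where a parser receives a whole chunk of input and reports whether it finished and at which offset, could look roughly like the following sketch. Everything here is hypothetical: `ParseResult` and `ToyDcsParser` are illustrative names, and the toy parser only collects payload bytes until it sees ESC (the first byte of the 7-bit string terminator) rather than interpreting sixel data. It assumes C++17 for `std::string_view`.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical result type: did the sequence end, and how much input was used?
struct ParseResult
{
    bool done;
    std::size_t consumed;
};

// Toy stand-in for a DCS payload parser. It is called directly (static
// dispatch) and keeps its state between calls instead of living in a
// std::function that is invoked once per character.
class ToyDcsParser
{
public:
    ParseResult parse(std::string_view input)
    {
        const auto end = input.find('\x1b');
        if (end == std::string_view::npos)
        {
            // No terminator in this chunk: consume it all and ask for more.
            _payload.append(input.data(), input.size());
            return { false, input.size() };
        }
        // Consume up to the ESC and leave the terminator for the caller.
        _payload.append(input.data(), end);
        return { true, end };
    }

    const std::string& payload() const noexcept { return _payload; }

private:
    std::string _payload;
};
```

The caller would resume its normal state machine at `consumed` once `done` is true, which keeps the hot loop inside the parser.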

j4james · 2024-06-13T15:35:06Z

Same day delivery: d7b002f

Works like a dream! It's hard to tell, because I think most of the processing time is spent in the parser, but this does seem faster to me than the GDI renderer.

I think the most important feedback I can give is that I think it'd be preferable if we shared a single shared_ptr<ImageSlice> across multiple rows:

I expected you'd probably want to rewrite a lot of the buffer management code, and you're welcome to do so, but I'm not sure this particular change makes sense, and I fear you may not understand exactly how sixel graphics are intended to work. Most importantly, the kitty image protocol is fundamentally incompatible with sixel. If you intend to implement the kitty protocol, you'll need to manage its image content separately. It works in a totally different way.

The way the hardware graphics terminals work, the screen is essentially one large bitmap, and everything writes to that bitmap. You output some text, that's drawn onto the bitmap. You output some image content, that's drawn on top of it. Because you don't want a garbled mix of text when you write over existing text, it always clears the cell that it's writing to. But that also means that text output over an area of image content, will first punch a hole in that image.

This behavior makes it easy for us to emulate the giant bitmap concept with a combination of text that is overlaid with an image only where needed. But it's important to understand that the text and image are representing one and the same thing. When you scroll the text, you're scrolling the image too. If you insert a line in the middle of the screen, the image content below that line is going to move down along with the text. This behavior just works automatically when you have image slices attached to rows.

The kitty protocol is completely different, though. Each graphics sequence generates a new image object (potentially at least), and these images are completely separate from the text (and from each other). If you insert a line in the middle of the screen, it'll shift the text down, without necessarily having any effect on images that cover that text.

And kitty images can appear both above and below the text, and can also move independently of each other. That means you need to keep track of every kitty sequence that is output, and have some sort of protocol to manage when they're deleted if you're running low on memory. With sixel you don't have this problem, because multiple sixel sequences will just keep writing to the same image buffer.
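To illustrate why attaching image slices to rows makes scrolling and line insertion "just work", here is a simplified sketch. The `Row` and `ImageSlice` types are hypothetical reductions of the real buffer classes, and `insertLine` is an illustrative helper, not a function from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, heavily simplified per-row image content.
struct ImageSlice
{
    std::vector<std::uint32_t> pixels;
};

// A row owns its text and, optionally, the image pixels drawn over it.
struct Row
{
    std::string text;
    std::unique_ptr<ImageSlice> image;
};

// Inserts a blank row at index, pushing later rows down and dropping the
// bottom row so the screen height stays fixed.
void insertLine(std::vector<Row>& rows, std::size_t index)
{
    assert(index < rows.size());
    rows.insert(rows.begin() + static_cast<std::ptrdiff_t>(index), Row{});
    rows.pop_back();
}
```

Because the slice is a member of the row, moving the row moves the image with it; no separate bookkeeping for image positions is needed.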

j4james · 2024-06-13T19:32:08Z

As an aside, I continue to think that the usage of std::function for DCS paired with the character-wise parsing is not quite ideal.

I forgot to reply to this. And I definitely agree that the current implementation isn't ideal, but the performance also isn't terrible, so it's not a huge priority for me. I personally care more about the missing VT functionality than I do about speed. For me the competition is 20 year old hardware running at 19200 baud. 😀 But I certainly wouldn't object to anyone else rewriting the DCS handler. I actually thought it might come up as part of #17336.

lhecker · 2024-06-13T21:35:00Z

[...] and I fear you may not understand exactly how sixel graphics are intended to work.

This is definitely true. In particular I'm not sure I really understand yet how the "cell size" for sixels works. After my initial, cursory reading of the PR my understanding of _imageOriginCell and _imageCursor was that the SixelParser does effectively process a rectangular image that may span multiple (potentially many) ROWs. Is that incorrect? Based on that I had assumed that it would similarly be possible to store the sixel image as a single RGBA blob shared across multiple rows.

When you scroll the text, you're scrolling the image too. If you insert a line in the middle of the screen, the image content below that line is going to move down along with the text.

I wasn't aware it worked like that! I guess we could slice up the image whenever we encounter a VT/Console scroll sequence/command - we do the same for ROWs after all (via GetScratchpadRow(), etc.). But... hmm. I think I somewhat understand the problem now! It would be great if there was something we could do, but it's non-trivial. BTW if the renderer had access to all ImageSlices at once, I suppose it could stitch them together? That would also work for me... But it's overall a minor issue, I think.

(BTW thanks for explaining the differences with the kitty protocol!)

More importantly, it'd be great if we could still add an "ID" to the ImageSlice. This isn't very important for local rendering, but for Remote Desktop for instance: If we don't properly cache sixels there, it will stream the entire RGBA contents over the internet on every frame.

One option would be to just have a static std::atomic<uint64_t> counter. Then we bump the counter whenever we call GetMutableImageSlice or construct a new instance? (Or maybe some other place - I haven't read the PR in complete detail yet.)

With this ID we can then key the ID2D1Bitmap cache. 🙂
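A minimal sketch of the counter idea described above, assuming the ID should change whenever a slice is created or its content is touched. `TrackedSlice`, `markModified`, and `nextId` are hypothetical names; where the real code would bump the counter is left open in the discussion.

```cpp
#include <atomic>
#include <cstdint>

class TrackedSlice
{
public:
    TrackedSlice() noexcept : _id{ nextId() } {}

    // Cache key for renderer-side resources such as bitmaps.
    std::uint64_t id() const noexcept { return _id; }

    // Call whenever the pixel content may have changed, so that any cached
    // copy keyed on the old ID is treated as stale.
    void markModified() noexcept { _id = nextId(); }

private:
    static std::uint64_t nextId() noexcept
    {
        // One process-wide counter; relaxed ordering is enough because only
        // uniqueness matters, not ordering relative to other memory.
        static std::atomic<std::uint64_t> counter{ 0 };
        return counter.fetch_add(1, std::memory_order_relaxed) + 1;
    }

    std::uint64_t _id;
};
```

A renderer could then skip re-uploading a slice whenever the ID it cached still matches `id()`.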

j4james · 2024-06-14T01:11:28Z

I'm not sure I really understand yet how the "cell size" for sixels works.

The cell size is necessary to emulate the behavior of a real sixel terminal on which the character cell is a specific size, typically 10x20 pixels. Software intended to run on those terminals would often have predefined sixel images which would be expected to occupy a fixed area of the screen. For example, an image that is 100x200 pixels would be expected to occupy exactly 10x10 character cells.
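The arithmetic in the example above can be sketched as follows. `Size` and `cellsCovered` are hypothetical names, and rounding partial cells up is an assumption about how an image that is not an exact multiple of the cell size would be accounted for.

```cpp
#include <cassert>

struct Size
{
    int width;
    int height;
};

// Number of character cells needed to cover an image of the given pixel
// size, rounding any partially covered cell up to a whole cell.
Size cellsCovered(Size imagePixels, Size cellPixels)
{
    assert(cellPixels.width > 0 && cellPixels.height > 0);
    return { (imagePixels.width + cellPixels.width - 1) / cellPixels.width,
             (imagePixels.height + cellPixels.height - 1) / cellPixels.height };
}
```

With the 10x20 cell size mentioned above, a 100x200 image covers exactly 10x10 cells, while a 101x200 image would spill into an eleventh column.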

Linux terminals can't handle that, and as a result are mostly useless as terminal emulators. And to cope with terminals like that, modern sixel software is forced to query the terminal cell size and regenerate images on the fly to match whatever font the user happens to have selected at the time. No other software works this way, but for some reason people seem to think this is a good idea for terminals.

So my expectation was that at some point, someone would request we do the same thing in Windows Terminal, and we might need an option to match the behavior of Linux terminals (i.e. have the cell size exactly match the active font). Although once initialized at a given size, it should remain at that size for the duration of the session if you expect it to behave sensibly (it could potentially readjust after an RIS, though, or possibly even when clearing the screen).

the SixelParser does effectively process a rectangular image that may span multiple (potentially many) ROWs. Is that incorrect?

The sixel protocol itself doesn't necessarily produce rectangular output, but in most cases it probably would be a fixed width, and the way I've currently implemented it we typically reserve an area of the buffer that is as wide as the widest row for a given sequence (there are some edge cases where that isn't necessarily true though).

But the problem with thinking of sixel as a simple image format is that it's often not used that way. Imagine for example that you're creating a painting application that allows the user to scribble all over the screen with their mouse. The way you'd implement that with sixel is that almost every pixel plot would be a new sixel sequence.

For a standard 80x24 screen, that's going to use up 24 image slices at most. But if you're storing each sixel sequence as a separate image object, that's potentially 384000 images! And that's assuming your scribbling only touched each pixel once. In practice it'll likely be much worse than that. Any terminal with that architecture is either going to rapidly die, or drop content. It's just not practical.
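The 384000 figure follows from the 10x20 cell size discussed earlier in the thread: 80 columns by 10 pixels gives 800, 24 rows by 20 pixels gives 480, and 800 times 480 is 384000. A small sketch of that calculation, with `ScreenModel` and `pixelCount` as hypothetical names:

```cpp
#include <cassert>

struct ScreenModel
{
    int columns;
    int rows;
    int cellWidth;
    int cellHeight;
};

// Total number of addressable pixels, i.e. the worst-case number of
// single-pixel sixel sequences if each pixel were plotted separately.
long pixelCount(const ScreenModel& screen)
{
    assert(screen.columns > 0 && screen.rows > 0);
    assert(screen.cellWidth > 0 && screen.cellHeight > 0);
    const long width = static_cast<long>(screen.columns) * screen.cellWidth;
    const long height = static_cast<long>(screen.rows) * screen.cellHeight;
    return width * height;
}
```

With per-row slices the same screen needs at most one slice per row, so 24 in total, regardless of how many sequences produced the pixels.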

BTW if the renderer had access to all ImageSlices at once, I suppose it could stitch them together?

That's definitely an option. I actually tried that at one point in the GDI renderer, because I thought fewer GDI calls might be faster, but it didn't seem to make any difference for me. If anything I think the overhead of combining the slices might have made it slightly slower, so in the end I just stuck with the simpler solution. But you may still find it's worthwhile in the Atlas renderer.

More importantly, it'd be great if we could still add an "ID" to the ImageSlice.

Once this is merged (assuming it does get merged), you're welcome to make whatever changes you think are necessary. And if you still think you can make it work with larger images spanning rows, that's fine too. I just wanted to make sure you were aware of the potential challenges of that approach.

hackerb9 · 2024-06-18T04:22:36Z

No other software works this way, but for some reason people seem to think this is a good idea for terminals.

@j4james is, as usual, completely correct, but I believe he may be speaking a little tongue-in-cheek.

Click here for some boring history

Back in the 1980s, full-screen software programs had to know each sixel hardware terminal's specific character cell size and overall dimensions. I think even back then people knew it was a bad idea to do it that way; both the ReGIS and Tektronix vector graphics protocols, which were contemporaneous, use virtual grids relative to the screen size so images can appear the same on any terminal. Sixel was never given that ability and so software that wanted to align graphics and text nicely had to be aware of the terminal type.

Modern terminal software allows apps to work the same way but instead of a look-up table based on $TERM, the app can request the size directly from the terminal. If ever the font or terminal window size changes, the app receives a SIGWINCH interrupt and knows to redraw itself. Ugly, yes, but easy enough and it works.

Although once initialized at a given size, it should remain at that size for the duration of the session if you expect it to behave sensibly (it could potentially readjust after an RIS, though, or possibly even when clearing the screen).

Probably a dumb question, but couldn't you just clear everything when the size changes and send SIGWINCH?

j4james · 2024-06-18T11:16:47Z

couldn't you just clear everything when the size changes and send SIGWINCH?

SIGWINCH doesn't work everywhere.
A lot of software (arguably most software) won't respond to SIGWINCH anyway, so clearing everything just means you end up with no images at all. A common example would be when you cat an image. No shell is going to magically redraw that output when the terminal sends a SIGWINCH.

j4james · 2024-06-18T22:12:24Z

These are some examples I've been using for testing which demonstrate a few things you can do with sixel besides just blitting an image onto the screen.

Paging animation

If you've got an animation with only a few frames, you can load each frame into a separate page, and then cycle through those pages to play back the animation. The VT340 only supported two pages of sixel, which rather limited what you could achieve with this technique, but we support up to six pages, which is slightly more useful.

This example includes music to control the timing of the frames, so you can just cat the file: horse.vt

Horse.mp4

Color cycling (plasma)

By changing the palette colors of a sixel image after it has been output, you can give the impression that it is animating, when it is essentially just a static image. This example is intended to produce a plasma effect, reminiscent of the 90s demo scene.
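The idea behind color cycling can be sketched like this: the image stores palette indices, and only the palette changes between frames. This models the concept only, not how conhost stores or repaints pixels. `Color`, `cyclePalette`, and `resolve` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using Color = std::uint32_t; // hypothetical packed 0xRRGGBB value

// Rotates the palette by one entry: every index takes on the color that
// the next index had, and the last index takes the first color.
void cyclePalette(std::vector<Color>& palette)
{
    if (palette.size() > 1)
    {
        std::rotate(palette.begin(), palette.begin() + 1, palette.end());
    }
}

// Looks up each indexed pixel in the palette to get the displayed colors.
std::vector<Color> resolve(const std::vector<std::uint8_t>& indices,
                           const std::vector<Color>& palette)
{
    std::vector<Color> colors;
    colors.reserve(indices.size());
    for (const auto index : indices)
    {
        assert(index < palette.size());
        colors.push_back(palette[index]);
    }
    return colors;
}
```

Calling `cyclePalette` and resolving again yields different colors for the same, untouched index data, which is what creates the impression of motion.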

Run with python: plasma.py

Sixel.Plasma.mp4

Color cycling (waterfall)

This is using the same palette cycling technique as the plasma effect above, but with an image that is specially designed to produce the effect of water flowing over a waterfall. For more examples of this kind of thing by the same artist (Mark J. Ferrari), see http://www.effectgames.com/demos/canvascycle/

Waterfall.mp4

Scrolling margins

Although this wouldn't have been possible on the original VT340, a terminal which supports sixel as well as horizontal margins can use the margin scrolling area to manipulate the image content. This test case also demonstrates the use of rectangular area operations to copy parts of an image between pages.

Run with python: shuffle.py

Sixel.Shuffle.mp4

The observant among you may have noticed that some of these examples are running in Windows Terminal. This is because I was testing a merge of Leonard's passthrough branch and Atlas renderer. It's still a bit crashy at times, but it works!

hackerb9 · 2024-06-18T22:48:46Z

@j4james Okay, that was fantastic, particularly the Sixel Shuffle. Has any other terminal implemented rectangular copy capabilities yet?

I love that you are interpreting the sixel colors immediately to allow for palette shifts. Are you going to allow for a shared palette between images, similar to xterm*privateColorRegisters: False? That functionality actually was used by Sixel programs back in the day (see, WordPerfect).

DHowett · 2024-06-18T22:56:20Z

It would be IMPOSSIBLE for me to overstate how much I love this.

j4james · 2024-06-19T00:40:45Z

Has any other terminal implemented rectangular copy capabilities yet?

@hackerb9 In regards to the shuffle test specifically, MLTerm and RLogin are the only terminals I'm aware of that have all the required functionality, because that needs paging, rectangular copy, and horizontal margins. I'm not sure there are any other terminals that can do all of those things as well as sixel. But if we're just considering rectangular copy, then I believe Contour supports that too, and there may well be others.

Are you going to allow for a shared palette between images

Yes, the palette is always shared between images, at least in the sense that it's inherited. Palette changes in the second image won't affect the first though, so it's not a global palette like you have on a hardware terminal. But that's still something I'm hoping we might support one day.

DHowett

Reviewed 34/34 - I would say I understand roughly 85% of the Sixel parser, and I think the image slice design is fine--readable and straightforward.

I'm provisionally signing off given that this is a draft (I'll review incremental changes) despite the capture question.

src/buffer/out/textBuffer.cpp

src/terminal/adapter/SixelParser.cpp

j4james · 2024-06-22T19:56:30Z

I think this is as ready as it'll ever be. But let me know if you want me to put it behind a feature flag.

j4james · 2024-06-27T08:34:08Z

FYI, the audit build failure after the merge is not a problem with the actual audit - it looks like a network error caused the git checkout to fail on that run.

lhecker

(Still need to review the sixel parser.)

src/buffer/out/ImageSlice.cpp

src/host/_output.cpp

src/buffer/out/ImageSlice.cpp

src/renderer/gdi/paint.cpp

src/terminal/adapter/SixelParser.hpp

lhecker · 2024-06-28T01:13:14Z

src/terminal/adapter/SixelParser.hpp

+ bool _textCursorWasVisible;
+ til::CoordType _availablePixelWidth;
+ til::CoordType _availablePixelHeight;
+ til::CoordType _maxPixelAspectRatio;
+ til::CoordType _pixelAspectRatio;
+ til::CoordType _sixelHeight;
+ til::CoordType _segmentHeight;
+ til::CoordType _pendingTextScrollCount;
+ til::size _backgroundSize;
+ bool _backgroundFillRequired;


It seems that these members all have undefined values after construction. Should we initialize them here? (Some more below).

Looking through my comments again, I think this is the only one where I'd prefer waiting for your response. All the other ones were just my thoughts while I was reading the code.

Yeah, I wasn't really sure what to do here. All of the values that aren't initialized here are ones that are required to be reinitialized on every DefineImage call (the initialization is handled in the various _initXXX methods). So anything we initialize here would be meaningless - it's never going to be used.

At one point I just had everything set to 0, but that seemed confusing, because you start wondering what it means to have a zero aspect ratio, or a zero sixel height. And the answer is that it doesn't mean anything because those fields aren't supposed to be initialized there!

So in the end I figured it might make more sense to just leave them unset. But if you think it's better to clear all the fields, I'm happy to do that. I don't feel strongly about it either way.

I think it makes sense to initialize them. Assuming we do have a bug somewhere, perhaps only in the future if we modify this code, this would allow us to consistently reproduce the bug whereas without the initialization the member values would all be random and so they may result in random behavior which would make finding the root cause more difficult (or at least less consistent).
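The deterministic-failure argument above is what default member initializers provide. A minimal sketch, using a subset of the members quoted in the review comment; `ParserState` is a hypothetical struct name and `int` stands in for `til::CoordType`:

```cpp
// Hypothetical reduction of the parser's per-image state.
struct ParserState
{
    // These values are placeholders that are expected to be overwritten
    // before each image is processed. Initializing them means a missed
    // re-initialization shows up as the same wrong value on every run,
    // rather than as whatever happened to be in memory.
    bool textCursorWasVisible = false;
    int availablePixelWidth = 0;
    int availablePixelHeight = 0;
    int maxPixelAspectRatio = 0;
    int pixelAspectRatio = 0;
    int sixelHeight = 0;
    int segmentHeight = 0;
    int pendingTextScrollCount = 0;
    bool backgroundFillRequired = false;
};
```

A zero aspect ratio or sixel height still has no meaning here, as noted in the reply above; the zeros only make a bug reproducible.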

src/terminal/adapter/SixelParser.hpp

lhecker

Done reviewing it now. I don't have anything major to say, mostly nits.

One thing that'd be nice is if you had any sixel test files that contain potential edge cases that you'd be willing to contribute. For instance, for semi-transparency, etc., just so that we can test any changes we make to this code or the renderers in the future.

src/terminal/adapter/SixelParser.cpp

PhMajerus · 2024-06-28T20:09:59Z

One thing that'd be nice is if you had any sixel test files that contain potential edge cases that you'd be willing to contribute. For instance, for semi-transparency, etc., just so that we can test any changes we make to this code or the renderers in the future.

Here is a test file with transparency:
SixelTestFileWithTransparency.txt
(Not drawing all pixels of its canvas, successfully tested in mlterm and background properly shows through. xterm -ti vt340 on the other hand doesn't handle transparency)

And the same but with a hyperlink on the sixel image, which I think should be a supported scenario as well:
SixelTestFileWithTransparencyAndHyperlink.txt
(I couldn't find an existing terminal that supports both sixels and hyperlinks to test this one)

j4james · 2024-06-29T23:51:28Z

Here is a test file with transparency:
SixelTestFileWithTransparency.txt

That file doesn't actually have transparency! That's why it doesn't work in XTerm.

And the same but with a hyperlink on the sixel image, which I think should be a supported scenario as well:
SixelTestFileWithTransparencyAndHyperlink.txt

I'm glad you brought this up, because it is worth considering, but it's probably best discussed in a followup issue if/when we have sixel in Windows Terminal itself, because conhost doesn't support hyperlinks.

Although I'll say now that I personally don't think hyperlinks should be associated with the sixel output. If you want to attach a link to the image, you can just apply it to the text area behind (easily doable with a DECFRA call). And that gives you the flexibility to apply different links to different parts of an image, similar to an HTML image map. There are other complications as well, but we can discuss all that later.

j4james · 2024-06-30T00:01:46Z

One thing that'd be nice is if you had any sixel test files that contain potential edge cases that you'd be willing to contribute.

Most of the important compatibility tests are in hackerb9's vt340test repo. My contributions are in the j4james directory.

In an ideal world I would have liked to convert those into unit tests of some kind, but I have no idea how that could be made to work.

PhMajerus · 2024-06-30T00:14:57Z

@j4james My understanding is sixels work pretty much like a dot-matrix printer with 6 needles and several colors of ink ribbons.
So it's possible to have several colors by changing the color at any time, and if you need pixels of different colors in the same 6 pixels vertical band, you need to return to the beginning of the line and overstrike using each color.
This means it may even be possible to mix colors by overstriking the same pixel on some printers depending on the ink they use, and also means if a pixel isn't output in any of the colors, the page's paper is left as is.

The pixels shown in magenta in this picture are not set in any color in the sixel:

In mlterm, it properly shows as leaving those pixels as background color:

This technique of not outputting a pixel in any of the colors to keep the original background color works fine in mlterm, and is what I used in the test image. Are you saying the specs and real hardware do not handle that as transparency? and then, how do they achieve transparency if at all possible?

j4james · 2024-06-30T02:01:31Z

if a pixel isn't output in any of the colors, the page's paper is left as is.

@PhMajerus Your understanding is correct as far as it goes. What you're missing is that sixel terminals had a feature called "background select", which by default filled the background area before plotting any pixels. In that case, when you don't output a pixel, it's just going to end up with the background fill color, so won't be transparent.

On a real terminal, the background fill color comes from entry 0 in the color table, and by default that should be black unless it's redefined. But it looks to me like Mlterm is possibly filling with the active text background color, which gives you the impression that it's transparent, but it's really not, and it's technically incorrect.

Xterm, as far as I can recall, will fill the background with a shade of gray, because your image defines color number 0 as 78;78;78. However, that's also not correct. Color number 0 in the sixel content is not necessarily the same thing as color table entry 0, but Xterm doesn't handle color definitions correctly.

how do they achieve transparency if at all possible?

Parameter 2 of the DCS sequence specifies whether "background select" should be applied or not. If it's 0 or 2, the background is filled (the default behavior). If it's 1, the background won't be filled, and anywhere that you haven't plotted a pixel should be transparent (as you originally expected). So if you want your image to be transparent, it should start with something like \eP0;1q.
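To make the two mechanisms in this exchange concrete, here is a small sketch of how a sixel data character maps to six vertical pixels, and how the second DCS parameter decides whether unplotted pixels are filled. `decodeSixel` and `backgroundFillRequired` are hypothetical helpers, not the PR's code, and treating values other than 0, 1, and 2 as "fill" is an assumption that the discussion above does not cover.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Decodes one sixel data character ('?' through '~') into six pixels.
// Subtracting '?' (0x3F) yields a 6-bit pattern; the lowest bit is the top
// pixel of the column. A false entry means "not plotted in this color".
std::array<bool, 6> decodeSixel(char ch)
{
    assert(ch >= '?' && ch <= '~');
    const int bits = ch - '?';
    std::array<bool, 6> column{};
    for (std::size_t i = 0; i < column.size(); ++i)
    {
        column[i] = ((bits >> i) & 1) != 0;
    }
    return column;
}

// P2 of the DCS introducer: 0 or 2 fills the background first, 1 leaves
// unplotted pixels transparent. Other values are treated as "fill" here,
// which is an assumption made for this sketch.
bool backgroundFillRequired(int p2)
{
    return p2 != 1;
}
```

So with `\eP0;1q`, any pixel that no color pass ever sets to true keeps whatever was on screen beforehand.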

And the best way to test transparency is to output your image over some actual text content, and make sure you can see the underlying text showing through. Most Linux terminals are unlikely to support it though. Last I tested I think Xterm was the only one that did. But that was a while back, so it's possible some of the others have improved since then.

PhMajerus · 2024-07-02T16:29:48Z

@j4james

On a real terminal, the background fill color comes from entry 0 in the color table, and by default that should be black unless it's redefined. But it looks to me like Mlterm is possibly filling with the active text background color, which gives you the impression that it's transparent, but it's really not, and it's technically incorrect.

Xterm, as far as I can recall, will fill the background with a shade of gray, because your image defines color number 0 as 78;78;78. However, that's also not correct. Color number 0 in the sixel content is not necessarily the same thing as color table entry 0, but Xterm doesn't handle color definitions correctly.

I think you're right:
mlterm unset pixels of the sixel are white, but the terminal palette color#0 is black and the sixel color#0 is gray, so apparently it fills the sixel background with the current text background color, not the sixel's color#0, nor the terminal current palette's color#0.
xterm fills with gray, which is the sixel color#0.

Parameter 2 of the DCS sequence specifies whether "background select" should be applied or not. (...)

You are absolutely right, I overlooked the P2 parameter in the VT330/340 reference manual. Changing P2 to 1 properly makes the background transparent in all terminals, so it works in mlterm, xterm, and conhost canary.
Thanks for taking the time to look into it and explain it to me.

So here is the corrected file:
SixelTestFileWithTransparency.txt

I still noticed a difference between conhost canary and mlterm and xterm. In your implementation, the next line of text overwrites the last line of the sixel image, while both mlterm and xterm are doing a newline before printing the next echo text.

conhost:

mlterm:

xterm:

I suspect that is part of the problem you mentioned with sixels overlapping text.

This PR adds support for two query sequences that are used to determine the pixel size of a character cell:

* `CSI 16 t` reports the pixel size of a character cell directly.
* `CSI 14 t` reports the pixel size of the text area, and when divided by the character size of the text area, you can get the character cell size indirectly (this method predates the introduction of `CSI 16 t`).

These queries are used by Sixel applications that want to fit an image within specific text boundaries, so need to know how many cells would be covered by a particular pixel size, or vice versa.

Our implementation of Sixel uses a virtual cell size that is always 10x20 (in order to emulate the VT340 more accurately), so these queries shouldn't really be needed, but some applications will fail to work without them.

## References and Relevant Issues

Sixel support was added to conhost in PR #17421.

## Validation Steps Performed

I've added some unit tests to verify that these queries are producing the expected responses, and I've manually tested on [XtermDOOM] (which uses `CSI 16 t`), and the [Notcurses] library (which uses `CSI 14 t`).

[XtermDOOM]: https://gitlab.com/AutumnMeowMeow/xtermdoom
[Notcurses]: https://github.com/dankamongmen/notcurses

## PR Checklist

- [x] Tests added/passed
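As a rough illustration of what these two queries return, the sketch below builds the reports for the fixed 10x20 virtual cell size. The response layouts (`CSI 6 ; height ; width t` and `CSI 4 ; height ; width t`) follow xterm's documented window-manipulation reports rather than anything stated in this thread, and the function names are hypothetical.

```cpp
#include <string>

// Reply to CSI 16 t: character cell size in pixels, height before width.
std::string cellSizeReport(int cellWidth, int cellHeight)
{
    return "\x1b[6;" + std::to_string(cellHeight) + ";" + std::to_string(cellWidth) + "t";
}

// Reply to CSI 14 t: text area size in pixels, height before width.
std::string textAreaReport(int columns, int rows, int cellWidth, int cellHeight)
{
    return "\x1b[4;" + std::to_string(rows * cellHeight) + ";" +
           std::to_string(columns * cellWidth) + "t";
}
```

An application using `CSI 14 t` on an 80x24 screen would divide the reported 800x480 area by 80 and 24 to arrive back at the 10x20 cell.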

j4james · 2024-07-02T19:06:39Z

In your implementation, the next line of text overwrites the last line of the sixel image, while both mlterm and xterm are doing a newline before printing the next echo text.

@PhMajerus The final text cursor position is meant to end up on top of the image, otherwise you wouldn't be able to output images on the bottom row of the display. If you want it below the image you need to add a linefeed yourself.

Xterm and mlterm both use non-standard algorithms for positioning the cursor, but they don't actually match each other. The fact that they both ended up with the text below the image in your example is just a coincidence. In some cases the text will overlap the image and in other cases it won't. It depends on the image size and font size.
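
A minimal sketch of the "add a linefeed yourself" advice follows. The helper name `imageThenTextBelow` is hypothetical, and the use of a carriage return before the linefeed is an assumption so that the text starts at the left margin regardless of how the terminal or tty translates newlines.

```cpp
#include <string>

// Emits a complete Sixel sequence, then explicitly moves to the start of the
// next row before the text, so the text does not land on the image's last row.
// Hypothetical helper for illustration only.
std::string imageThenTextBelow(const std::string& sixelSequence, const std::string& text)
{
    std::string output = sixelSequence;
    output += "\r\n";   // explicit CR + LF supplied by the application
    output += text;
    return output;
}
```

An application that wants the text to overlap the last row of the image would simply omit the extra line break.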

PhMajerus · 2024-07-02T21:41:20Z

Thanks again @j4james, both for the time explaining how it is supposed to work and for implementing it. I can't wait for it to work in wt as well!
So here is a test file with both transparency and text alignment around a sixel image:
Pepelogoo.txt
It uses 9 colors and transparent background.

(Yeah, I know, this is conhost for now, but I'm sure we'll get it in Terminal soon as well)

j4james added 7 commits June 11, 2024 02:25

Hook up the Sixel DCS sequence to the dispatch classes.

6eee84a

Provide storage for images in the text buffer.

dda425a

Create a Sixel parsing class.

4b5d784

Output image content from the renderer.

8795ba3

Tie everything together.

4acc50d

Add terms to spellbot dictionary.

004a183

Update the device attributes test.

7916799

lhecker reviewed Jun 11, 2024

Correct campbell color table size.

518cc05

Avoid debug assertions when advancing iterators.

e52eef1

Make sure the pixel aspect ratio isn't too big.

673bef9

Prevent overflow when copying to resized buffer.

4853123

DHowett approved these changes Jun 20, 2024

j4james added 2 commits June 22, 2024 13:50

Erase image content when line rendition changes.

bf5c6fe

Correct the VT125 color handling.

baba91b

j4james marked this pull request as ready for review June 22, 2024 19:53

DHowett approved these changes Jun 22, 2024

j4james added 2 commits June 27, 2024 01:53

Merge branch 'main' into feature-sixel

da3f95d

Use the newly merged mode reporting mechanism.

adf507f

lhecker reviewed Jun 28, 2024

j4james added 2 commits June 29, 2024 16:23

PR feedback

48d203d

Merge branch 'main' into feature-sixel

6401446

lhecker approved these changes Jun 30, 2024

zadjii-msft added this pull request to the merge queue Jul 1, 2024

Merged via the queue into microsoft:main with commit 236c003 Jul 1, 2024
15 checks passed

lhecker mentioned this pull request Jul 1, 2024

Initialize all SixelParser members #17497

Merged

NickAcPT mentioned this pull request Jul 2, 2024

Windows Console now supports Sixel smasher164/arewesixelyet#33

Open

j4james mentioned this pull request Jul 2, 2024

Add support for querying the character cell size #17504

Merged

1 task

erwin mentioned this pull request Jul 2, 2024

[RFC] Sixel support kovidgoyal/kitty#2511

Closed

grynspan mentioned this pull request Jul 17, 2024

Add support for Sixel graphics apple/swift-testing#547

Open
