Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

try fix bad alignment in unicode (#1144) #1145

Merged
merged 5 commits into from
Dec 3, 2022

Conversation

SheldonNico
Copy link
Contributor

Add spaces in truncate_str doesn't satisify the function description. So I change the function pad_panel_line_to_width (maybe not right)

@th1000s
Copy link
Collaborator

th1000s commented Oct 20, 2022

Thank you, great find. Indeed, the length of Chinese characters is usually 2 (do you know any unicode points which map to 3 or more cells?), but here the assumption is <= 1.

Now however measure_text_width() is called called twice, the first time in truncate_str(). The fill-up logic would be better placed in there. It would also mean that this function now returns something with the exact display length if something had to be cut off, i.e. the rest is filled up. Because placing a space after the would look uneven, it is better to place them before or even use a blue symbol (·→, similar to the ) to make it clear that this is not the original input.

And somewhere in the wrapping logic the same miscounting happens which results in truncation instead of wrapping, probably when line_is_too_long() assumes that line.graphemes(true).count() equals the display width. Then spaces or · will also have to be placed before the WrapConfig::left_symbol ().

Save the diff below as 'badwrap' and run:
for i in {34..22}; do target/debug/delta --width=$i < badwrap | tail -n +4; done

@@ -1,5 +0,0 @@
-一二三四
-1一二三四
-12一二三四
-123一二三四
-123456789

@SheldonNico
Copy link
Contributor Author

do you know any unicode points which map to 3 or more cells?), but here the assumption is <= 1.

I didn't find any Chinese character map to 3 or more. But from the code in unicode-width, 3 means ambiguous width.

The fill-up logic would be better placed in there.

Yes, Add a symbol before is much better.

I will try in my next commit.

@dandavison
Copy link
Owner

Hi @SheldonNico, thanks very much for working on this. I'm sorry I let it slip and then forgot about it.

@SheldonNico
Copy link
Contributor Author

Add space(no custom symbol for simplicity) in truncate_str is easy.

But change how width calculated in wrapping need too much modifications, and some modifications may not be right since I didn't figure out all function logical. I try to fix broken tests, and it seems to work. Feel free to not accept this PR, if you have plan in the future.

image

@SheldonNico
Copy link
Contributor Author

do you know any unicode points which map to 3 or more cells?

Yes, original test find one: क्षि , and playground. It break my text editor. But in browser, it take 2 columns.

@th1000s
Copy link
Collaborator

th1000s commented Oct 24, 2022

Very nice, looks good to me. Also, you may add this commit which will correct the tests instead of adapting the code to work with the old tests.

Add space(no custom symbol for simplicity)

Yes, that would have to become customizable. I will do that later when making the wrapping aware of word boundaries.

@SheldonNico
Copy link
Contributor Author

SheldonNico commented Oct 25, 2022

Done.

As I notice, there are some updates. Do I need to do something in my branch?

@dandavison
Copy link
Owner

Thank you very much @SheldonNico and @th1000s. This will be released today.

@dandavison dandavison merged commit 54e1ee7 into dandavison:master Dec 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants