Mistune.util.escape_url is too aggressive #295

Merged (1 commit) on Jan 14, 2022

Conversation

dairiki
Contributor

@dairiki dairiki commented Jan 6, 2022

This adds ';', '!', and '$' to the set of characters that mistune.util.escape_url passes through unescaped. All three are in RFC 3986's reserved character list, which means that percent-encoding them may change the meaning of a URL.

Bugs Fixed

I noticed this when a data: URL was mangled by the escaping of a semicolon. The two URLs data:text/plain;base64,d293 and data:text/plain%3Bbase64,d293 are not equivalent: the first has content type text/plain, while the second has content type text/plain;base64.
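
To make the difference concrete, here is a minimal sketch using only the Python standard library. The two `safe` strings and the `parse_data_url` helper are assumptions made for illustration; they are not necessarily what mistune.util uses, and the parser ignores charset and other parameters.

```python
import base64
from urllib.parse import quote, unquote

# Hypothetical safe sets, for illustration only.
SAFE_WITHOUT = "/#:()*?=%@+,&"       # ';', '!' and '$' get percent-encoded
SAFE_WITH = SAFE_WITHOUT + ";!$"     # ';', '!' and '$' pass through

url = "data:text/plain;base64,d293"
mangled = quote(url, safe=SAFE_WITHOUT)
preserved = quote(url, safe=SAFE_WITH)


def parse_data_url(data_url):
    """Tiny data: URL parser (a sketch, not a full RFC 2397 implementation)."""
    header, _, payload = data_url[len("data:"):].partition(",")
    # Only a literal ';' separates the media type from its parameters.
    media_type, *params = header.split(";")
    is_base64 = "base64" in params
    body = base64.b64decode(payload) if is_base64 else unquote(payload).encode()
    return unquote(media_type), is_base64, body


print(mangled)                     # data:text/plain%3Bbase64,d293
print(parse_data_url(preserved))   # ('text/plain', True, b'wow')
print(parse_data_url(mangled))     # ('text/plain;base64', False, b'd293')
```

With the semicolon escaped, the base64 marker is no longer recognised as a parameter, so the payload is read as literal text instead of being decoded.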

Further Discussion

There are three other characters in the RFC 3986 reserved character list which may be worth adding to the list of unescaped octets: '[', ']', "'".

Escaping square brackets breaks URLs with numeric IPv6 addresses in the netloc (e.g. https://[::1]/).
However, I have not included them in this PR, because doing so breaks the "Backslash-escapes do not work inside autolinks" test in tests/fixtures/commonmark.txt, and I'm not sure how sacrosanct those tests are to you.
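
The IPv6 problem can be reproduced with the standard library alone. This is only a sketch: the `safe=":/"` argument is a hypothetical safe set chosen to leave brackets out, not the string used by mistune.

```python
from urllib.parse import quote, urlsplit

url = "https://[::1]/"

# Hypothetical safe set without '[' and ']', for illustration only.
escaped = quote(url, safe=":/")

host_ok = urlsplit(url).hostname          # brackets intact
host_escaped = urlsplit(escaped).hostname  # brackets percent-encoded

print(escaped)   # https://%5B::1%5D/
print(host_ok)   # ::1
print(host_escaped == host_ok)
```

Once the brackets are percent-encoded, the netloc no longer parses as an IPv6 literal, so the host that comes back is not `::1`.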

As for single quotes, I cannot come up with an example of a URL scheme that relies on them, but they are in the list of reserved characters, so it would probably be safer not to escape them.

bmwiedemann pushed a commit to bmwiedemann/openSUSE that referenced this pull request Jan 8, 2022
https://build.opensuse.org/request/show/944542
by user mcepl + dimstar_suse
- Add 295-overagreesive-escape_url.patch make
  mistune.util.escape_url less aggressive
  (gh#lepture/mistune#295).
@dairiki
Contributor Author

dairiki commented Jan 13, 2022

As far as I can tell, the CI test failures are spurious. They fail in the first "Set up job" step.
I think GitHub's test runners were having some kind of trouble. (I saw the same thing on a different project yesterday.)

@lepture
Owner

lepture commented Jan 14, 2022

Thanks. I'll make a new release.

@lepture lepture merged commit babb0cf into lepture:master Jan 14, 2022