Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update pulldown_cmark dep to v0.10, and add pulldown_cmark_escape dep. #2432

Merged
merged 1 commit into from
Feb 18, 2024

Conversation

timonvo
Copy link
Contributor

@timonvo timonvo commented Feb 6, 2024

The pulldown_cmark escaping functionality is now shipped in a separate
pulldown_cmark_escape crate
(https://crates.io/crates/pulldown-cmark-escape), starting with v0.10.0.

The markdown.rs module has to be adapted to a few API changes in
pulldown_cmark, and we have to introduce explicit handling of alt
text to ensure it continues to be properly escaped.

There are also a few other behavior changes that are caught by the
tests, but these actually seem to be desired, so I've updated the insta
snapshot files for those tests to incorporate those changes.

Sanity check:

Code changes

(Delete or ignore this section for documentation changes)

  • Are you doing the PR on the next branch?

If the change is a new feature or adding to/changing an existing one:

  • Have you created/updated the relevant documentation page(s)?
    --> N/A

@timonvo timonvo mentioned this pull request Feb 6, 2024
@timonvo timonvo force-pushed the update-cmark branch 2 times, most recently from 5fa1a01 to fd8b375 Compare February 6, 2024 16:54
@@ -254,6 +255,7 @@ pub fn markdown_to_html(
let mut error = None;

let mut code_block: Option<CodeBlock> = None;
let mut alt_text: bool = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment for what this is? No need for the type as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, good call. I've actually renamed the variable to inside_attribute as well, to be a little more descriptive.

@@ -83,7 +81,7 @@ line 1 of code
line 2 of code
line 3 of code
</code></pre>
<p>Block code &quot;fences&quot;</p>
<p>Block code "fences"</p>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how come it's not escaping anymore here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is due to a behavior change in pulldown-cmark, rather than the changes I made in this pull request. I think it might be pulldown-cmark/pulldown-cmark#830, specifically. It introduces a new escape_html_body_text which should be used for escaping text nodes and which only escapes '<', '>', and '&', as opposed to escape_html which should be used for escaping HTML attributes and which escapes single and double quotes as well.

I believe that pulldown-cmark's push_html will now convert Event::Text nodes to HTML using escape_html_body_text (see src/html.rs in that pull request), hence why we see these quotes not being escaped anymore. This is also why we need to handle the alt attribute's text manually now, to ensure it gets attribute-escaped rather than body-escaped (which is now the default).

And FWIW, all other existing uses of escape_html that I could find in Zola were for escaping HTML attributes, so I think none of them need to be switched to escape_html_body_text.

</div>
<h1 id="classes" class="bold another">Classes</h1>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so it was inserting the header in the footnote?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, seems like it. pulldown-cmark 0.10 contains a bunch of footnotes-related fixes, which I'm guessing are responble for this change. Since the new version seems to be correct and the old version seemed wrong, I didn't investigate further to figure out exactly which commit fixed this.

The pulldown_cmark escaping functionality is now shipped in a separate
pulldown_cmark_escape crate
(https://crates.io/crates/pulldown-cmark-escape), starting with v0.10.0.

The markdown.rs module has to be adapted to a few API changes in
pulldown_cmark, and we have to introduce explicit handling of <img> alt
text to ensure it continues to be properly escaped.

There are also a few other behavior changes that are caught by the
tests, but these actually seem to be desired, so I've updated the insta
snapshot files for those tests to incorporate those changes.
Specifically, one footnote-parsing case seems to be handled better now,
and pulldown-cmark's `push_html` now doesn't escape quotes in text nodes
anymore (see pulldown-cmark/pulldown-cmark#836).
@timonvo
Copy link
Contributor Author

timonvo commented Feb 16, 2024

Thanks for the review, I think this is ready for another round. Note that I also added a note to the CHANGELOG.md file to draw attention to the minor behavior changes people may see after this change.

@Keats Keats merged commit 42fc576 into getzola:next Feb 18, 2024
5 checks passed
@Keats
Copy link
Collaborator

Keats commented Feb 18, 2024

Thanks!

veluca93 pushed a commit to veluca93/zola that referenced this pull request May 14, 2024
getzola#2432)

The pulldown_cmark escaping functionality is now shipped in a separate
pulldown_cmark_escape crate
(https://crates.io/crates/pulldown-cmark-escape), starting with v0.10.0.

The markdown.rs module has to be adapted to a few API changes in
pulldown_cmark, and we have to introduce explicit handling of <img> alt
text to ensure it continues to be properly escaped.

There are also a few other behavior changes that are caught by the
tests, but these actually seem to be desired, so I've updated the insta
snapshot files for those tests to incorporate those changes.
Specifically, one footnote-parsing case seems to be handled better now,
and pulldown-cmark's `push_html` now doesn't escape quotes in text nodes
anymore (see pulldown-cmark/pulldown-cmark#836).
LunarEclipse363 added a commit to LunarEclipse363/zola that referenced this pull request Jun 7, 2024
…cape dep. (getzola#2432)"

This reverts commit 42fc576.

Fixes the weird @@ZOLA_SC_PLACEHOLDER@@ issue.
Keats pushed a commit that referenced this pull request Jun 20, 2024
#2432)

The pulldown_cmark escaping functionality is now shipped in a separate
pulldown_cmark_escape crate
(https://crates.io/crates/pulldown-cmark-escape), starting with v0.10.0.

The markdown.rs module has to be adapted to a few API changes in
pulldown_cmark, and we have to introduce explicit handling of <img> alt
text to ensure it continues to be properly escaped.

There are also a few other behavior changes that are caught by the
tests, but these actually seem to be desired, so I've updated the insta
snapshot files for those tests to incorporate those changes.
Specifically, one footnote-parsing case seems to be handled better now,
and pulldown-cmark's `push_html` now doesn't escape quotes in text nodes
anymore (see pulldown-cmark/pulldown-cmark#836).
berdandy pushed a commit to berdandy/azola that referenced this pull request Sep 17, 2024
getzola#2432)

The pulldown_cmark escaping functionality is now shipped in a separate
pulldown_cmark_escape crate
(https://crates.io/crates/pulldown-cmark-escape), starting with v0.10.0.

The markdown.rs module has to be adapted to a few API changes in
pulldown_cmark, and we have to introduce explicit handling of <img> alt
text to ensure it continues to be properly escaped.

There are also a few other behavior changes that are caught by the
tests, but these actually seem to be desired, so I've updated the insta
snapshot files for those tests to incorporate those changes.
Specifically, one footnote-parsing case seems to be handled better now,
and pulldown-cmark's `push_html` now doesn't escape quotes in text nodes
anymore (see pulldown-cmark/pulldown-cmark#836).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants