Fix sanitizer config - multiple rules #11133

Merged
merged 8 commits into from
Apr 29, 2020
Clarify sanitizer documentation
Signed-off-by: Alexander Scheel <[email protected]>
cipherboy committed Apr 20, 2020
commit 70941e97edf9f3b8f3ebe8aad6a7ea7feab4f111
38 changes: 34 additions & 4 deletions docs/content/doc/advanced/config-cheat-sheet.en-us.md
@@ -646,7 +646,7 @@ Two special environment variables are passed to the render command:
Gitea supports customizing the sanitization policy for rendered HTML. The example below will support KaTeX output from pandoc.

```ini
-[markup.sanitizer.1]
+[markup.sanitizer.TeX]
; Pandoc renders TeX segments as <span>s with the "math" class, optionally
; with "inline" or "display" classes depending on context.
ELEMENT = span
@@ -658,11 +658,41 @@ REGEXP = ^\s*((math(\s+|$)|inline(\s+|$)|display(\s+|$)))+
- `ALLOW_ATTR`: The attribute this policy allows. Must be non-empty.
- `REGEXP`: A regex to match the contents of the attribute against. Must be present but may be empty for unconditional whitelisting of this attribute.

-You must define `ELEMENT`, `ALLOW_ATTR`, and `REGEXP` in each numbered section.
+**Note**: The above section naming policy is new; previously the section was `[markup.sanitizer]` and keys could be redefined.
+Now, a unique identifier must appear in the section name (e.g., `[markup.sanitizer.TeX]`) in order to parse multiple rules.
+This was changed because the implementation with the ini parser used was flawed; the following configs were indistinguishable after parsing:

-To define multiple entries, increment the number in the section (e.g., `[markup.sanitizer.1]` and `[markup.sanitizer.2]`).
```ini
[markup.sanitizer]
ELEMENT = a
ALLOW_ATTR = target
REGEXP = $1
ELEMENT = a
ALLOW_ATTR = rel
REGEXP = $2
ELEMENT = img
ALLOW_ATTR = src
REGEXP = $3
```

and

```ini
[markup.sanitizer]
ELEMENT = a
ALLOW_ATTR = target
REGEXP = $1
ELEMENT = img
ALLOW_ATTR = rel
REGEXP = $2
ELEMENT = img
ALLOW_ATTR = src
REGEXP = $3
```

Because of limitations in the ini library, we are unable to automatically migrate configurations.

-**Note**: The above section numbering policy is new; previously the section was `[markup.sanitizer]` and keys could be redefined.
+We will still parse the first rule from a `[markup.sanitizer]` section if present, but multiple rules must be manually migrated.
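
As a rough sketch of a manual migration, the first of the two ambiguous configs above could be split into one named section per rule. The section suffixes (`a_target`, `a_rel`, `img_src`) are hypothetical identifiers chosen for illustration, and `$1`, `$2`, and `$3` remain placeholders for your own regular expressions, as in the examples above.

```ini
; Hypothetical migration of the first legacy example. The section suffixes are
; illustrative, and $1..$3 are placeholders for real regular expressions.
[markup.sanitizer.a_target]
ELEMENT = a
ALLOW_ATTR = target
REGEXP = $1

[markup.sanitizer.a_rel]
ELEMENT = a
ALLOW_ATTR = rel
REGEXP = $2

[markup.sanitizer.img_src]
ELEMENT = img
ALLOW_ATTR = src
REGEXP = $3
```

Because each rule now lives in its own section, the pairing of element, attribute, and regex no longer depends on the order in which keys appear.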

## Time (`time`)

7 changes: 4 additions & 3 deletions docs/content/doc/advanced/external-renderers.en-us.md
@@ -73,7 +73,7 @@ IS_INPUT_FILE = false
If your external markup relies on additional classes and attributes on the generated HTML elements, you might need to enable custom sanitizer policies. Gitea uses the [`bluemonday`](https://godoc.org/github.com/microcosm-cc/bluemonday) package as our HTML sanitizer. The example below will support [KaTeX](https://katex.org/) output from [`pandoc`](https://pandoc.org/).

```ini
-[markup.sanitizer.1]
+[markup.sanitizer.TeX]
; Pandoc renders TeX segments as <span>s with the "math" class, optionally
; with "inline" or "display" classes depending on context.
ELEMENT = span
@@ -86,9 +86,10 @@ FILE_EXTENSIONS = .md,.markdown
RENDER_COMMAND = pandoc -f markdown -t html --katex
```

-You must define `ELEMENT`, `ALLOW_ATTR`, and `REGEXP` in each numbered section.
+You must define `ELEMENT`, `ALLOW_ATTR`, and `REGEXP` in each section.

-To define multiple entries, increment the number in the section (e.g., `[markup.sanitizer.1]` and `[markup.sanitizer.2]`).
+To define multiple entries, define different section names (e.g., `[markup.sanitizer.1]` and `[markup.sanitizer.2]`).
+These can be numbers, identifying names, or anything else.
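
For instance, a minimal sketch with two rules might look like the following. The section suffixes and the `target` pattern are assumptions made for illustration, not values taken from Gitea's defaults. The empty `REGEXP` in the second rule relies on the cheat sheet's statement that the key must be present but may be empty to allow the attribute unconditionally.

```ini
; Illustrative only: the suffixes "1" and "img_alt" are arbitrary unique identifiers.
[markup.sanitizer.1]
ELEMENT = a
ALLOW_ATTR = target
; Assumed example pattern: permit only these two target values.
REGEXP = ^(_blank|_self)$

[markup.sanitizer.img_alt]
ELEMENT = img
ALLOW_ATTR = alt
; Present but empty: the attribute is allowed regardless of its value.
REGEXP =
```

Mixing a numeric suffix with a descriptive one is only meant to show that either form should work; descriptive names are probably easier to maintain once there are more than a few rules.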

Once your configuration changes have been made, restart Gitea to have changes take effect.
