Add a new renderer class with XSS protections #60

Changaco · 2017-01-14T09:57:22Z

This Pull Request implements #59. Let me know if anything doesn't look right.

We need an escape function to implement XSS protection.

FSX

Thanks for your pull request!

FSX · 2017-01-14T20:43:23Z

misaka/api.py

+ :arg img_src_rewrite: the URL of an image proxy, necessary to rewrite the
+ ``src`` attributes of images
+
+ Both srings should include a ``{link}`` placeholder for the URL-encoded


srings should be strings
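
For readers following the docstring under review, here is a minimal sketch of how a template with a `{link}` placeholder might be filled in. It is not the code from this pull request: `fill_template` is a hypothetical helper, and the use of `urllib.parse.quote` with its default safe characters is an assumption, chosen because it reproduces the rewritten URLs in the test output posted further down this thread.

```python
from urllib.parse import quote


def fill_template(template, url):
    """Hypothetical helper: substitute the URL-encoded `url` for `{link}`."""
    # quote() leaves "/" unescaped by default and percent-encodes
    # characters such as ":", "?" and "&".
    return template.replace('{link}', quote(url))


link_rewrite = 'https://example.com/redirect/{link}'
print(fill_template(link_rewrite, 'https://b?x&y'))
```

Encoding the original URL keeps its own query string from being confused with that of the redirect or proxy URL.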

FSX · 2017-01-14T21:06:05Z

misaka/api.py

+
+ def __init__(self, flags=(), sanitization_mode='skip-html', nesting_level=0,
+ link_rewrite=None, img_src_rewrite=None):
+ if not isinstance(flags, tuple):


This breaks compatibility with HtmlRenderer, which accepts flags as a tuple of strings and ORed constants. It's not a big issue, but it won't be a drop-in for HtmlRenderer.

It would probably be better to have the flags be one type, but it'll be annoying for people if their code breaks.

I didn't add support for passing integers because I saw that it's deprecated.

Oops. My mistake.
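
The compatibility point can be illustrated with a small sketch of a helper that accepts either form of `flags`. Everything here is hypothetical: the flag names and bit values in `FLAG_BITS` are placeholders rather than misaka's real constants, and `normalize_flags` is not a function in the library.

```python
# Placeholder flag table; names and values are illustrative assumptions.
FLAG_BITS = {
    'hard-wrap': 1 << 0,
    'use-xhtml': 1 << 1,
}


def normalize_flags(flags):
    """Return a single integer bitmask from a sequence of names or an int."""
    if isinstance(flags, int):
        # Already ORed constants: pass through unchanged.
        return flags
    if isinstance(flags, (tuple, list)):
        mask = 0
        for name in flags:
            mask |= FLAG_BITS[name]  # raises KeyError on an unknown name
        return mask
    raise TypeError('flags must be an int or a sequence of strings')


print(normalize_flags(('hard-wrap', 'use-xhtml')))
print(normalize_flags(FLAG_BITS['hard-wrap'] | FLAG_BITS['use-xhtml']))
```

Accepting only a tuple, as the diff above does, is the stricter choice; a helper like this one trades that strictness for drop-in compatibility.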

FSX · 2017-01-14T21:28:02Z

tests/test_xss_protection.py

+
+ def test_html_escape(self):
+ supplied = 'Example <script>alert(1);</script>'
+ expected = '<p>%s</p>\n' % escape_html(supplied)


I prefer str.format. Is there a reason you used old-style formatting?

FSX · 2017-01-14T21:31:56Z

misaka/api.py

+ else:
+ return escape_html("[%s](%s)" % (content, raw_link))
+
+ def check_link(self, link, is_image_src=False):


is_image_src is not used.

Why not put self._allowed_url_re.match(link) directly in the render functions?

The separate check_link method is to allow customization of link filtering by subclasses.

You're right that is_image_src isn't used, that's a copy-paste error on my part.
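
To illustrate the subclassing argument, here is a minimal sketch of the pattern. It does not import misaka: `LinkFilter` and `SameSiteLinkFilter` are hypothetical stand-ins, the regular expression is an assumed default, and the standard library's `html.escape` stands in for `escape_html`.

```python
import re
from html import escape


class LinkFilter:
    """Stand-in for a renderer that routes every link through one hook."""

    _allowed_url_re = re.compile(r'^https?:', re.I)

    def check_link(self, link):
        # Single overridable decision point for link filtering.
        return bool(self._allowed_url_re.match(link))

    def render_link(self, content, raw_link):
        if self.check_link(raw_link):
            return '<a href="%s">%s</a>' % (escape(raw_link), escape(content))
        # Rejected links fall back to escaped Markdown source.
        return escape('[%s](%s)' % (content, raw_link))


class SameSiteLinkFilter(LinkFilter):
    """Hypothetical subclass that only keeps links to one host."""

    def check_link(self, link):
        return link.startswith('https://example.com/')


print(LinkFilter().render_link('a', 'https://other.example/x'))
print(SameSiteLinkFilter().render_link('a', 'https://other.example/x'))
```

With the check inlined in each render function, a subclass would have to override every one of them to change the policy.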

FSX · 2017-01-14T21:35:56Z

misaka/api.py

+ link = self.rewrite_link(raw_link)
+ maybe_title = ' title="%s"' % escape_html(title) if title else ''
+ link = escape_html(link)
+ return ('<a href="%s"%s>' + content + '</a>') % (link, maybe_title)


I prefer str.format. Is there a reason you used old-style formatting?

I use percent formatting by default, mostly because it's shorter and easier to type.

Ok. That's cool.
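
For comparison, the two formatting styles produce the same string for the line under review. The sketch below uses the standard library's `html.escape` as a stand-in for misaka's `escape_html` (an assumption; the two may differ in exactly how they encode some characters), and both function names are made up for the example.

```python
from html import escape


def open_tag_percent(link, title):
    # The conditional expression binds more loosely than %, so the whole
    # formatted string is replaced by '' when there is no title.
    maybe_title = ' title="%s"' % escape(title) if title else ''
    return '<a href="%s"%s>' % (escape(link), maybe_title)


def open_tag_format(link, title):
    maybe_title = ' title="{0}"'.format(escape(title)) if title else ''
    return '<a href="{0}"{1}>'.format(escape(link), maybe_title)


print(open_tag_percent('https://a', None))
print(open_tag_format('https://a', None))
```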

FSX · 2017-01-14T21:43:36Z

tests/test_xss_protection.py

+ for url in ('http:https://a', 'https://b'):
+ actual = render_rewrite("['foo](%s \"bar'\")" % url)
+ expected = '<p><a href="%s" title="bar&#39;">&#39;foo</a></p>\n'
+ ok(actual).diff(expected % rewrite_link(url))


I would put % rewrite_link(url) on the above line.

FSX · 2017-01-14T21:47:05Z

misaka/api.py

+ """
+ return bool(self._allowed_url_re.match(link))
+
+ def rewrite_link(self, link, is_image_src=False):


Maybe rewrite_url? As it's both used for links and images.

FSX · 2017-01-14T21:52:31Z

tests/test_xss_protection.py

+ ok(actual).diff(expected)
+
+ def test_autolink_rewriting(self):
+ for url in ('http:https://a', "https://b?x&y"):


Use single quotes. ;D

FSX · 2017-01-14T22:09:47Z

misaka/api.py

+ """
+ Filters links.
+ """
+ if self.check_link(raw_link):


When I enable the autolink extension email addresses are not rendered into links anymore.

My testing code:

from misaka import escape_html, Markdown, SaferHtmlRenderer, HtmlRenderer

render_normal = Markdown(HtmlRenderer(), extensions=('autolink',))
render = Markdown(SaferHtmlRenderer())
render_escape = Markdown(SaferHtmlRenderer(sanitization_mode='escape'))
renderer_rewrite = SaferHtmlRenderer(
    link_rewrite='https://example.com/redirect/{link}',
    img_src_rewrite='https://img_proxy/{link}',
)
render_rewrite = Markdown(renderer_rewrite, extensions=('autolink',),)
rewrite_link = renderer_rewrite.rewrite_link

print(render_normal('<https://b?x&y>'))
print(render_normal('Hello https://b?x&y Hola'))
print(render_normal('Hello https://example.com Hola'))
print(render_normal('Banana [email protected]'))
print(render_normal('Banana <[email protected]>'))

print('Safe:\n')

print(render_rewrite('<https://b?x&y>'))
print(render_rewrite('Hello https://b?x&y Hola'))
print(render_rewrite('Hello https://example.com Hola'))
print(render_rewrite('Banana [email protected]'))
print(render_rewrite('Banana <[email protected]>'))

Output:

<a href="https://b?x&y">https://b?x&y</a>
Hello https://b?x&y Hola
Hello <a href="https://example.com">https://example.com</a> Hola
Banana <a href="mailto:[email protected]">[email protected]</a>
Banana <a href="mailto:[email protected]">[email protected]</a>
Safe:
<a href="https://example.com/redirect/https%3A//b%3Fx%26y">https://b?x&y</a>
Hello https://b?x&y Hola
Hello <a href="https://example.com/redirect/https%3A//example.com">https://example.com</a> Hola
Banana <[email protected]>
Banana <[email protected]>

That's not exactly a bug: only HTTP and HTTPS links are allowed by default, so mailto: is rejected. Do you think mailto: should be allowed by default, or that the documentation should be modified to clarify that the new renderer rejects email addresses?

The latter is a good choice I think.
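
The behaviour described here can be reproduced with a plain regular expression. This is a sketch under assumptions: the two patterns below are illustrative and are not copied from misaka, and the email address is a placeholder.

```python
import re

# Assumed default: only http: and https: URLs pass the filter.
default_allowed = re.compile(r'^https?:', re.I)
# A project that wants email autolinks back could widen the pattern.
with_mailto = re.compile(r'^(?:https?:|mailto:)', re.I)

urls = (
    'https://example.com',
    'mailto:someone@example.com',
    'javascript:alert(1)',
)
for url in urls:
    print(url, bool(default_allowed.match(url)), bool(with_mailto.match(url)))
```

An allowlist of schemes rejects anything it does not recognise, which is why `mailto:` is dropped along with `javascript:` until it is added explicitly.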

Changaco · 2017-01-15T13:08:38Z

@FSX I've tried to address all your comments. Let me know if you'd like me to squash the commits to keep the Git history clean.

Changaco · 2017-01-15T13:24:25Z

The Python 2.6 build has errored on Travis.

@FSX Have you considered running a single build for all Python versions? This project already has Tox set up, so it'd be very easy to change from multiple builds to a single one.

FSX · 2017-01-15T13:46:31Z

Looks good.

I do like the separate builds of Travis. When something errors out I just restart the build.

FSX · 2017-01-15T13:48:20Z

Merging this now. Thanks!

I'll do a release when I get home tonight.

Changaco added 3 commits January 13, 2017 14:21

add a binding for hoedown_escape_html()

53c2b95

We need an escape function to implement XSS protection.

add a new HtmlRenderer subclass with XSS protections

1aa4e1f

add a built-in implementation of URL rewriting

8783ac4

FSX reviewed Jan 14, 2017


Changaco added 7 commits January 15, 2017 10:11

fix typo

7b96788

fix copy-paste error

6de2157

clarify link filtering

9833b70

link to stackexchange question

b830a0a

link → url

7a856d5

more consistent quoting style

e098829

improve code style consistency

811ff1e

FSX merged commit bc251d4 into FSX:master Jan 15, 2017

Changaco deleted the xss-protection branch January 15, 2017 13:49
