Unicode normalization #86

Closed
agitator opened this issue Feb 12, 2020 · 2 comments

@agitator
Contributor

After further discussion of these issues
plone/Products.CMFPlone#2912
zopefoundation/Zope#670
we came to the conclusion that the best place to fix this is to normalize characters in the fields that handle Unicode.

Proposed solution:

  • normalize Unicode to composed (precomposed) characters at the field level (text fields)
  • provide a mechanism to suppress normalization
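
To illustrate what the first bullet is about, here is a minimal sketch using only Python's standard `unicodedata` module. It assumes that "composed" means the NFC normalization form, which the issue text does not state explicitly; the helper name `to_nfc` is made up for this example.

```python
import unicodedata

# The same visible character "ü" in two different code point sequences.
decomposed = "u\u0308"  # 'u' followed by COMBINING DIAERESIS (two code points)
composed = "\u00fc"     # LATIN SMALL LETTER U WITH DIAERESIS (one code point)


def to_nfc(value):
    """Return value in composed form (NFC is an assumption, see above)."""
    return unicodedata.normalize("NFC", value)


# Without normalization the two strings compare unequal,
# although they render identically.
print(decomposed == composed)          # False
print(to_nfc(decomposed) == composed)  # True
```

Comparisons and lookups on stored text only behave predictably if every value went through the same normalization before it was stored.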
@gogobd
Contributor

gogobd commented Feb 14, 2020

I looked into this at the Alpine City Plone Sprint and traced the problem to the way Unicode is handled by Python, Zope and Plone. I first looked at Zope's HTTPRequest and learned that Plone does not use its _decode function under all circumstances. I also learned that the class ZopeFieldStorage uses FieldStorage from Python's cgi module, which is, and should under all circumstances stay, completely agnostic of any encoding. It just handles binary data.

So handling incoming Unicode in the request(s) would turn out to be quite invasive, and I abandoned that approach.

The proposed solution of handling Unicode appropriately at the field level immediately looked better. Unicode handling in zope.schema is pretty straightforward, and I was able to isolate the problem to just one class: Text, which implements IFromUnicode. My proposed fix would also provide a flag to suppress Unicode normalization for the (in my humble opinion rare) occasions where it is unwanted.
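
A rough sketch of the idea, not the actual zope.schema code: the class `NormalizingText` below is a hypothetical stand-in for a text field, and the attribute name `unicode_normalization` and its default of `"NFC"` are assumptions made for illustration. Only `unicodedata.normalize` and the `fromUnicode` method name (from IFromUnicode) are taken as given.

```python
import unicodedata


class NormalizingText:
    """Hypothetical stand-in for a text field that normalizes its input.

    This is NOT zope.schema's Text class; it only mimics the fromUnicode
    entry point to show where normalization could happen.
    """

    def __init__(self, unicode_normalization="NFC"):
        # Assumed flag: a normalization form name, or None/"" to suppress
        # normalization entirely.
        self.unicode_normalization = unicode_normalization

    def fromUnicode(self, value):
        if not isinstance(value, str):
            raise TypeError("expected text, got %r" % type(value))
        if self.unicode_normalization:
            value = unicodedata.normalize(self.unicode_normalization, value)
        return value


field = NormalizingText()
raw_field = NormalizingText(unicode_normalization=None)

decomposed = "u\u0308"  # 'u' + COMBINING DIAERESIS
print(len(field.fromUnicode(decomposed)))      # 1, composed to U+00FC
print(len(raw_field.fromUnicode(decomposed)))  # 2, left untouched
```

Putting the flag on the field instance keeps the default safe while still letting a schema author opt out for a single field.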

I will create a pull request, see #87.

gogobd added a commit that referenced this issue Feb 16, 2020
Fixing #86 by normalizing unicode for IFromUnicode
gogobd added a commit that referenced this issue Feb 17, 2020
icemac pushed a commit that referenced this issue Feb 28, 2020
The change is not backwards compatible.
icemac pushed a commit that referenced this issue Feb 28, 2020
@jamadden
Member

Fixed in #87 and #90
