pytest cannot deal with utf-8 encoded repr of a custom object #678

pytestbot · 2015-02-10T06:15:55Z

Originally reported by: Roman Bolshakov (BitBucket: roolebo, GitHub: roolebo)

I have a test module that uses Beautiful Soup to parse some test data. I added an assertion to check that a variable (to which I assigned the parsing result) is an instance of the unicode type. Because of a bug in my code, a list of various objects was returned instead of the expected unicode string, so the assertion fired. On top of that, I got a totally unexpected UnicodeDecodeError from pytest.

Here's how it could be reproduced: https://gist.github.com/roolebo/ca816a26cdc0a8b17226

It turned out that Beautiful Soup returns a UTF-8 encoded byte string when repr() is called on a Tag object. The gist above can be reduced to a case without the Beautiful Soup dependency:

#!python
# coding=utf-8
def test_unicode_repr():
    class Foo(object):
        a = 1

        def __repr__(self):
            return '<b class="boldest">Б</b>'
    f = Foo()
    assert 0 == f.a

#!python

lines = ['assert 0 == 1', '{1 = <b class="boldest">\xd0</b>.a', '}']

    def _format_lines(lines):
        """Format the individual lines

        This will replace the '{', '}' and '~' characters of our mini
        formatting language with the proper 'where ...', 'and ...' and ' +
        ...' text, taking care of indentation along the way.

        Return a list of formatted lines.
        """
        result = lines[:1]
        stack = [0]
        stackcnt = [0]
        for line in lines[1:]:
            if line.startswith('{'):
                if stackcnt[-1]:
                    s = u('and   ')
                else:
                    s = u('where ')
                stack.append(len(result))
                stackcnt[-1] += 1
                stackcnt.append(0)
>               result.append(u(' +') + u('  ')*(len(stack)-1) + s + line[1:])
E               UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 23: ordinal not in range(128)

../venv/lib/python2.7/site-packages/_pytest/assertion/util.py:104: UnicodeDecodeError
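
The failure does not depend on pytest internals. On Python 2, adding a unicode prefix to a byte str makes the interpreter decode the bytes with the ASCII codec first, and byte 0xd0 is outside that range. Below is a minimal sketch that spells out that implicit decode explicitly, so it also runs on Python 3. The names `prefix`, `line` and `implicit_concat` are illustrative only, and the byte string assumes the full UTF-8 encoding of "Б" (0xd0 0x91):

```python
# Illustrative sketch: what Python 2 does implicitly for u'...' + '...'.
prefix = u' +' + u'where '

# Byte string as a UTF-8 encoded __repr__ would produce it (assumed bytes).
line = b'1 = <b class="boldest">\xd0\x91</b>.a'


def implicit_concat(text, raw):
    """Mimic Python 2's unicode + str: decode the raw bytes as ASCII first."""
    return text + raw.decode('ascii')


try:
    implicit_concat(prefix, line)
    error = None
except UnicodeDecodeError as exc:
    # Same error class as in the traceback above.
    error = exc
    print(exc)
```

The offending byte sits right after the 23 ASCII bytes of `1 = <b class="boldest">`, which is where the traceback reports it.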

Bitbucket: https://bitbucket.org/pytest-dev/pytest/issue/678


pytestbot · 2015-03-24T16:20:57Z

Original comment by Andrey Gusev (BitBucket: nex2hex, GitHub: nex2hex):

fix:

#!python

result.append(u(' +') + u('  ')*(len(stack)-1) + s + line[1:].decode('utf-8'))

pytestbot · 2015-03-24T16:24:55Z

Original comment by Anatoly Bubenkov (BitBucket: bubenkoff, GitHub: bubenkoff):

Please prepare a PR with a test.

Also, for the actual fix: the decode should not be strict, e.g. errors='ignore' or errors='replace'.
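
The difference between the strict decode in the proposed fix and the non-strict variant suggested here can be sketched as follows. This is not the code that was eventually merged; `to_text` is a hypothetical helper, and the sample byte strings are made up for illustration:

```python
def to_text(value, errors='replace'):
    """Return text for value, decoding byte strings as UTF-8.

    Hypothetical helper for illustration; not part of pytest.
    """
    if isinstance(value, bytes):
        return value.decode('utf-8', errors)
    return value


valid = b'<b class="boldest">\xd0\x91</b>'   # well-formed UTF-8
broken = b'<b class="boldest">\xd0</b>'      # truncated multi-byte sequence

decoded = to_text(valid)               # decodes cleanly
replaced = to_text(broken)             # bad byte becomes U+FFFD
ignored = to_text(broken, 'ignore')    # bad byte is dropped

try:
    to_text(broken, 'strict')
    strict_failed = False
except UnicodeDecodeError:
    # A strict decode would still crash the assertion report.
    strict_failed = True
```

With a non-strict error handler, a repr that is not valid UTF-8 degrades the explanation text instead of replacing the assertion failure with a UnicodeDecodeError.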

The-Compiler · 2015-07-08T08:22:30Z

Any update, @nex2hex? A PR would be much appreciated! If you have any trouble, let us know and we'll be happy to help.

RonnyPfannschmidt · 2015-07-25T10:12:35Z

this is fixed in #878

nicoddemus · 2015-09-26T14:48:55Z

@RonnyPfannschmidt is this fixed? Can we close this?

RonnyPfannschmidt · 2015-09-27T08:22:24Z

not yet, git destroyed my updated PR, it's on my agenda for today

RonnyPfannschmidt · 2015-09-27T08:56:46Z

the merge was done before

pytestbot added the "type: bug" (problem that needs to be addressed) label Jun 15, 2015

pfctdayelise added the unicode label Jul 26, 2015

alvinchow86 mentioned this issue Sep 13, 2015

UnicodeDecodeError from pytest if object representation contains non-ascii, utf8 encoded, characters #877

Closed

RonnyPfannschmidt modified the milestones: 2.8, 2.8.dev Sep 13, 2015

RonnyPfannschmidt closed this as completed Sep 27, 2015

biern mentioned this issue Feb 12, 2016

Formatting of utf-8 explanations fails #1379

Closed

dtomas mentioned this issue Sep 19, 2018

UnicodeDecodeError on failing assert if non-ASCII characters in __repr__ #3999

Closed

The-Compiler mentioned this issue Jan 17, 2020

[RFC] Use -rfE by default? (reportchars) #6454

Closed
