Set encoding to cp1252 for read/written strings
Prevents `UnicodeDecodeError: invalid continuation byte` when decoding non-ASCII characters. The error is caused by a bug in Note Block Studio where only the first byte of each character's UTF-16 code unit is written to the file, which produces an invalid UTF-8 byte sequence. This will be patched in a future version of the NBS format (see OpenNBS/OpenNoteBlockStudio#307).
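
The failure described above can be reproduced with a minimal sketch. It assumes the faulty write keeps only the low byte of each little-endian UTF-16 code unit; `truncate_to_low_bytes` is a hypothetical helper written for illustration and is not part of pynbs or Note Block Studio.

```python
# Hypothetical illustration: keep only the low byte of each UTF-16 code unit,
# mimicking the faulty write described in the commit message (an assumption).
def truncate_to_low_bytes(text: str) -> bytes:
    utf16 = text.encode("utf-16-le")  # two bytes per BMP character, low byte first
    return utf16[0::2]                # drop every high byte


raw = truncate_to_low_bytes("café")   # the 'é' becomes the lone byte 0xE9

# Decoding as UTF-8 (the default for bytes.decode()) fails on the lone 0xE9.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding as cp1252 never needs continuation bytes, so it succeeds here.
as_cp1252 = raw.decode("cp1252")
```

In this particular case the cp1252 result happens to match the original text, because characters in the U+00A0 to U+00FF range share their byte value with cp1252. Characters outside that range would still come back as unrelated "garbage" characters.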

`cp1252` was chosen to closely match the current (unintended) interpretation of those characters by Note Block Studio, given that it has always been a Windows-only application. Also, unlike ISO-8859-1/Latin-1, `cp1252` maps nearly all bytes in the `0x80-0xFF` range to printable characters (whereas `0x80-0x9F` are control characters in Latin-1). The exceptions are the five bytes that `cp1252` leaves unassigned (`0x81`, `0x8D`, `0x8F`, `0x90`, `0x9D`), which Python's codec still refuses to decode.
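
The difference between the two encodings can be checked directly with Python's standard codecs. The sketch below is an illustration only and is not code from pynbs. It confirms that Latin-1 maps `0x80-0x9F` to control characters, and lists the bytes that Python's `cp1252` codec cannot decode at all.

```python
import unicodedata

# Latin-1 maps every byte in 0x80-0x9F to a C1 control character (category "Cc").
latin1_all_control = all(
    unicodedata.category(bytes([b]).decode("latin-1")) == "Cc"
    for b in range(0x80, 0xA0)
)

# cp1252 assigns printable characters to most of those bytes, but Python's
# codec treats a handful as undefined and raises UnicodeDecodeError for them.
undefined_in_cp1252 = []
for b in range(0x80, 0x100):
    try:
        bytes([b]).decode("cp1252")
    except UnicodeDecodeError:
        undefined_in_cp1252.append(b)

euro = b"\x80".decode("cp1252")  # the euro sign in cp1252, a control char in Latin-1
```

A file containing one of the undefined bytes would therefore still raise an exception under the patched decoder. Passing `errors="replace"` to `decode` is one possible way to cover that case, though this commit does not do so.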

This is by no means an ideal solution; it is strictly a workaround to prevent that exception until the issue is properly fixed in Note Block Studio. Since the information in those characters is already lost the moment they are saved to the file, it is preferable for the read process to yield "garbage" characters than to raise an exception.

Fixes #3
Bentroen committed Mar 12, 2022
1 parent 005e4a6 commit 05efe72
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions pynbs.py
@@ -130,7 +130,7 @@ def read_numeric(self, fmt):

     def read_string(self):
         length = self.read_numeric(INT)
-        return self.buffer.read(length).decode()
+        return self.buffer.read(length).decode(encoding='cp1252')

     def jump(self):
         value = -1
@@ -218,7 +218,7 @@ def encode_numeric(self, fmt, value):

     def encode_string(self, value):
         self.encode_numeric(INT, len(value))
-        self.buffer.write(value.encode())
+        self.buffer.write(value.encode(encoding='cp1252'))

     def write_header(self, nbs_file, version):
         header = nbs_file.header
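
The two changed methods can be sketched as standalone functions to show the round trip. This is a minimal sketch, not the pynbs implementation: the function names and the `ENCODING` constant are hypothetical, and the length prefix is assumed to be a 32-bit little-endian integer (the real definition of `INT` is not shown in this diff).

```python
import struct
from io import BytesIO

INT = struct.Struct("<I")  # assumption: 32-bit little-endian length prefix
ENCODING = "cp1252"        # the encoding introduced by this commit


def write_string(buffer, value):
    """Write a length-prefixed string, encoding the text as cp1252."""
    data = value.encode(ENCODING)
    buffer.write(INT.pack(len(data)))  # prefix holds the byte count
    buffer.write(data)


def read_string(buffer):
    """Read a length-prefixed string, decoding the bytes as cp1252."""
    (length,) = INT.unpack(buffer.read(INT.size))
    return buffer.read(length).decode(ENCODING)


buf = BytesIO()
write_string(buf, "Für Elise")
buf.seek(0)
round_tripped = read_string(buf)
```

Because `cp1252` is a single-byte encoding, the character count and the byte count are always equal, so a prefix computed from `len(value)` (as in the diff) agrees with the number of bytes written. Characters that `cp1252` cannot represent raise `UnicodeEncodeError` on write.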
