KeyError: 'ー' #9

Closed
ykim opened this issue Aug 3, 2020 · 1 comment

ykim commented Aug 3, 2020

I ran into another issue while parsing a book title, ティンクル☆くるせいだーすGoGo!(1). The error is below:

% cutlet
ティンクル☆くるせいだーすGoGo!(1)
Traceback (most recent call last):
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/bin/cutlet", line 14, in <module>
    print(katsu.romaji(line.strip()))
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/lib/python3.8/site-packages/cutlet/cutlet.py", line 127, in romaji
    roma = self.romaji_word(word)
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/lib/python3.8/site-packages/cutlet/cutlet.py", line 191, in romaji_word
    return self.map_kana(kana)
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/lib/python3.8/site-packages/cutlet/cutlet.py", line 231, in map_kana
    out += self.get_single_mapping(pk, char, nk)
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/lib/python3.8/site-packages/cutlet/cutlet.py", line 264, in get_single_mapping
    return self.table[kk]
KeyError: 'ー'

This may be related to the changes from #7 and/or #8. This string did not raise an error before either change. My guess is that the ☆ in the middle is causing some issues.

polm added a commit that referenced this issue Aug 4, 2020
Issue is that half-width katakana are not handled correctly.
polm (Owner) commented Aug 4, 2020

Thanks for the bug report.

The issue here is not the star, it's the thing that looks like a hyphen. That's actually a half-width long vowel stroke (長音符). I didn't have any handling for half-width katakana, so they were failing at lookup time.
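As an illustration, one common way to handle half-width katakana is to fold them into their full-width forms before any table lookup. The sketch below uses NFKC normalization from Python's standard `unicodedata` module. The helper name `normalize_halfwidth` is hypothetical, and this is not necessarily how the fix in cutlet is implemented.

```python
import unicodedata

# U+FF70 is the half-width long vowel stroke; U+30FC is the full-width one.
HALFWIDTH_LONG_VOWEL = "\uff70"
FULLWIDTH_LONG_VOWEL = "\u30fc"


def normalize_halfwidth(text: str) -> str:
    """Fold half-width katakana into full-width forms using NFKC.

    Hypothetical helper for illustration only. Note that NFKC also
    rewrites other compatibility characters, such as full-width ASCII.
    """
    return unicodedata.normalize("NFKC", text)


# Build the problematic part of the title with an explicit half-width stroke.
fragment = "くるせいだ" + HALFWIDTH_LONG_VOWEL + "す"
normalized = normalize_halfwidth(fragment)

print(HALFWIDTH_LONG_VOWEL in normalized)  # False
print(FULLWIDTH_LONG_VOWEL in normalized)  # True
```

Because NFKC changes more than katakana, a stricter approach would map only the half-width katakana block explicitly. Which trade-off suits a romanization library is a design choice.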

This worked in older versions of the code because unknown characters were passed through unchanged. That would have been bad with text like this: the output would look like normal ASCII but would still have to be encoded in URLs and similar contexts. Thanks for helping me catch it!
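The difference between the two behaviours can be sketched with a toy mapping table. Everything here is hypothetical (`TABLE`, `map_passthrough`, `map_strict`) and far smaller than cutlet's real table. It only shows why pass-through hides the problem while a strict lookup surfaces it as a `KeyError`.

```python
# Hypothetical two-entry table, for illustration only.
TABLE = {"だ": "da", "す": "su"}


def map_passthrough(text: str) -> str:
    """Older-style behaviour: unknown characters are copied to the output."""
    return "".join(TABLE.get(ch, ch) for ch in text)


def map_strict(text: str) -> str:
    """Strict behaviour: an unknown character raises KeyError."""
    return "".join(TABLE[ch] for ch in text)


sample = "だ\uff70す"  # contains a half-width long vowel stroke (U+FF70)

leaked = map_passthrough(sample)
# The result resembles "da-su" but still contains a non-ASCII character.
print(leaked.isascii())  # False

try:
    map_strict(sample)
except KeyError as err:
    print("unmapped character:", ascii(err.args[0]))
```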

I just released 0.1.9, which should fix this issue.

polm closed this as completed Aug 4, 2020