KeyError being raised on certain characters #7

Closed
ykim opened this issue Jul 21, 2020 · 2 comments
Labels
bug Something isn't working

Comments


ykim commented Jul 21, 2020

While experimenting with cutlet using the full unidic dictionary, I've hit several KeyErrors:

% cutlet
《月》
Traceback (most recent call last):
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/bin/cutlet", line 14, in <module>
    print(katsu.romaji(line.strip()))
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/lib/python3.8/site-packages/cutlet/cutlet.py", line 122, in romaji
    roma = self.romaji_word(word)
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/lib/python3.8/site-packages/cutlet/cutlet.py", line 175, in romaji_word
    return self.table[word.surface]
KeyError: '《'
% cutlet
くま クマ 熊 ベアー 2【電子版特典付】
Traceback (most recent call last):
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/bin/cutlet", line 14, in <module>
    print(katsu.romaji(line.strip()))
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/lib/python3.8/site-packages/cutlet/cutlet.py", line 122, in romaji
    roma = self.romaji_word(word)
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/lib/python3.8/site-packages/cutlet/cutlet.py", line 192, in romaji_word
    return self.map_kana(kana)
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/lib/python3.8/site-packages/cutlet/cutlet.py", line 201, in map_kana
    out += self.get_single_mapping(pk, char, nk)
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/lib/python3.8/site-packages/cutlet/cutlet.py", line 234, in get_single_mapping
    return self.table[kk]
KeyError: '*'

Is this expected? I'm not sure whether cutlet is meant to handle full-width characters in sentences.
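
Both tracebacks end in a plain dictionary lookup (`self.table[...]`) on a key the table does not contain. A minimal sketch of that failure mode, assuming a hypothetical two-entry `table` and `lookup` helper rather than cutlet's real data or code:

```python
# Hypothetical stand-in for a kana-to-romaji table; not cutlet's actual data.
table = {"く": "ku", "ま": "ma"}

def lookup(char):
    # Direct indexing raises KeyError for any character missing from the table.
    return table[char]

print(lookup("く"))

try:
    lookup("《")
except KeyError as err:
    print("KeyError:", err)
```

Any character the table does not cover, punctuation included, fails the same way.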

polm added the bug label Jul 21, 2020
Owner

polm commented Jul 21, 2020

Thanks for the report. This is a bug.

Let me look at fixing it...

polm closed this as completed in dbabb1a Jul 21, 2020
Owner

polm commented Jul 21, 2020

Pushed a fix for this. Your examples brought up a couple of issues:

  • not handling punctuation like 《》【】
  • not handling zenkaku (full-width) English letters/numbers

Those should be fixed in master now, and I'll make a release with the fixes soon. Thanks for catching the bugs!
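
For readers hitting similar errors, here is a rough sketch of the two kinds of handling listed above. It is not the code from commit dbabb1a; the `TABLE` contents and the `to_romaji` helper are assumptions made for illustration. It uses NFKC normalization from Python's standard `unicodedata` module to fold zenkaku letters and digits to ASCII, and passes unmapped characters such as 《》【】 through unchanged instead of indexing the table directly.

```python
import unicodedata

# Hypothetical mini table; cutlet's real mapping is much larger.
TABLE = {"く": "ku", "ま": "ma"}

def to_romaji(text):
    # NFKC folds full-width ASCII variants (e.g. "２", "Ａ") to half-width forms.
    normalized = unicodedata.normalize("NFKC", text)
    # dict.get with the character itself as default avoids KeyError on unmapped input.
    return "".join(TABLE.get(ch, ch) for ch in normalized)

print(to_romaji("くま２【Ａ】"))  # → kuma2【A】
```

Passing unknown characters through is only one possible policy; dropping them or substituting a placeholder would work with the same structure.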

ykim mentioned this issue Aug 3, 2020