Is a Chinese id supported? #172
Comments
Not at the moment. It's likely not that difficult to add; it would probably require creating another identifier that is used in place of the ID when writing the prompt to the LLM.
Thank you very much for your reply, but I don't think I understood it. The language of 'another identifier' would still be English, right? Do I need to do a Chinese-to-English mapping?
Like the code below? Just give each 'id' a name, and in the 'description' clearly describe the field to be extracted. "messages": {
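The Chinese-to-English mapping asked about here could be sketched roughly as follows. This is only an illustration under stated assumptions: the schema keeps ASCII ids, the Chinese meaning lives in each field's description, and the extracted keys are renamed afterwards. The names `ID_TO_CHINESE` and `to_chinese_keys` and the sample fields are hypothetical and not part of kor.

```python
# Hypothetical mapping from ASCII ids (used in the schema) to Chinese labels.
ID_TO_CHINESE = {
    "name": "姓名",
    "phone": "电话",
    "address": "地址",
}


def to_chinese_keys(record: dict) -> dict:
    """Rename ASCII keys to their Chinese labels; unknown keys are kept as-is."""
    return {ID_TO_CHINESE.get(key, key): value for key, value in record.items()}


# Made-up example of an extraction result keyed by ASCII ids.
extracted = {"name": "张三", "phone": "13800000000"}
print(to_chinese_keys(extracted))
```

With this approach the prompt only ever sees ASCII ids, so no change to the library is needed; the trade-off is one extra renaming step after extraction.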
In the meantime, you could rely on examples to improve the quality of extraction. It's unclear to what extent providing an ID in Chinese will affect the quality of the result, since the language models already understand multiple languages.
Thank you very much for your reply. I modified your source code, and it now supports Chinese. The modified code is as follows: ADD: in kor.nodes
If you're working with your own clone of the library, you could probably remove the VALID_IDENTIFIER check completely. As long as the code runs without the identifier check and generates the correct prompt, you should be OK.
import re

from kor import nodes

nodes.VALID_IDENTIFIER_PATTERN = re.compile(r".")  # monkey patch to support Chinese identifiers
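To see why swapping the pattern lets Chinese ids through, here is a small standalone sketch that does not import kor. It assumes the validation applies the pattern with `re.match`, and `STRICT_PATTERN` is only an illustrative stand-in for an ASCII-only identifier rule, not necessarily the library's exact pattern; `is_valid` is a hypothetical helper.

```python
import re

# Assumed stand-in for a strict, ASCII-only identifier rule (illustrative only).
STRICT_PATTERN = re.compile(r"^[a-z_][0-9a-z_]*$")
# The relaxed pattern used in the monkey patch above.
RELAXED_PATTERN = re.compile(r".")


def is_valid(pattern: re.Pattern, identifier: str) -> bool:
    """Return True if the pattern matches starting at the first character."""
    return pattern.match(identifier) is not None


for identifier in ["name", "姓名", ""]:
    print(
        repr(identifier),
        is_valid(STRICT_PATTERN, identifier),
        is_valid(RELAXED_PATTERN, identifier),
    )
```

Under these assumptions the relaxed pattern accepts any id that starts with a character other than a newline, so it still rejects an empty id but no longer restricts the alphabet.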
Make kor pydantic v1 and v2 compatible.
Additional changes:
* No more validation on node ids -- makes it easier to use node ids in other languages (e.g., [chinese](#172))
* Add testing to CI to test with both v1 and v2
Not fully implemented:
* Support for serialization via parse_obj
Hi, I want to extract key field information from Chinese text. Does the kor.Text id support Chinese?