How to improve the detection rate #500

Open
rickywu opened this issue Jun 21, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@rickywu

rickywu commented Jun 21, 2024

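Test sentence: 本合同文本供用人单位与建立劳动关系的劳动者签定劳动合同时使用。 ("This contract text is for use when an employer and a worker entering into an employment relationship sign a labor contract.")

签定 should be corrected to 签订, but the error was not detected.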

rickywu added the bug label Jun 21, 2024
@shibing624
Owner

Use a confusion set to correct it.
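
A minimal sketch of the confusion-set idea, assuming a hand-maintained mapping from known wrong forms to their corrections. The `CONFUSION_SET` table and the `correct_with_confusion_set` helper are hypothetical names used for illustration only and are not this project's API; where the project ships its own confusion-set support, that is the better choice.

```python
# Hypothetical confusion set: known wrong form -> correction.
# In practice this would be loaded from a file you maintain for your domain.
CONFUSION_SET = {
    "签定": "签订",
}


def correct_with_confusion_set(text, confusion):
    """Scan left to right and replace every known wrong form.

    Returns (corrected_text, edits), where each edit is
    (wrong, right, start_index_in_original_text).
    """
    # Try longer entries first so a short key cannot shadow a longer one.
    keys = sorted((k for k in confusion if k), key=len, reverse=True)
    out = []
    edits = []
    i = 0
    while i < len(text):
        for wrong in keys:
            if text.startswith(wrong, i):
                out.append(confusion[wrong])
                edits.append((wrong, confusion[wrong], i))
                i += len(wrong)
                break
        else:
            # No confusion entry starts here: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out), edits


sentence = "本合同文本供用人单位与建立劳动关系的劳动者签定劳动合同时使用。"
corrected, edits = correct_with_confusion_set(sentence, CONFUSION_SET)
print(corrected)
print(edits)
```

A lookup like this is deterministic and only fixes errors that are listed in the table, so it complements a model rather than replacing it.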

@TW-NLP

TW-NLP commented Jul 25, 2024

You can use a grammatical-error augmentation tool to make the model more robust. The code is here: https://github.com/TW-NLP/ChineseErrorCorrector/tree/main

@rickywu
Author

rickywu commented Jul 25, 2024

@TW-NLP Can't it check for multiple errors in one pass?

@TW-NLP

TW-NLP commented Jul 25, 2024

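@rickywu The model missed it because the training corpus does not cover this kind of error. You can use the tool to generate spelling-error data for augmentation, which should make the model more robust. The repo owner's macbert spelling corrector can already detect multiple errors in one pass.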
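
A rough sketch of what spelling-error augmentation can look like, assuming a small table of confusable phrases. The `CONFUSABLE` table and the `augment` helper are hypothetical and written for illustration; this is not the API of the tool linked above. It takes a clean sentence and injects known confusions to produce (noisy, clean) training pairs, including pairs with more than one error.

```python
import random

# Hypothetical table: correct phrase -> plausible misspellings.
CONFUSABLE = {
    "签订": ["签定"],
    "合同": ["和同"],
}


def augment(sentence, confusable, rng, max_errors=2):
    """Inject up to max_errors confusable substitutions.

    Returns (noisy, clean). Only the first occurrence of each chosen
    phrase is replaced, so the rest of the sentence stays intact.
    """
    candidates = [word for word in confusable if word in sentence]
    rng.shuffle(candidates)
    noisy = sentence
    for word in candidates[:max_errors]:
        noisy = noisy.replace(word, rng.choice(confusable[word]), 1)
    return noisy, sentence


rng = random.Random(0)  # fixed seed so runs are repeatable
clean = "本合同文本供用人单位与建立劳动关系的劳动者签订劳动合同时使用。"
noisy, target = augment(clean, CONFUSABLE, rng)
print(noisy)
print(target)
```

Pairs produced this way can be added to the training data so that errors such as 签定 are represented in the corpus.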

@rickywu
Author

rickywu commented Jul 25, 2024

@TW-NLP Do you mean I should use your fine-tuned model?

@TW-NLP

TW-NLP commented Jul 25, 2024

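@rickywu Keep using the repo owner's model. You can take the augmented data and fine-tune the owner's released model a second time to build a correction model for your own domain.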

3 participants