How to implement a suitable word embedding for Chinese in a word spotting task #1

Open
guanyamu opened this issue Mar 20, 2024 · 1 comment

Comments

@guanyamu

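Hello, Lee! I am working on deep-learning-based word spotting for Chinese and one other script, and I hope to eventually implement both QBE (query-by-example) and QBS (query-by-string) retrieval. The labels for the other script are Latin transliterations written in English letters, whereas most existing Chinese datasets are labeled in Chinese characters. Techniques such as PHOC can only encode English letters and digits. I have also considered embedding techniques from NLP, but I do not know how to apply them to Chinese and English at the same time. Are you familiar with this problem? I would appreciate any pointers!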

@secsilm
Owner

secsilm commented Mar 30, 2024

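There are now many models that can handle multiple languages together, including Chinese and English, such as the classic BERT. You can use one of them to obtain embeddings and then fine-tune it on your task.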
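As a rough illustration of this suggestion, here is a minimal sketch that embeds a Chinese label and a Latin-transliterated label into the same vector space. It assumes the Hugging Face `transformers` and `torch` packages and the `bert-base-multilingual-cased` checkpoint; none of these are named in this thread, so treat them as one possible choice. The mean pooling over tokens and the `embed` and `cosine` helper names are likewise illustrative, and fine-tuning is not shown.

```python
import math

# Assumption: a multilingual BERT checkpoint; any multilingual encoder could be swapped in.
MODEL_NAME = "bert-base-multilingual-cased"


def embed(texts, model_name=MODEL_NAME):
    """Return one mean-pooled embedding (list of floats) per input string."""
    # Imported lazily so that cosine() below works without these packages installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    # The same tokenizer handles Chinese characters and Latin letters.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # shape: (batch, tokens, dim)

    # Average over real tokens only, ignoring padding positions.
    mask = batch["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
    return pooled.tolist()


def cosine(u, v):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

For example, `vecs = embed(["北京", "beijing"])` followed by `cosine(vecs[0], vecs[1])` gives a similarity score between the two labels. Without fine-tuning, such scores may not separate matching from non-matching word pairs well, which is why the fine-tuning step mentioned above matters.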
