-
Notifications
You must be signed in to change notification settings - Fork 769
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LlaMA3在英文或者中文上tokenizer是否需要加bos token? #140
Comments
你好,tokenizer 中 add_spec_tokens 的默认参数就是 False,我们显式地设置只是为了便于读者理解,实际和不设置值是一样的哈 |
你好,我刚刚测试了,不加特殊的token,llama3在tokenizer的时候,会在前面加上<begin_of_text>这个特殊的标记,如下图: |
你好,我有一些疑惑,看了一些其他的教程发现他们在tokenizer的时候是没有设置add_spec_tokens的,请问这个有什么说法吗?
The text was updated successfully, but these errors were encountered: