issues about YaRN #835
This was referenced May 27, 2024
Closed
I am using YaRN while implementing DeepSeek V2 models, and the current YaRN implementation does not look right to me.
@cebtenzzre Could you take a look at this? Correct me if I am wrong.
1. `theta_base *= freq_scale` is applied again later in `rope_yarn`:
   ggml/src/ggml.c, line 14077 in 0cbb7c0
   In the basic case (`ext_factor` is 0), the `theta` used for `cos`/`sin` ends up scaled by `freq_scale * freq_scale`. I think this is wrong and this line should be deleted.
2. The value passed to `int64_t i0` is wrong (the data type does not match, either):
   ggml/src/ggml.c, lines 14082 to 14088 in 0cbb7c0
   I think it should be `ic` here. (Confirmed when implementing DeepSeek V2 models)
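
To make the first point concrete, here is a minimal sketch of the double scaling. It is not the ggml code: `toy_rope_theta`, `theta_scaled_twice`, and `theta_scaled_once` are hypothetical names, and the helper only models the `ext_factor == 0` case, in which the helper itself multiplies the angle by `freq_scale`.

```c
/* Hypothetical stand-in for the inner helper (NOT the real rope_yarn).
 * Assumption: with ext_factor == 0 it interpolates by applying
 * freq_scale to the incoming angle. */
static float toy_rope_theta(float theta_extrap, float freq_scale) {
    return freq_scale * theta_extrap;
}

/* Caller that pre-scales theta_base and then calls the helper:
 * freq_scale is applied twice, as described in point 1. */
static float theta_scaled_twice(float theta_base, float freq_scale) {
    theta_base *= freq_scale;                      /* first scaling  */
    return toy_rope_theta(theta_base, freq_scale); /* second scaling */
}

/* Caller with the extra pre-scaling removed:
 * freq_scale is applied once, inside the helper only. */
static float theta_scaled_once(float theta_base, float freq_scale) {
    return toy_rope_theta(theta_base, freq_scale);
}
```

With `theta_base = 8` and `freq_scale = 0.25`, the single-scaled path yields an angle of 2, while the double-scaled path yields 0.5, i.e. an effective scale of `freq_scale * freq_scale`.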