Add support for DeepseekV2ForCausalLM #7519
Commits on May 16, 2024
- c8c353f
Commits on May 17, 2024
- b24c9ed
Commits on May 18, 2024
- 0398964
- b50c07c Added five new DeepSeek-V2-specific parameters:
  - leading_dense_block_count => hparams.n_leading_dense_layer
  - expert_feed_forward_length => hparams.n_expert_ff
  - expert_shared_count => hparams.n_expert_shared
  - attention.q_lora_rank => hparams.n_lora_q
  - attention.kv_lora_rank => hparams.n_lora_kv
- 79f8417 Added initial support for DeepSeek-V2-Lite model. Added missing scaling of kq_scale parameter.
- 6050941
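The mapping listed for commit b50c07c can be sketched as a lookup table. This is a minimal illustration in Python, not the converter's or loader's actual code: the key suffixes and hparams field names come from the commit message, while the `deepseek2` key prefix, the `load_hparams` helper, and the sample values are assumptions made for the example.

```python
# Key suffixes and hparams field names are taken from the commit message;
# everything else (prefix, helper, values) is hypothetical.
DEEPSEEK2_PARAM_MAP = {
    "leading_dense_block_count": "n_leading_dense_layer",
    "expert_feed_forward_length": "n_expert_ff",
    "expert_shared_count": "n_expert_shared",
    "attention.q_lora_rank": "n_lora_q",
    "attention.kv_lora_rank": "n_lora_kv",
}


def load_hparams(metadata, arch="deepseek2"):
    """Copy known metadata keys of the form '<arch>.<suffix>' into an hparams dict."""
    hparams = {}
    for suffix, field in DEEPSEEK2_PARAM_MAP.items():
        key = f"{arch}.{suffix}"
        if key in metadata:
            hparams[field] = metadata[key]
    return hparams


# Illustrative values only, not read from a real model file.
example_metadata = {
    "deepseek2.leading_dense_block_count": 1,
    "deepseek2.expert_shared_count": 2,
    "deepseek2.attention.kv_lora_rank": 512,
}
example_hparams = load_hparams(example_metadata)
```

Keys missing from the metadata are skipped rather than defaulted, so a caller can tell which parameters a given file actually provided.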
Commits on May 19, 2024
- 7e4786b
- 71a7422
- f99df46 Replaced hardcoded mscale value with rescaling attn_factor that results in the final mscale value equal to 1.0.
- 3ae7235
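The idea behind commit f99df46 can be checked numerically. The sketch below assumes that the YaRN rope path multiplies its magnitude scale by `1 + 0.1 * ln(s)`, where `s` is the context scaling factor (the 0.1 coefficient is the one quoted in commit 7be56da). Under that assumption, pre-dividing `attn_factor` by the same term leaves a final mscale of 1.0. The function names are hypothetical and this is not the code from the PR.

```python
import math


def rope_mscale(attn_factor, scale):
    """Assumed YaRN magnitude scaling: attn_factor * (1 + 0.1 * ln(scale))."""
    return attn_factor * (1.0 + 0.1 * math.log(scale))


def rescaled_attn_factor(scale):
    """attn_factor chosen so that rope_mscale(...) comes out as 1.0."""
    return 1.0 / (1.0 + 0.1 * math.log(scale))


scale = 40.0  # illustrative context scaling factor
final_mscale = rope_mscale(rescaled_attn_factor(scale), scale)
```

Rescaling the input factor, rather than hardcoding the output, keeps the rope routine itself unchanged for other models.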
Commits on May 20, 2024
- 68a5103
- 7be56da Added YaRN log multiplier model header parameter corresponding to the multiplier of the ln(s) from the sqrt(1/t) = 0.1 ln(s) + 1 equation.
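The equation in commit 7be56da, sqrt(1/t) = 0.1 ln(s) + 1, generalizes once the 0.1 coefficient becomes a parameter. A minimal sketch, assuming `s` is the context scaling factor and that no correction is applied for `s <= 1` (a convention borrowed from common YaRN implementations, not stated in the commit). The names `yarn_mscale` and `log_mul` are hypothetical.

```python
import math


def yarn_mscale(scale, log_mul=0.1):
    """Return sqrt(1/t) = log_mul * ln(scale) + 1; 1.0 when scale <= 1."""
    if scale <= 1.0:
        return 1.0
    return log_mul * math.log(scale) + 1.0


default_mscale = yarn_mscale(40.0)         # 0.1 * ln(40) + 1
custom_mscale = yarn_mscale(40.0, 0.0707)  # illustrative multiplier, as if read from a model header
```

With the multiplier stored in the model header, the same formula serves models that were trained with a coefficient other than 0.1.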
Commits on May 21, 2024
- 842ff3f
- c033958
Commits on May 24, 2024
- a54685b
- bb9c361
Commits on May 26, 2024
- f3b5e7d
Commits on May 27, 2024
- abef8b2
- a654cd9
- 5a3e6b6 llama : rename qk_rope_head_dim, qk_nope_head_dim variables to n_embd_head_qk_rope, n_embd_head_qk_nope
- 20769c0
- fac1e80
- 56f7011
- 82cec8b llama : use attn_factor in mscale calculation to match the rope_yarn() implementation
- 5cc7ec1 llama : rename query_states, key_states, value_states to q_states, k_states, v_states
- d02130d
- bde971a
Commits on May 28, 2024
- 98ff6e1
- 841cd47 llama : replace ggml_new_tensor_3d + ggml_set_inplace + ggml_set_inplace with single ggml_concat in build_deepseek2()
- 3efb659
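Commit 841cd47 swaps an allocate-then-copy pattern for a single concatenation. The sketch below shows the equivalence on plain Python lists rather than ggml tensors. The per-head layout and the names `k_nope` and `k_pe` are assumptions for illustration, and none of this is ggml API.

```python
def concat_by_copy(k_nope, k_pe):
    """Allocate the full-size result, then copy each part into place (two copy steps)."""
    out = [[0.0] * (len(a) + len(b)) for a, b in zip(k_nope, k_pe)]
    for row, a in zip(out, k_nope):
        row[: len(a)] = a  # first copy: non-rotary part
    for row, a, b in zip(out, k_nope, k_pe):
        row[len(a):] = b   # second copy: rotary part
    return out


def concat_direct(k_nope, k_pe):
    """Build the same result with one concatenation per head."""
    return [a + b for a, b in zip(k_nope, k_pe)]


k_nope = [[1.0, 2.0], [3.0, 4.0]]  # two heads, non-rotary dims
k_pe = [[9.0], [8.0]]              # two heads, rotary dim
```

Both functions return the same rows; the direct form simply avoids the intermediate zero-filled buffer and the two separate writes into it.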