Add support for DeepseekV2ForCausalLM #7519

Merged (30 commits) on May 28, 2024

Commits on May 16, 2024

  1. c8c353f

Commits on May 17, 2024

  1. b24c9ed

Commits on May 18, 2024

  1. 0398964
  2. Added five new DeepSeek-V2-specific parameters:

    - leading_dense_block_count => hparams.n_leading_dense_layer,
    - expert_feed_forward_length => hparams.n_expert_ff,
    - expert_shared_count => hparams.n_expert_shared,
    - attention.q_lora_rank => hparams.n_lora_q,
    - attention.kv_lora_rank => hparams.n_lora_kv
    sszymczy committed May 18, 2024
    b50c07c
  3. Added initial support for DeepSeek-V2-Lite model.

    Added missing scaling of kq_scale parameter.
    sszymczy committed May 18, 2024
    79f8417
  4. 6050941
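
The key-to-field mapping listed in commit b50c07c above can be sketched as a small lookup table. This is an illustrative Python sketch, not the loader code from this PR: the `HParams` class, the `load_hparams` helper, the `deepseek2.` key prefix, and the sample values are all assumptions made for the example. Only the five key suffixes and the five `hparams` field names come from the commit message.

```python
from dataclasses import dataclass

# Key suffix -> hparams field name, exactly as listed in the commit message.
KEY_TO_FIELD = {
    "leading_dense_block_count": "n_leading_dense_layer",
    "expert_feed_forward_length": "n_expert_ff",
    "expert_shared_count": "n_expert_shared",
    "attention.q_lora_rank": "n_lora_q",
    "attention.kv_lora_rank": "n_lora_kv",
}


@dataclass
class HParams:
    """Hypothetical stand-in for the model hyperparameter struct."""
    n_leading_dense_layer: int = 0
    n_expert_ff: int = 0
    n_expert_shared: int = 0
    n_lora_q: int = 0
    n_lora_kv: int = 0


def load_hparams(metadata, arch="deepseek2"):
    """Copy known keys from a metadata dict into HParams.

    The "<arch>.<suffix>" key layout is an assumption for this sketch.
    Keys missing from the metadata leave the default value (0) in place.
    """
    hparams = HParams()
    for suffix, field_name in KEY_TO_FIELD.items():
        key = f"{arch}.{suffix}"
        if key in metadata:
            setattr(hparams, field_name, int(metadata[key]))
    return hparams


# Placeholder values for illustration only, not taken from any real model file.
example_metadata = {
    "deepseek2.leading_dense_block_count": 1,
    "deepseek2.attention.kv_lora_rank": 512,
}
hp = load_hparams(example_metadata)
```

Treating missing keys as optional is a design choice of the sketch; whether the real loader requires all five keys is not stated in the commit message.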

Commits on May 19, 2024

  1. 7e4786b
  2. 71a7422
  3. Replaced hardcoded mscale value with rescaling attn_factor that results in the final mscale value equal to 1.0.

    sszymczy committed May 19, 2024
    f99df46
  4. Whitespace formatting fixes.

    sszymczy committed May 19, 2024
    3ae7235

Commits on May 20, 2024

  1. 68a5103
  2. Added YaRN log multiplier model header parameter corresponding to the multiplier of the ln(s) from the sqrt(1/t) = 0.1 ln(s) + 1 equation.

    sszymczy committed May 20, 2024
    7be56da
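
The equation quoted in commit 7be56da, sqrt(1/t) = 0.1 ln(s) + 1, can be sketched with the 0.1 coefficient exposed as a parameter, which is what the "log multiplier" in the commit message appears to correspond to. The Python below is an illustration, not the code from this PR: the function names and the handling of s <= 1 are assumptions. The second helper shows one possible reading of the May 19 commit (f99df46), where attn_factor is chosen as the reciprocal of the default factor so that the product is 1.0.

```python
import math


def yarn_mscale(s: float, log_multiplier: float = 0.1) -> float:
    """Evaluate sqrt(1/t) = log_multiplier * ln(s) + 1 for scale factor s.

    Assumption: for s <= 1 (no context extension) the factor is 1.0.
    """
    if s <= 1.0:
        return 1.0
    return log_multiplier * math.log(s) + 1.0


def cancelling_attn_factor(s: float) -> float:
    """Hypothetical helper: the attn_factor that cancels the default factor.

    Multiplying this by yarn_mscale(s) with the default 0.1 multiplier
    gives a final value of 1.0.
    """
    return 1.0 / yarn_mscale(s)
```

With `log_multiplier=0.0` the function returns 1.0 for every `s`, which is one way to see why a model-level multiplier can switch the extra scaling off entirely.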

Commits on May 21, 2024

  1. 842ff3f
  2. c033958

Commits on May 24, 2024

  1. a54685b
  2. bb9c361

Commits on May 26, 2024

  1. f3b5e7d

Commits on May 27, 2024

  1. abef8b2
  2. a654cd9
  3. llama : rename qk_rope_head_dim, qk_nope_head_dim variables to n_embd_head_qk_rope, n_embd_head_qk_nope

    sszymczy committed May 27, 2024
    5a3e6b6
  4. 20769c0
  5. fac1e80
  6. 56f7011
  7. 82cec8b
  8. 5cc7ec1
  9. d02130d
  10. bde971a

Commits on May 28, 2024

  1. 98ff6e1
  2. llama : replace ggml_new_tensor_3d + ggml_set_inplace + ggml_set_inplace with single ggml_concat in build_deepseek2()

    sszymczy committed May 28, 2024
    841cd47
  3. 3efb659
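
The refactor described in commit 841cd47 (allocate a tensor and fill it with two in-place writes, versus a single concatenation) can be illustrated by analogy with plain Python lists. This is not ggml code and does not use the ggml API: the function names, the row-wise layout, and the "nope"/"rope" part names (borrowed from the renamed variables in commit 5a3e6b6) are assumptions for the sketch. It only shows that both approaches produce the same result.

```python
def build_by_set(nope, rope):
    """Allocate the output first, then fill it with two in-place writes."""
    n_nope = len(nope[0])
    n_rope = len(rope[0])
    # Step 1: allocate a zero-filled buffer wide enough for both parts.
    out = [[0.0] * (n_nope + n_rope) for _ in nope]
    # Step 2: first in-place write (the non-rotary part of each row).
    for row, src in zip(out, nope):
        row[:n_nope] = src
    # Step 3: second in-place write (the rotary part of each row).
    for row, src in zip(out, rope):
        row[n_nope:] = src
    return out


def build_by_concat(nope, rope):
    """Produce the same rows with a single concatenation per row."""
    return [a + b for a, b in zip(nope, rope)]


# Two rows, each with a 3-element "nope" part and a 2-element "rope" part.
nope_part = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
rope_part = [[7.0, 8.0], [9.0, 10.0]]
same = build_by_set(nope_part, rope_part) == build_by_concat(nope_part, rope_part)
```

The concatenation form avoids the separate allocation step and the offset bookkeeping, which is presumably the motivation for the change.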