Improve ADD_REL_POS perf in SAM by doing it inplace #466

Merged: 7 commits, Aug 21, 2023

Commits on Aug 20, 2023

  1. Improve ADD_REL_POS perf in SAM by doing it inplace

    - Add unit tests for the ADD_REL_POS operation
    - I am not sure if this is a valid implementation, as we reuse the src0
      memory in order to avoid copying it
    - When running SAM with the "Example output" command, image, point and
      16 threads, this reduces the cumulative time of the ADD_REL_POS operation
      from 1000-1100 ms to 180-200 ms
    - There is further room for optimization in the access patterns used in
      the implementation of the operation
    YavorGIvanov committed Aug 20, 2023
    58d9081
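
    The in-place idea from the commit message above can be sketched in plain C.
    This is a simplified illustration, not ggml's actual kernel: the function
    name add_rel_pos_sketch, the flat [nq][kh][kw] layout, and the argument
    order are all assumptions made for the example. When dst aliases the input,
    the copy is skipped and the biases are accumulated directly into the
    existing buffer, which is the saving the commit describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for ADD_REL_POS (not the ggml code).
 * Assumed layouts, all contiguous row-major:
 *   attn : [nq][kh][kw]  attention scores per query position
 *   rel_h: [nq][kh]      bias along the key-height axis
 *   rel_w: [nq][kw]      bias along the key-width axis
 * Computes dst[q][i][j] = attn[q][i][j] + rel_h[q][i] + rel_w[q][j].
 * If dst == attn the update is done in place and no copy is made. */
static void add_rel_pos_sketch(float *dst, const float *attn,
                               const float *rel_h, const float *rel_w,
                               int nq, int kh, int kw) {
    assert(nq >= 0 && kh >= 0 && kw >= 0);

    /* Non-in-place path: pay for one full copy of the attention buffer. */
    if (dst != attn) {
        memcpy(dst, attn, sizeof(float) * (size_t)nq * (size_t)kh * (size_t)kw);
    }

    for (int q = 0; q < nq; ++q) {
        const float *w = rel_w + (size_t)q * (size_t)kw;
        for (int i = 0; i < kh; ++i) {
            const float h = rel_h[(size_t)q * (size_t)kh + (size_t)i];
            float *row = dst + ((size_t)q * (size_t)kh + (size_t)i) * (size_t)kw;
            /* Accumulate both biases into the (possibly shared) buffer. */
            for (int j = 0; j < kw; ++j) {
                row[j] += h + w[j];
            }
        }
    }
}
```

    Calling add_rel_pos_sketch(attn, attn, ...) overwrites the input, so it is
    only safe when nothing else still needs the original attn values. That is
    the same concern the commit message raises about reusing the src0 memory.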

Commits on Aug 21, 2023

  1. Add non-inplace version for the GGML_OP_ADD_REL_POS

    YavorGIvanov committed Aug 21, 2023
    9d6eaa8
  2. ca3e0fc
  3. Fix Mac printf format warnings

    YavorGIvanov committed Aug 21, 2023
    d65d74d
  4. 6e13e28
  5. d4b3b46
  6. 415dbf1