Fix for Memory Leak in cdist and Metric-Like Functions #61

Merged · 2 commits merged into ashvardanian:main-dev on Jan 22, 2024

Conversation

smthngslv (Contributor)


Problem Description

I identified a memory leak in the cdist function, as well as in other metric-like functions, in this codebase. The leak becomes evident when these functions are called repeatedly, for example within a while True loop:

while True:
    simsimd.cdist(embeddings, query)

This code snippet leads to a gradual increase in memory usage over time, indicating a memory leak.

Root Cause

The memory leak stems from improper handling of numpy arrays, particularly around garbage collection (GC). The issue arises when PyArray_NewFromDescr is called with non-null data: the data must remain alive for the entire lifetime of the array, but numpy and Python's garbage collector must also be able to release that data once the array is no longer in use. Because the buffer had no owner, it was never freed.
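
To make the failure mode concrete, a schematic sketch of the leaky pattern is shown below. It is illustrative only: the function name, dtype, and one-dimensional shape are assumptions, not the exact simsimd code. The array wraps a heap-allocated buffer, but nothing ever takes ownership of that buffer, so it leaks once the array is collected.

#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
#include <Python.h>
#include <numpy/arrayobject.h>
#include <stdlib.h>

/* Illustrative only: wraps a malloc'ed buffer into a 1-D float32 array. */
static PyObject *make_leaky_array(npy_intp count) {
    float *distances = malloc(count * sizeof(float));
    if (!distances)
        return PyErr_NoMemory();

    npy_intp dims[1] = {count};
    PyObject *output_array = PyArray_NewFromDescr(
        &PyArray_Type, PyArray_DescrFromType(NPY_FLOAT32),
        1, dims, NULL, distances, 0, NULL);
    if (!output_array) {
        free(distances);
        return NULL;
    }

    /* Nothing owns `distances`: when `output_array` is garbage-collected,
     * numpy releases only the array object itself, and the buffer leaks.
     * Repeated calls therefore grow the process memory indefinitely. */
    return output_array;
}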

Solution

According to the numpy C-API documentation (Numpy C-API Array From Scratch), the constant NPY_ARRAY_OWNDATA should not be used manually. The documentation states:

"The data area is owned by this array. Should never be set manually, instead create a PyObject wrapping the data and set the array’s base to that object. For an example, see the test in test_mem_policy."

Following this guidance, I referred to the corresponding reference implementation in the numpy repository: Numpy Test Memory Policy. By aligning our implementation with the approach demonstrated in that example, I was able to resolve the memory leak.


if (!output_array) {
    free(distances);
    goto cleanup;
}

PyObject *wrapper = PyCapsule_New(distances, "wrapper", (PyCapsule_Destructor)&free_capsule);
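
For context, here is a minimal sketch of how the rest of the capsule-as-base pattern can look, modeled on the numpy test_mem_policy example referenced above. Only the distances, wrapper, and free_capsule names come from the excerpt; the error-handling details are assumptions rather than the exact code of this PR.

/* Destructor invoked once the capsule is collected; the name must match
 * the one passed to PyCapsule_New. */
static void free_capsule(PyObject *capsule) {
    void *memory = PyCapsule_GetPointer(capsule, "wrapper");
    free(memory);
}

/* ...continuing after the PyCapsule_New call above... */
if (!wrapper) {
    Py_DECREF(output_array); /* the array does not own `distances` yet */
    free(distances);
    goto cleanup;
}

/* PyArray_SetBaseObject steals the reference to `wrapper`: the capsule now
 * lives exactly as long as the array, and its destructor frees `distances`
 * once both are gone. */
if (PyArray_SetBaseObject((PyArrayObject *)output_array, wrapper) < 0) {
    Py_DECREF(output_array); /* the reference to `wrapper` is consumed even on failure */
    goto cleanup;
}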
@ashvardanian (Owner) commented on Jan 12, 2024

Thank you for spotting the issue, @smthngslv! Any chance we can avoid the additional memory allocation for the capsule? There must be a cheaper way to forward it to Python.

PS: The cost of every memory allocation can be many times higher than the cost of actual distance computation.

@smthngslv (Contributor, Author)

I am not sure that there is a way to avoid creating a capsule, but we could probably add something like an out argument to the signature, so the call would look like:

query: numpy.ndarray = ...
embeddings: numpy.ndarray = ...

scores: numpy.ndarray = numpy.empty(
    shape=(embeddings.shape[0], query.shape[0]),
    dtype=numpy.dtype("float32"),
)

simsimd.cdist(embeddings, query, out=scores)

In this way, you can create an output array only once, and then just reuse it.

@ashvardanian (Owner)

That's a good idea! Can you implement it? We also must document that behavior, as it goes beyond SciPy functionality.

@smthngslv (Contributor, Author)

Okay, I will take a look at it later this week.

@ashvardanian (Owner)

I may have a similar issue in other repos, like USearch, so finding a way to avoid the extra allocation would be great, @smthngslv 🤗

@ashvardanian (Owner)

@smthngslv how did you catch the leak? which tool would you suggest for the Python binding? Maybe something we can add in CI?

PS: I'm gonna merge the current solution, and I'm looking forward to the one that adds the out parameter, in case you are still on it 🤗

ashvardanian merged commit 0469ec2 into ashvardanian:main-dev on Jan 22, 2024 · 13 checks passed
ashvardanian pushed a commit that referenced this pull request Jan 22, 2024
## [3.6.6](v3.6.5...v3.6.6) (2024-01-22)

### Fix

* fallback for Vercel-based apps (#66) ([dc6de11](dc6de11)), closes [#66](#66)
* Memory leak in `cdist` (#61) ([0469ec2](0469ec2)), closes [#61](#61)

### Make

* Py version is inferred from macros ([234a282](234a282))

@ashvardanian (Owner)

🎉 This PR is included in version 3.6.6 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

@smthngslv (Contributor, Author)

> @smthngslv how did you catch the leak? which tool would you suggest for the Python binding? Maybe something we can add in CI?

Just manual testing: commenting out some lines and testing again. About the build process: I use scikit-build and pybind11 with CMake (adapted from https://github.com/pybind/scikit_build_example). I kinda love pybind11, since you do not need to deal with the low-level Python C API:

# Find the module development requirements (requires FindPython from 3.17 or scikit-build-core's built-in backport).
find_package(Python REQUIRED COMPONENTS Interpreter Development.Module)
find_package(pybind11 CONFIG REQUIRED)
find_package(OpenMP)

# Add a library using FindPython's tooling (pybind11 also provides a helper like this).
python_add_library(_core MODULE src/main.cpp WITH_SOABI)
target_link_libraries(_core PRIVATE pybind11::headers)
if(OpenMP_CXX_FOUND)
    target_link_libraries(_core PUBLIC OpenMP::OpenMP_CXX)
endif()

> looking forward to the one that adds the out parameter, in case you are still on it 🤗

Yep, I am still on it; I'll get back to it as soon as I manage to find some free time :)
