Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEA Controller for scipy_openblas #175

Merged
merged 10 commits into from
Apr 29, 2024

Conversation

jeremiedbb
Copy link
Collaborator

Numpy, Scipy, ... will likely be linked against a common build of OpenBLAS for the sci-py ecosystem (https://github.com/MacPython/openblas-libs). At some point it might even become a runtime dependency for these packages.

This special build of OpenBLAS comes with a different naming pattern than OpenBLAS, the lib name being libscipy_openblas..., andthe symbols being prefixed by scipy_. I made a new controller inherited from the OpenBLAS controller.

I also added a build in the CI to test against the dev versions of numpy and scipy. It would probably have allowed us to face the issue earlier.

@jeremiedbb
Copy link
Collaborator Author

Should I change internal_api = "openblas" to internal_api = "scipy_openblas" for the new controller ?

@ogrisel
Copy link
Contributor

ogrisel commented Apr 9, 2024

Should I change internal_api = "openblas" to internal_api = "scipy_openblas" for the new controller ?

I guess so since they do not share common symbols.

@jeremiedbb
Copy link
Collaborator Author

The new pip-dev CI job confirms that it works:
tests/test_threadpoolctl.py::test_threadpool_limits_by_prefix[1-libscipy_openblas] PASSED [ 24%]

@ogrisel
Copy link
Contributor

ogrisel commented Apr 9, 2024

About #175 (comment), I am not so sure. Maybe @rgommers and @mattip have a different opinion. I am not sure about the scope of scipy-openblas: numpy/numpy#26240 (comment)

@rgommers
Copy link

I commented on the scope in numpy/numpy#26240 (comment)

As for whether this should be a separate controller: it feels a bit wrong conceptually. It is a fairly regular build of OpenBLAS with a symbol prefix set. That's a build-time flag; there are many such build-time flags, and there may be other distributions that do something similar.

Actually I already see a next change coming: the ILP64 symbol suffix for OpenBLAS is going to become _64 instead of 64_ at some point. So instead of openblas_get_config64_ for the regular BLAS controller, it also needs to check for openblas_get_config_64_ (the trailing _ may or may not be present).

Probably the more correct way to model this is to consider the symbol naming scheme, which is:

<prefix><symbol_name><suffix><compiler-suffix>

where:

  • <prefix> can be nothing or scipy_,
  • <symbol_name> is the plain name as found in the Netlib BLAS/LAPACK docs
  • <suffix> can be nothing, _64 or 64_
  • <compiler-suffix> can be nothing or _

That way this is extensible and wouldn't require separate controllers for different builds of the same library.

@jeremiedbb
Copy link
Collaborator Author

jeremiedbb commented Apr 10, 2024

I changed to have a same controller for all versions of OpenBLAS. I try do determine first what are the specific prefix, suffix and compiler_suffix and then use them for all symbol lookups. However I face an issue with the above description of the combinatorics of these affixes.

Running nm -gD /path/to/openblas64 | grep "openblas" shows that libopenblas64 already comes with 2 different suffixed versions of some symbols, e.g.

0000000000350190 T openblas_set_num_threads_64_
0000000000350fe0 T openblas_set_num_threads64_

but the openblas_set_num_threads_64_ version segfaults.

I can blacklist some combinations but we should take care in future openblas builds to not use these combinations as valid symbol name because it will make things hard for threadpoolctl to support all openblas versions :) For instance, suffix=_64 + compiler_suffix=_ is to be avoided.

@jeremiedbb
Copy link
Collaborator Author

Actually it's the same with the 32bit int version of openblas:

0000000000384ad0 T openblas_set_num_threads
0000000000383c10 T openblas_set_num_threads_

and openblas_set_num_threads_ segfaults. I'll remove the compiler_suffix for now.

@jeremiedbb
Copy link
Collaborator Author

jeremiedbb commented Apr 10, 2024

Ok so looking at OpenBLAS source made it clear, who would have guessed 😄
openblas_set_num_threads expects an int
openblas_set_num_threads_ expects a pointer to an int

Copy link
Contributor

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of nitpicks but otherwise, LGTM. Thanks for the fix!

threadpoolctl.py Outdated
Comment on lines 171 to 172
_sym_prefixes = ("", "scipy_")
_sym_suffixes = ("", "64_", "_64")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rename to _symbol_prefixes / _symbol_suffixes to make the meaning of those variables more straightforward to understand.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure I did that to fit within 88 chars 😄

I renamed them

threadpoolctl.py Outdated
@@ -111,6 +112,7 @@ def __init__(self, *, filepath=None, prefix=None, parent=None):
self.prefix = prefix
self.filepath = filepath
self.dynlib = ctypes.CDLL(filepath, mode=_RTLD_NOLOAD)
self._sym_prefix, self._sym_suffix = self._find_affixes()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comment for the singular versions of those attributes.

threadpoolctl.py Outdated
Comment on lines 223 to 225
if (symbol := self._get_symbol("openblas_get_corename")) is not None:
symbol.restype = ctypes.c_char_p
return symbol().decode("utf-8")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I preferred naming the Python variables holding the functions objects by their symbol name as before:

Suggested change
if (symbol := self._get_symbol("openblas_get_corename")) is not None:
symbol.restype = ctypes.c_char_p
return symbol().decode("utf-8")
if (get_corename := self._get_symbol("openblas_get_corename")) is not None:
get_corename.restype = ctypes.c_char_p
return get_corename().decode("utf-8")

but this is a minor nitpick. Feel free to ignore. But if you accept, don't forget to rename all other occurrences of the symbol local variable in other methods.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@jeremiedbb jeremiedbb merged commit 8dd980b into joblib:master Apr 29, 2024
24 checks passed
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Apr 30, 2024
3.5.0 (2024-04-29)

- Added support for the Scientific Python version of OpenBLAS
  (https://github.com/MacPython/openblas-libs), which exposes symbols with different
  names than the ones of the original OpenBLAS library.
  joblib/threadpoolctl#175
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants