-
Notifications
You must be signed in to change notification settings - Fork 721
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial free threaded bindings #4421
base: main
Are you sure you want to change the base?
Conversation
noxfile.py
Outdated
@@ -32,6 +33,7 @@ | |||
PYO3_DOCS_TARGET = PYO3_TARGET / "doc" | |||
PY_VERSIONS = ("3.7", "3.8", "3.9", "3.10", "3.11", "3.12", "3.13") | |||
PYPY_VERSIONS = ("3.7", "3.8", "3.9", "3.10") | |||
IS_FREE_THREADED = bool(sysconfig.get_config_var("Py_GIL_DISABLED")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure this is the "correct" way to do this in nox
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this makes it conditional on the host python, but really what we care about is the target python, I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think as far as our use of nox
is concerned, these are the same. Is there an alternative?
It looks like the reason this was added is because we made the abi3
build on free-threaded a hard failure. Maybe we should relax that and then we don't need a change in the noxfile
for free-threading?
pyo3-build-config/src/impl_.rs
Outdated
"cargo:rustc-cfg=Py_3_12".to_owned(), | ||
"cargo:rustc-cfg=Py_3_13".to_owned(), | ||
"cargo:rustc-cfg=Py_GIL_DISABLED".to_owned(), | ||
"cargo:rustc-cfg=py_sys_config=\"Py_GIL_DISABLED\"".to_owned(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we define both configs in this case? Or only the top-level Py_GIL_DISABLED
one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think let's go for just the top-level Py_GIL_DISABLED
👍. (You'll need to special-case this when printing the BuildFlags
. as cfgs, I guess)
Please document in pyo3-build-config/lib.rs
(in the docstring for use_pyo3_cfgs
).
return _Py_IMMORTAL_REFCNT; | ||
} | ||
let shared = (AtomicIsize::from((*ob).ob_ref_shared)).load(Relaxed); | ||
local as Py_ssize_t + Py_ssize_t::from(shared >> _Py_REF_SHARED_SHIFT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only kind of hairy bit, since there's no function exposed in the API to get the refcount
This is my first time dealing with rust atomics. I have no idea if it's safe to create them inline and then do an atomic load like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AtomicU32::from((*ob).ob_ref_local)
creates a new atomic whose changes will not be reflected in the original ob_ref_local
. So this is definitely wrong. Atomic::from_ptr
would be more appropriate.
Anyway I would feel better if we just used the C functions to do refcounting, similar to #4311 (comment). I'm sure @davidhewitt will have thoughts on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately there is no function for this in the API, only a static inline function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In #3357 we discussed maybe we should just remove this API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we still might need an implementation though if we need to expose it in the FFI for users of PyO3, e.g. as @davidhewitt implies in #3357 (comment).
My understanding is that the plan for 3.14 is to either expose a function for this or to expose a function that provides similar functionality without needing to expose refcounts directly. Something like Py_IsOnlyReference
for refcount==1 and otherwise everything else becomes an implementation detail.
Unfortunately that doesn't help us for 3.13.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we probably do need to keep this API exposed (even after #4065 we use it for debugging and I think others might want to too). I guess that given it's a static inline function, let's just go down the Atomic::from_ptr
route and hope that CPython can provide a C function in 3.14.
pyo3-ffi/src/cpython/lock.rs
Outdated
#[repr(C)] | ||
#[derive(Debug, Copy, Clone)] | ||
pub struct PyMutex { | ||
pub _bits: u8, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be wrong to derive Copy and/or Clone, the documentation explicitly says it should not be moved.
The Debug impl + not using atomics means the debug impl would race.
The field is also not public in the C api, I'm inclined to keep it that way here as well.
#[repr(C)] | |
#[derive(Debug, Copy, Clone)] | |
pub struct PyMutex { | |
pub _bits: u8, | |
} | |
#[repr(C)] | |
#[derive(Default)] | |
pub struct PyMutex { | |
_bits: u8, | |
_pin: PhantomPinned, | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately that won't work because PyObject
needs those traits. At least if we want to say that ob_mutex
is a PyObject
as I have it in this version of the PR. Maybe better to write the PyObject
bindings with ob_mutex
as a u8
. It's an internal, private CPython field and we wouldn't be using it otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should probably remove Copy
& Clone
from PyObject
in that case, those traits sound to me like an accident waiting to happen anyway 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Default
implementation seems a bit risky to me because of the existence of std::mem::take
. I'm not sure _pin
will save us from that unless we don't give users a way to get at &mut PyMutex
, which e.g. they could just do with:
let mut m = PyMutex::default();
let moved = std::mem::take(&mut mutex);
This would also be a move-related bug:
let mut m = PyMutex::default();
PyMutex_Lock(&mut m);
let moved = m;
If we want to allow users to construct a PyMutex, I think we need to remove the Default
implementation and have something like
impl PyMutex {
pub fn new() -> Pin<Box<PyMutex>>
}
... but I also wonder whether that's overkill and we should just
- not bother with
_pin
, and - not give users a way to construct
PyMutex
for now (we will need it inside the crate forPyObject_HEAD_INIT
, but we can just makepub(crate) _bits
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure _pin will save us
The PhantomPinned
itself doesn't do anything. It lets us build safe abstractions on top of it.
If we want to allow users to construct a PyMutex
We can remove the Default
impl and document that users can create it with mem::zeroed
, which is what CPython's documentation does.
This would also be a move-related bug:
A safe api would probably do something like this (to follow the example of having a PyMutex on the stack), which is possible because of the PhantomPinned
.
let mutex: Pin<&mut PyMutex> = pin!(PyMutex {
_bits: 0,
_pin: PhantomPinned
});
lock(mutex.as_ref());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately that won't work because PyObject needs those traits
@ngoldbaum we can still have a Debug impl, it just needs a manual impl that correctly accesses the PyMutex value (or, ideally it should just be an AtomicU8).
I agree with removing the Copy and Clone impls.
noxfile.py
Outdated
@@ -32,6 +33,7 @@ | |||
PYO3_DOCS_TARGET = PYO3_TARGET / "doc" | |||
PY_VERSIONS = ("3.7", "3.8", "3.9", "3.10", "3.11", "3.12", "3.13") | |||
PYPY_VERSIONS = ("3.7", "3.8", "3.9", "3.10") | |||
IS_FREE_THREADED = bool(sysconfig.get_config_var("Py_GIL_DISABLED")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this makes it conditional on the host python, but really what we care about is the target python, I think?
pyo3-build-config/src/impl_.rs
Outdated
if build_flags.0.contains(&BuildFlag::Py_GIL_DISABLED) && abi3 { | ||
bail!("Cannot set Py_LIMITED_API and Py_GIL_DISABLED at the same time") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in some cases we simply warn and disable Py_LIMITED_API
(e.g., PyPy), do we have a principled reason to do one vs. the other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In 3.13 Py_GIL_DISABLED
implies Py_LIMITED_API
isn't set and I wanted a way to enforce that. Warning and disabling the limited API gives the same effect.
I see there's early-exits in some of these functions for pypy and graal. I think can do something similar for free-threaded.
return _Py_IMMORTAL_REFCNT; | ||
} | ||
let shared = (AtomicIsize::from((*ob).ob_ref_shared)).load(Relaxed); | ||
local as Py_ssize_t + Py_ssize_t::from(shared >> _Py_REF_SHARED_SHIFT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In #3357 we discussed maybe we should just remove this API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for helping out with this! Clearly some sticky cases, overall this is looking great already :)
return _Py_IMMORTAL_REFCNT; | ||
} | ||
let shared = (AtomicIsize::from((*ob).ob_ref_shared)).load(Relaxed); | ||
local as Py_ssize_t + Py_ssize_t::from(shared >> _Py_REF_SHARED_SHIFT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we probably do need to keep this API exposed (even after #4065 we use it for debugging and I think others might want to too). I guess that given it's a static inline function, let's just go down the Atomic::from_ptr
route and hope that CPython can provide a C function in 3.14.
pyo3-ffi/src/cpython/lock.rs
Outdated
#[repr(C)] | ||
#[derive(Debug, Copy, Clone)] | ||
pub struct PyMutex { | ||
pub _bits: u8, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Default
implementation seems a bit risky to me because of the existence of std::mem::take
. I'm not sure _pin
will save us from that unless we don't give users a way to get at &mut PyMutex
, which e.g. they could just do with:
let mut m = PyMutex::default();
let moved = std::mem::take(&mut mutex);
This would also be a move-related bug:
let mut m = PyMutex::default();
PyMutex_Lock(&mut m);
let moved = m;
If we want to allow users to construct a PyMutex, I think we need to remove the Default
implementation and have something like
impl PyMutex {
pub fn new() -> Pin<Box<PyMutex>>
}
... but I also wonder whether that's overkill and we should just
- not bother with
_pin
, and - not give users a way to construct
PyMutex
for now (we will need it inside the crate forPyObject_HEAD_INIT
, but we can just makepub(crate) _bits
).
@@ -507,8 +557,13 @@ extern "C" { | |||
|
|||
#[inline(always)] | |||
pub unsafe fn Py_INCREF(op: *mut PyObject) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially can we leave a TODO or fixme to use inline reference counting on freethreaded builds? (Here and in Py_DECREF
.)
If it helps us eke out a tiny bit of performance, it might be a nice thing to do eventually.
noxfile.py
Outdated
@@ -32,6 +33,7 @@ | |||
PYO3_DOCS_TARGET = PYO3_TARGET / "doc" | |||
PY_VERSIONS = ("3.7", "3.8", "3.9", "3.10", "3.11", "3.12", "3.13") | |||
PYPY_VERSIONS = ("3.7", "3.8", "3.9", "3.10") | |||
IS_FREE_THREADED = bool(sysconfig.get_config_var("Py_GIL_DISABLED")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think as far as our use of nox
is concerned, these are the same. Is there an alternative?
It looks like the reason this was added is because we made the abi3
build on free-threaded a hard failure. Maybe we should relax that and then we don't need a change in the noxfile
for free-threading?
pyo3-build-config/src/impl_.rs
Outdated
"cargo:rustc-cfg=Py_3_12".to_owned(), | ||
"cargo:rustc-cfg=Py_3_13".to_owned(), | ||
"cargo:rustc-cfg=Py_GIL_DISABLED".to_owned(), | ||
"cargo:rustc-cfg=py_sys_config=\"Py_GIL_DISABLED\"".to_owned(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think let's go for just the top-level Py_GIL_DISABLED
👍. (You'll need to special-case this when printing the BuildFlags
. as cfgs, I guess)
Please document in pyo3-build-config/lib.rs
(in the docstring for use_pyo3_cfgs
).
I've been giving it some thought and I think we really should use atomic integers (in PyMutex and the relevant refcount fields). The code to correctly access them is rather error prone. Suppose a user wants to read the vvalue of a PyObject's PyMutex field. They will have to write (and figure out that this is how they have to write it): let x: *mut PyObject= ...;
let value = AtomicU8::from_ptr(addr_of_mut!((*x).ob_mutex)).load(Ordering::Relaxed); rather than let value = *x.ob_mutex; and if they want to write to it they must also avoid creating mutable references, which is another poorly known footgun. |
@colesbury I suspect you might have an opinion about how to wrap the new PyObject fields in PyO3 and whether or not using rust atomics in the PyObject bindings makes sense. |
Yeah, I think using atomic integers for the fields is a good idea. I'm not sure if there's much value for exposing For more generic locking, you can use |
Note that using atomic values for the fields probably means we want to consider Alternatively on platforms where |
I think CPython requires atomics support. Are there use-cases for a version of pyo3 on hosts where CPython can't be compiled? |
Are there platforms where |
That's a good question; I wouldn't know the answer to that without looking through the rust source tree (and maybe others could define their own platforms). In practice it may not be a concern we have to worry about for these cases. We could punt on worrying about that until someone actually reports an issue for their platform, I guess 👀 |
Ah just seen this - in that case I think we can assume using atomics here is fine 👍 |
Co-authored-by: David Hewitt <[email protected]>
37a21bb
to
dbcb9d2
Compare
It turns out there's one more issue adding PyMutex to the FFI bindings reveals: |
It turns out, for whatever reason, only a debug free-threaded python build successfully runs the pyo3 tests, at least on my Mac's development environment using pyenv. See python/cpython#122918 (comment). Just sharing in case anyone else wants to try this out and sees mysterious crashes with tracebacks that include |
pyo3-ffi/src/object.rs
Outdated
#[cfg(Py_GIL_DISABLED)] | ||
ob_gc_bits: 0, | ||
#[cfg(Py_GIL_DISABLED)] | ||
ob_ref_local: AtomicU32::new(0), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be immortal (u32::MAX
) I think
#[repr(C)] | ||
pub struct PyMutex { | ||
_bits: AtomicU8, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at Rust's built-in mutex types, I think we'd want a constructor like:
impl PyMutex {
pub const fn new() -> PyMutex {
PyMutex { _bits: AtomicU8::new(0) }
}
}
pyo3-ffi/src/object.rs
Outdated
#[cfg(Py_GIL_DISABLED)] | ||
_padding: 0, | ||
#[cfg(Py_GIL_DISABLED)] | ||
ob_mutex: unsafe { mem::zeroed::<PyMutex>() }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be PyMutex::new()
if we define it as above.
That avoids the unsafe
block. And maybe allows PyObject_HEAD_INIT
to remain a constant? Are there other fields with interior mutability?
Hmm, maybe this is what @davidhewitt is referring to above in #4421 (comment) with the bit about |
126c97b
to
4f261c2
Compare
It looks like the lint triggers because of the atomic fields, so declaring the |
Can we not just silence the lint? The copy is intentional. |
4f261c2
to
162a65e
Compare
Refs #4265
nox -s ffi-check
pass on the free-threaded buildcontinue_on_error
set for the test run.