-
Notifications
You must be signed in to change notification settings - Fork 738
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add second lifetime to FromPyObject
#4390
base: main
Are you sure you want to change the base?
Conversation
I propose that we add a |
Thanks! I think that's a great idea. It's essentially the same thing, but it's much easier to grasp by giving it a distinct name, plus we can better document it in the API docs. I will adapt this to make use of that. |
c960124
to
a1c0bed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pushing on with this! Overall it looks good to me, and I only have a few docs suggestions.
Before we definitely commit to this, I want to take a brief moment just to reflect on the choice of Borrowed
over &Bound
. There's at least two good technical reasons, which is the lifetime constraint and performance by avoiding pointer-to-pointer. The codspeed branch does show a few slight perf improvements in the ~4% range on tuple extraction and we have further possibilities like PyRef
containing Borrowed
once we do this.
That said, we deliberately kept the Borrowed
smart pointer out of the way as much as possible in the original 0.21 design to try to keep new concepts down. Is it fine to increase its use? I think the upsides justify this, though we might want to increase the quality of documentation and examples on Borrowed
(which I'd left quite minimal so far).
@@ -616,8 +616,9 @@ pub fn build_derive_from_pyobject(tokens: &DeriveInput) -> Result<TokenStream> { | |||
let ident = &tokens.ident; | |||
Ok(quote!( | |||
#[automatically_derived] | |||
impl #trait_generics #pyo3_path::FromPyObject<#lt_param> for #ident #generics #where_clause { | |||
fn extract_bound(obj: &#pyo3_path::Bound<#lt_param, #pyo3_path::PyAny>) -> #pyo3_path::PyResult<Self> { | |||
impl #trait_generics #pyo3_path::FromPyObject<'_, #lt_param> for #ident #generics #where_clause { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be an interesting question for later if we ever support the 'a
lifetime in our #[derive(FromPyObject)]
. Definitely not for now 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, something to experiment with in the future, but out of scope for now for sure 😅
src/conversion.rs
Outdated
/// the normal `FromPyObject` trait. This trait has a blanket implementation | ||
/// for `T: FromPyObject`. | ||
/// Note: depending on the implementation, the lifetime of the extracted result may | ||
/// depend on the lifetime of the `obj` or the `prepared` variable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use a
rather than prepared
here and below? Also maybe written as 'py
or 'a
, as we do elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I was a bit lazy and just uncommented the docs that we temporarily removed during gil-ref migration.
How about
/// Note: depending on the implementation, the extracted result may
/// depend on the Python lifetime `'py` or the input lifetime `'a` of `obj`.
Also the example below that seem a bit confusing, what do you think about using Cow<str>
as an example, which may or may not borrow from the input ('a
) depending on the Python runtime type and maybe Bound<'py, PyString>
as an example that depends on the Python lifetime? Maybe we also add an example of a collection type (maybe Vec
) to introduce FromPyObjectOwned
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all of that sounds great, yes please 👍
impl<'py, T> FromPyObject<'_, 'py> for PyRef<'py, T> | ||
where | ||
T: PyClass, | ||
{ | ||
fn extract_bound(obj: &Bound<'py, PyAny>) -> PyResult<Self> { | ||
fn extract(obj: Borrowed<'_, 'py, PyAny>) -> PyResult<Self> { | ||
obj.downcast::<T>()?.try_borrow().map_err(Into::into) | ||
} | ||
} | ||
|
||
impl<'py, T> FromPyObject<'py> for PyRefMut<'py, T> | ||
impl<'py, T> FromPyObject<'_, 'py> for PyRefMut<'py, T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And then the follow-up future PR will be what to do about PyRef
. Should we split into PyRef
/ PyRefBorrowed
, or keep the one type PyRef<'a, 'py, T>
and accept some breakage? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I think I we want to provide both variants, I would prefer PyRef<'a, 'py, T>
and PyRefOwned<'py, T>
since that would be more consistent with FromPyObjectOwned
.
In any case we would need to find a good way to differentiate their constructors. Neither obj.borrow_borrowed()
nor obj.borrow_owned()
feel to appealing to me 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, it's tough to find names. I'm somewhat tempted to wait until after 0.23 and to first see how all these trait changes we've already made play out before we change PyRef
.
Thanks for the review! You are right, the choice of
I agree, if we put |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO the recently added FromPyObject
documentation is too "implementation detail"-y.
I'd prefer if this documentation is kept short and concise, one paragraph max.
For example:
Depending on the python version and implementation some `FromPyObject`
implementations may produce Rust types that point into the Python type.
For example `PyString` and `PyBytes` may convert into `str` and `[u8]` references
that borrow from the original Python object.
Types that do not borrow from the input should use `FromPyObjectOwned` instead.
src/conversion.rs
Outdated
/// borrow from the input lifetime `'a`. The behavior depends on the runtime | ||
/// type of the Python object. For a Python byte string, the existing string | ||
/// data can be borrowed (lifetime: `'a`) into a [`Cow::Borrowed`]. For a Python | ||
/// Unicode string, the data may have to be reencoded to UTF-8, and copied into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or not, depending on the python version the string will be interned in the unicode object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm aware of that, but that felt like too much detail for me, and is not really relevant for the point I am trying to make here. That's why I tried to phrase it in a rather vague way.
Thanks for the feedback! I took another pass over it and tried to make it a bit more concise. I do however think that it's worth to talk about the individual lifetimes at play here, since this trait is right in the center of PyO3. I moved the paragraph about collections into the Let me know what you think about this. |
fc10c74
to
9cd0a41
Compare
Co-authored-by: David Hewitt <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks very much for keeping this branch alive, I guess let's move forward with this before it gets too painful to keep updating.
I had yet another reflection on &Bound
vs Borrowed
and I'm feeling reasonably comfortable that Borrowed
is the correct choice.
@@ -656,8 +656,8 @@ macro_rules! tuple_conversion ({$length:expr,$(($refN:ident, $n:tt, $T:ident)),+ | |||
} | |||
} | |||
|
|||
impl<'py, $($T: FromPyObject<'py>),+> FromPyObject<'py> for ($($T,)+) { | |||
fn extract_bound(obj: &Bound<'py, PyAny>) -> PyResult<Self> | |||
impl<'a, 'py, $($T: FromPyObject<'a, 'py>),+> FromPyObject<'a, 'py> for ($($T,)+) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for the record and the decision to stick with Borrowed
, the reason is that if we used &'a Bound
here for the input argument, then tuple extraction would be forced to restrict elements to FromPyObjectOwned
.
That maybe wouldn't be a terrible thing, as this is currently a special case when compared with every other container, which is forced to use FromPyObjectOwned
due to their mutability.
But I think it would be a breaking change, so it's probably not a good change overall. (E.g. users would lose the ability to extract tuples containing &str
, &Bound
and Borrowed
)
I think as a secondary effect, by using Borrowed
here we might be able to change extract_argument.rs
to make more use of Borrowed
, which IIRC is a possible performance win.
Co-authored-by: David Hewitt <[email protected]>
I believe this now also finishes the |
Fantastic, thank you so much for your help on pushing that over the line. When I get some time I'll try to write up a list of what we might have left for 0.23 (hopefully not much). |
Ah, I just started playing with this locally and found one additional finding: having Is that a problem? I think not, because that's what |
Oh, and one more case which comes up in local testing. At the moment |
It feels to me like we might want to consider macro tricks such that users can use both |
That's true, but as you mention below, I actually think this a good thing and makes a clearer use case between
Hmm, good question. I guess that is part of the question about how prominent we want |
Yes agreed, it's definitely a follow-up to make changes to expand I wonder, are there any other options we have here? E.g. does it work to have a trait with both I am still feeling a little uneasy about the choice to commit |
And (sorry to keep dangling ideas on here), inspired by the new
Both changes would significantly complicate the trait. I wonder, is there a way that we can support a simple trait like All food for thought and I think worth discussing while we're committing to breaking changes here. |
No problem 😄
If we're willing to do more breaking changes here, that sounds like a good idea to me. The only downsides I can see
This would be really cool indeed. I guess we would maybe somehow need to take advantage of method precedence between trait and inherent methods, to make the fall through to |
This adds the second lifetime to
FromPyObject
and removesFromPyObjectBound
. I hope I adjusted all the bounds correctly. I believe the equivalent to our current implementation is use HRTB for the input lifetime on any generic bound. But this should only be necessary for containers that create temporary Python references during extraction. For wrapper types it should be possible to just forward the relaxed bound.For easier trait bounds
FromPyObjectOwned
is introduced. This is blanket implemented for anyFromPyObject
type that does not depend on the input lifetime. It is intended to be used as a trait bound, the idea is inspired byserde
(Deserialize
<=>DeserializeOwned
)I tried to document different cases in the migration guide, but it can probably still be extended.
Changelog entry is still missing, will write that tomorrow.Breaking changes:
extract
method without defaultextract_bound
to make the default work