Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix behavior of foldr on empty arrays (#20144) #20160

Merged
merged 6 commits into from
Jan 27, 2017
Merged

Fix behavior of foldr on empty arrays (#20144) #20160

merged 6 commits into from
Jan 27, 2017

Conversation

vsartor
Copy link

@vsartor vsartor commented Jan 20, 2017

This is a fix for #20144.

When fixing #7465, a check was added based on mr_empty to mapfoldl, which is called by both foldl and reduce, so they would have better behavior with empty arrays.

This implements an equivalent check on foldr so that it will have the same behavior as foldl and reduce.

I also added tests to make sure "folding" an operator such as + on an empty array gives the expected result.

This is my first PR, so sorry for any mistakes.

When foldr is called with two arguments and an empty array, mapfoldr does not make a check for empty array and tries to access the first element. This check is present in mapfoldl.

The commit adds equivalent behavior to mapfoldl.
@@ -131,7 +131,13 @@ mapfoldr(f, op, v0, itr) = mapfoldr_impl(f, op, v0, itr, endof(itr))
Like `mapfoldr(f, op, v0, itr)`, but using the first element of `itr` as `v0`. In general,
this cannot be used with empty collections (see `reduce(op, itr)`).
"""
mapfoldr(f, op, itr) = (i = endof(itr); mapfoldr_impl(f, op, f(itr[i]), itr, i-1))
function mapfoldr(f, op, itr)
i = endof(itr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for the trailing semicolon; remove?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to remove it, sorry.
Removed.

@@ -16,6 +17,7 @@
@test Base.mapfoldl((x)-> x ⊻ true, |, [true false true false false]) == true
@test Base.mapfoldl((x)-> x ⊻ true, |, false, [true false true false false]) == true

@test foldr(+, Int64[]) == 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps add a comment to each of these tests referencing the relevant issue and pull request?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the foldr test I can just reference issue #20144 and PR #20160 (this one), but on the foldl test I can only reference issue #7465 since there is no PR.
@Sacha0, do I just reference the relevant issue for the foldl test or do I add the SHA of the relevant commit? (b674eed)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Referencing via the issue / pull request numbers (e.g. #20144, #20160) would be great. Given #20160 adds the foldl test and #20144/#20160 contain the relevant discussion, perhaps reference #20144 and #20160 for that test as well, rather than (or in addition to) #7465? Thanks!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Done.
Thanks for reviewing!

@Sacha0
Copy link
Member

Sacha0 commented Jan 20, 2017

Welcome to Julia @Vsart! Much thanks for this great first pull request!

mapfoldr(f, op, itr) = (i = endof(itr); mapfoldr_impl(f, op, f(itr[i]), itr, i-1))
function mapfoldr(f, op, itr)
i = endof(itr)
if i == 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I belive you don't need the i = endof(itr) here, instead, you can do if isempty(itr).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pabloferz, the existing implementation uses i in the call to mapfoldr_impl (see just below)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, missed that. Carry one then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is endof(itr) == 0 correct for all iterator types though?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered the same given the mapfoldl implementation takes what might be a more general approach. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, start and done is definitively better and general enough (the endof(itr) == 0 is actually what threw me out)

Copy link
Author

@vsartor vsartor Jan 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied the check from mapfoldr_impl (here) which tests if the i argument is 0, I'm assuming that if the one I implemented in mapfoldr should change so should that check?

@vsartor vsartor changed the title RFC: Fix behavior of foldr on empty arrays (#20144) Fix behavior of foldr on empty arrays (#20144) Jan 21, 2017
@vsartor
Copy link
Author

vsartor commented Jan 22, 2017

The build failure on OSX from Travis seems to be unrelated.

@Sacha0
Copy link
Member

Sacha0 commented Jan 22, 2017

@nanosoldier runbenchmarks(ALL, vs=":master")

@tkelman
Copy link
Contributor

tkelman commented Jan 22, 2017

I'd maybe make some of the reduce tests more specific and also test for type equality via ===, but you'd need to use Int for the element type rather than Int64 (or compare to an Int64 rhs instead of an Int)

@vsartor
Copy link
Author

vsartor commented Jan 22, 2017

@tkelman About the type equality tests, do you mean testing for example

foldr(+, Int16[]) === Int16(0)

or you meant something else?

Currently this is not true (for both the new foldr and the old foldl) Some examples below

julia> typeof(foldl(+, Int8[]))
Int32

julia> typeof(foldl(+, Int16[]))
Int32

julia> typeof(foldl(+, Int32[]))
Int64

julia> typeof(foldl(+, Int64[]))
Int64

Note that this behavior is not particular to empty arrays.

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@tkelman
Copy link
Contributor

tkelman commented Jan 22, 2017

Right, reductions currently widen mostly to make them less susceptible to overflow. But where we or packages care at all about the return type, it's worth encoding that specificity into the tests. If we ever intend to change the return type, we'll have to visibly change the tests. What testing the types would avoid is accidentally changing the return types and not noticing.

@ararslan ararslan added the domain:collections Data structures holding multiple items, e.g. sets label Jan 23, 2017
@StefanKarpinski
Copy link
Sponsor Member

This looks good to me. Can all the reviewer please approve the changes?

@@ -131,7 +131,13 @@ mapfoldr(f, op, v0, itr) = mapfoldr_impl(f, op, v0, itr, endof(itr))
Like `mapfoldr(f, op, v0, itr)`, but using the first element of `itr` as `v0`. In general,
this cannot be used with empty collections (see `reduce(op, itr)`).
"""
mapfoldr(f, op, itr) = (i = endof(itr); mapfoldr_impl(f, op, f(itr[i]), itr, i-1))
function mapfoldr(f, op, itr)
i = endof(itr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the endof could be moved to after the if now, but I don't know if there are any collections where endof would be expensive enough for it to matter when isempty is true

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense though, the only reason it's before is because I was previously using the i value in the if. Should I make the change?

Copy link
Contributor

@tkelman tkelman Jan 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably not worth waiting in the CI queue for again, if nobody else has any more substantial feedback. Maybe next time someone's working around this bit of code.

Copy link
Contributor

@pabloferz pabloferz Jan 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is that you could leave the i = endof(itr) and use if done(itr, i) instead of if isempty(itr). Other than that, I don't have more comments.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pabloferz I considered that option over isempty but I found out that because of the following behavior I couldn't use it

julia> itr = Float64[]
0-element Array{Float64,1}

julia> i = endof(itr)
0

julia> done(itr, i)
false

It would only return true when i == 1 for the empty array.

Copy link
Contributor

@pabloferz pabloferz Jan 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's right (forgot that this recurses backwards). Then, it is fine as is right now.

Copy link
Contributor

@tkelman tkelman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be squashed on merge, lgtm

@Sacha0 Sacha0 merged commit e63fec8 into JuliaLang:master Jan 27, 2017
@Sacha0
Copy link
Member

Sacha0 commented Jan 27, 2017

Thanks @Vsart!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:collections Data structures holding multiple items, e.g. sets
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants