Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid associative (and identity) aggregations over the complete partition #176

Open
lukaseder opened this issue Jan 9, 2016 · 0 comments

Comments

@lukaseder
Copy link
Member

Associative and identity aggregations could be calculated based on the result of the aggregation of the previous value:

v   sum (-3, -1)  current calculation   better
-------------------------------------------------
1   e             e                     e
2   1             1                     e + 1
3   3             1 + 2                 1 + 2
4   6             1 + 2 + 3             3 + 3
5   9                 2 + 3 + 4         6 + 4 - 1
6   12                    3 + 4 + 5     9 + 5 - 2

As the partition is already sorted correctly, we can simply access the (cached) calculated value from lag()

Associative and identity operations currently supported are:

  • count() (optimisation not necessary, as count() is already O(1))
  • sum() (and all derived aggregations)
  • avg() (and all derived aggregations - not strictly associative, but can depend on sum() and count())
  • toList() (a potential optimisation would require an immutable linked list)
  • toString()

More limited optimisation is possible for associative-only aggregations, when using a Long.MIN_VALUE .. 0 frame. These include:

  • min()
  • max()
  • median() (would depend on cached toList())
  • percentile() (would depend on cached toList())
  • allMatch()
  • anyMatch()
  • noneMatch()
@lukaseder lukaseder added this to the Version 0.9.10 milestone Jan 9, 2016
@lukaseder lukaseder changed the title Avoid associative aggregations over the complete partition for [Long.MIN_VALUE .. 0] frames Avoid associative (and identity) aggregations over the complete partition Jan 9, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant