Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-7204] [core] CombineHint.NONE #4350

Closed
wants to merge 3 commits into from

Conversation

greghogan
Copy link
Contributor

FLINK-3477 added a hash-combine preceding the reducer configured with CombineHint.HASH or CombineHint.SORT (default). In some cases it may be useful to disable the combiner in ReduceNode by specifying a new CombineHint.NONE value.

We don't currently have unit tests for this part of the code but I'm happy to discuss and add some.

@fhueske
Copy link
Contributor

fhueske commented Jul 17, 2017

Hi @greghogan, I think this is definitely a valuable improvement.
However, I looked at the plan of a query with a ReduceFunction with CombineHint.None and found that it still has a combine node:

[IN] -FWD-> [COMBINE(SORTED_REDUCE)] -HASH-> [REDUCE(SORTED_REDUCE)] -> [OUT]

I think we should

  1. set combinerStrategy to DriverStrategy.NONE in ReduceNode for CombineHint.NONE.
  2. change ReduceProperties.instantiate() with a check for combinerStrategy == NONE and not add a node for the combiner in that case.
  3. add a test in ReduceCompilationTest similar to testGroupedReduceWithHint() to check that the resulting plan has no combine node.

Best, Fabian

FLINK-3477 added a hash-combine preceding the reducer configured with
CombineHint.HASH or CombineHint.SORT (default). In some cases it may be
useful to disable the combiner in ReduceNode by specifying a new
CombineHint.NONE value.
ReduceNode processes CombineHint.NONE by setting the ReduceProperties
combinerStrategy to DriverStrategy.NONE upon which ReduceProperties now
excludes the combiner.

New test ReduceCompilationTest#testGroupedReduceWithoutCombiner checks
the program compilation when excluding the combiner with
CombineHint.NONE. Gelly now excludes the combiner when simplifying
graphs as used in most algorithm unit and integraiton tests.
@greghogan
Copy link
Contributor Author

@fhueske updated and manually verified. I had misinterpreted DriverStrategy to imply that SORTED_REDUCE would not use a combiner.

@fhueske
Copy link
Contributor

fhueske commented Jul 19, 2017

Hi @greghogan, the changes look good and the PR is ready to be merged.

If you like, you can simplify the test before merging by using field position keys (as in testGroupedReduceWithFieldPositionKey()) instead of a KeySelector. That would remove the key extractor and projector map operators and make the plan a bit easier to verify.

Simplify the new testGroupedReduceWithoutCombiner by grouping on field
position keys rather than a KeySelector.
@greghogan
Copy link
Contributor Author

Test updated.

@fhueske
Copy link
Contributor

fhueske commented Jul 19, 2017

+1 to merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants