Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Seq.reducingMap() and Seq.reducePartially() methods #338

Open
tlinkowski opened this issue Aug 8, 2018 · 5 comments
Open

Add Seq.reducingMap() and Seq.reducePartially() methods #338

tlinkowski opened this issue Aug 8, 2018 · 5 comments

Comments

@tlinkowski
Copy link

On several occasions, I've been looking for a method that would be an inverse of Stream.flatMap. I noticed that other people do too - here's an example question on StackOverflow.

However, there are quite many ways to define it (especially the criteria for such an inverse operation). Finally I've decided to stick to the well-understood reduction semantics, and I've come up with the following two method signatures (names to be discussed) that I'd like to propose for Seq:

  1. reducePartially works like Stream.reduce(BinaryOperator), only it yields and restarts the reduction if conditionalAccumulator returns an empty Optional:
  public static Seq<T> reducePartially(BiFunction<? super T, ? super T, Optional<T>> conditionalAccumulator) {
    return reducingMap(Function.identity(), conditionalAccumulator);
  }

reducePartially is similar to StreamEx.collapse(BiPredicate,BinaryOperator), with the main difference being that the first argument to conditionalAccumulator is the result of previous reduction (exactly as in reduce).

  1. reducingMap works like Stream.reduce(U, BiFunction, BinaryOperator), only it yields and restarts the reduction if conditionalAccumulator returns an empty Optional, and it has a different mechanism for providing initial Us (instead of U identity we have an initialMapper that converts the first T to the initial U):
  public static <U> Seq<U> reducingMap(Function<? super T, ? extends U> initialMapper,
          BiFunction<? super U, ? super T, Optional<U>> conditionalAccumulator);

reducingMap bears some resemblance to StreamEx.collapse(BiPredicate,Collector), but - again - it's much more a "reduce" than a "mark" + "merge".


Here are a few examples of how it is supposed to work:

  1. Seq.reducePartially

Example 1: Merge words ending with hyphens with the following word:

  • conditional accumulator: (a, b) -> a.endsWith("-") ? Optional.of(withoutLastChar(a) + b) : Optional.empty();
  • input: "Sample", "hyp-", "hen-", "ated", "text"
  • output: "Sample", "hyphenated", "text"
  1. Seq.reducingMap

Example 1: Join integers using a semicolon until 0 occurs:

  • initial mapper: String::valueOf
  • conditional accumulator: (String a, int b) -> b != 0 ? Optional.of(a + ";" + b) : Optional.empty();
  • input: 7, 8, 0, 1, 5, 7, 8, 0, 3, 1, 7, 9, 0, 0, 5
  • output: "7;8", "0;1;5;7;8", "0;3;1;7;9", "0", "0;5"

I have this implemented (using a custom, lazy Spliterator) so I could provide a PR if the feature gets accepted.

@lukaseder
Copy link
Member

Thank you very much for your suggestion. The functionality seems to be covered by Seq.grouped(). The advantage of the Seq.grouped() API is that it produces lazy evaluations of group values. In your case, you seem to want eager grouping, which requires consuming the entire stream before emitting output, so I'm not sure what your suggestion gains compared to calling collect()

@tlinkowski
Copy link
Author

Thank you for your answer. I'll try to highlight the differences to make it more clear:

  1. Seq.grouped

    • works on the current T only (uses Function<T,K>)
    • evaluates lazily until next K is obtained
  2. My proposal

    • works on pairs of values, where each pair is:
      • a previously merged T and the current T (Seq.reducePartially: uses BiFunction<T, T, Optional<T>>)
      • an accumulator U and the current T (Seq.reducingMap: uses BiFunction<U, T, Optional<U>>)
    • evaluates lazily until next empty Optional is obtained (does not necessarily consume entire Stream)

Maybe the examples that I provided are too weak, because the conditions in those examples always test only one value (either a or b). Please, consider the example below and see that it cannot be solved using Seq.grouped:

Example 2 (Seq.reducePartially): Merge words on matching hyphens:

  • conditional accumulator: (a, b) -> a.endsWith("-") && b.startsWith("-") ? Optional.of(withoutLastChar(a) + withoutFirstChar(b)) : Optional.empty();
  • input: "mer-", "-ge-", "-able", "but", "the", "arrow", "stays", "->"
  • output: "mergeable", "but", "the", "arrow", "stays", "->"

My real use case is much more complex than simple string concatenation, but it effectively uses something like canMergeWithPrevious and canMergeWithNext flags, where merging occurs only if both flags are true.

@lukaseder
Copy link
Member

works on the current T only (uses Function<T,K>)

indeed, this could be improved with additional overloads

My real use case is much more complex than simple string concatenation, but it effectively uses something like canMergeWithPrevious and canMergeWithNext flags, where merging occurs only if both flags are true.

OK I think I understand your use case now. I'm not convinced about the naming of course, but I see the value of such a method. Will think about it this week.

@tlinkowski
Copy link
Author

Did you find the time to think about it? :)

I'll understand if you want to wait until more people request this feature, though.

In the meantime, here's some explanation and further proposals concerning naming:

1) Seq<T> ???(BiFunction<T, T, Optional<T>> conditionalAccumulator)

I named it reducePartially because it's actually reduce applied to parts of the Stream. But since #296 proposes the words "chunks" for such parts, I think a much better name could be chunkedReduce.

Note that this proposed method assumes that:

  1. chunks are always at least one-element long,
  2. each chunk is processed as if using Stream.reduce(T identity, BinaryOperator<T> accumulator) where:
    1. T identity is the first element of a given chunk,
    2. BinaryOperator<T> accumulator is derived from BiFunction<T, T, Optional<T>> conditionalAccumulator (Optional.empty() means: start next chunk)

2) Seq<U> ???(Function<T, U> initialMapper, BiFunction<U, T, Optional<U>> conditionalAccumulator)

I named it reducingMap because it seems the opposite of flatMap:

  • flatMap maps a single element to one or more elements, and
  • reducingMap maps one or more elements to a single element.

But this analogy does not seem intuitive so I propose to name this method chunkedReduce too.

Note that this proposed method assumes that:

  1. chunks are always at least one-element long,
  2. each chunk is processed as if using Stream.reduce(U identity, BiFunction<U,T,U> accumulator, *) where:
    1. U identity is obtained using Function<T, U> initialMapper from the first element of a given chunk,
    2. BiFunction<U,T,U> accumulator is derived from BiFunction<U, T, Optional<U>> conditionalAccumulator (Optional.empty() means: start next chunk)

PS. Please, remove the "Feedback pending" label unless I am to give some more feedback :)

@lukaseder
Copy link
Member

Did you find the time to think about it? :)

No, sorry

I'll understand if you want to wait until more people request this feature, though.

Yes, definitely. I already regret 1-2 prematurely added features to jOOλ. I prefer not to add more things without enough demand from the community

In the meantime, here's some explanation and further proposals concerning naming...

One reason why I'm often not so quick (or keen) to respond to these things is because of the complexity of the feature. It takes quite a bit of time to think about why this could be useful (not just how it works). This is, to me, a strong indicator that it is not generally useful.

PS. Please, remove the "Feedback pending" label unless I am to give some more feedback :)

Sure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants