Seq.shuffle() (and others) should delay buffering until a terminal op requests the stream values #195

Open
lukaseder opened this issue Feb 17, 2016 · 3 comments

Comments

@lukaseder
Member

See comment there:
aol/cyclops#99 (comment)

Others:

  • innerJoin()
  • leftOuterJoin()
  • removeAll()
  • retainAll()
  • window()
  • zip() eagerly allocates an Iterator
  • SeqUtils.transform() eagerly allocates a Spliterator
  • reverse()
  • crossJoin()
  • duplicate()
  • grouped()
  • partition()

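In short: each of these operations has the same flaw. It buffers or allocates eagerly, instead of waiting until a terminal op requests the stream values.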
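To illustrate the difference between eager and delayed buffering, here is a minimal sketch that uses only `java.util.stream`, not jOOL. The class `LazyShuffle` and both method names are hypothetical and are not part of jOOL's API. The lazy variant relies on `StreamSupport.stream(Supplier, int, boolean)`, whose supplier is only invoked once a terminal operation starts.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Spliterator;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper, for illustration only (not jOOL code).
final class LazyShuffle {

    // Eager: the source is drained as soon as this method is called.
    static <T> Stream<T> eagerShuffle(Stream<T> source) {
        ArrayList<T> buffer = source.collect(Collectors.toCollection(ArrayList::new));
        Collections.shuffle(buffer);
        return buffer.stream();
    }

    // Lazy: the supplier (and therefore the buffering) runs only when
    // a terminal operation is invoked on the returned stream.
    static <T> Stream<T> lazyShuffle(Stream<T> source) {
        return StreamSupport.stream(() -> {
            ArrayList<T> buffer = source.collect(Collectors.toCollection(ArrayList::new));
            Collections.shuffle(buffer);
            return buffer.spliterator();
        },
        // Must match the characteristics reported by ArrayList's spliterator.
        Spliterator.ORDERED | Spliterator.SIZED | Spliterator.SUBSIZED,
        false);
    }

    public static void main(String[] args) {
        Stream<Integer> shuffled = lazyShuffle(
            Stream.of(1, 2, 3, 4).peek(i -> System.out.println("pulled " + i)));
        System.out.println("nothing pulled yet");
        // The "pulled" lines appear only now, during the terminal operation.
        System.out.println(shuffled.collect(Collectors.toList()));
    }
}
```

With `eagerShuffle`, the `peek` side effects would fire immediately, before any terminal operation is requested.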

@tlinkowski

An idea on how to get rid of toList() calls in some of the methods mentioned above.

I'm thinking of a special SeqBuffer class (possibly implementing Streamable [#85]) that would decorate a given Seq<T>. Internally it would wrap that Seq's Spliterator<T> so that every element consumed through the wrapper is also added to an ArrayList<T> buffer. The seq() method of SeqBuffer would then return Seq.seq(buffer).append(spliterator).

In every place where we now call toList(), we could instead create an instance of this class and then call its seq() method.

But it's just a theoretical concept off the top of my head - I haven't tested this in any way and I don't know whether it would really work.
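The idea above could be sketched roughly as follows, using plain `java.util.stream` in place of jOOL's `Seq`. The class name `NaiveSeqBuffer` is hypothetical, and `Stream.concat(...)` stands in for `Seq.seq(buffer).append(spliterator)`. This is only an approximation of the proposal. It covers a single consumer at a time and does not handle interleaved consumption of several seq() results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical sketch of the proposal (not the actual jOOL implementation).
final class NaiveSeqBuffer<T> {
    private final List<T> buffer = new ArrayList<>();
    private final Spliterator<T> recording;

    NaiveSeqBuffer(Stream<T> source) {
        // Stream.spliterator() is lazy: no element is pulled here.
        Spliterator<T> delegate = source.spliterator();
        // Wrapper that records every consumed element in the buffer.
        this.recording = new Spliterators.AbstractSpliterator<T>(Long.MAX_VALUE, Spliterator.ORDERED) {
            @Override
            public boolean tryAdvance(Consumer<? super T> action) {
                return delegate.tryAdvance(t -> {
                    buffer.add(t);
                    action.accept(t);
                });
            }
        };
    }

    // Buffered elements first, then whatever is left in the source.
    Stream<T> seq() {
        return Stream.concat(buffer.stream(), StreamSupport.stream(recording, false));
    }

    public static void main(String[] args) {
        NaiveSeqBuffer<Integer> b = new NaiveSeqBuffer<>(Stream.of(1, 2, 3));
        System.out.println(b.seq().collect(Collectors.toList())); // drains the source
        System.out.println(b.seq().collect(Collectors.toList())); // replays the buffer
    }
}
```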

@tlinkowski

No, it is a bad solution. The above wouldn't work if the seq() method were called more than once and then both Seqs were consumed interchangeably:

  1. Seq.seq(buffer).append(spliterator) would throw a ConcurrentModificationException
  2. if both of the Seqs entered the append(spliterator) section, they would not return all the elements

But I'm pretty sure it can be done in a way that supports multiple seq() calls and arbitrary (but single-threaded) consumption of all the resulting Seqs.
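One possible way to support multiple seq() calls is to give every returned stream its own read index into a shared buffer, and to pull from the source only when a reader reaches the end of that buffer. The sketch below uses plain `java.util.stream` and a hypothetical class name, `SharedSeqBuffer`. It is an assumption about how this could work, not the code from the commits referenced in this issue, and it is not thread-safe.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical sketch: single-threaded use only.
final class SharedSeqBuffer<T> {
    private final List<T> buffer = new ArrayList<>();
    private final Spliterator<T> source;

    SharedSeqBuffer(Stream<T> stream) {
        this.source = stream.spliterator(); // lazy: nothing is pulled yet
    }

    Stream<T> seq() {
        Spliterator<T> view = new Spliterators.AbstractSpliterator<T>(Long.MAX_VALUE, Spliterator.ORDERED) {
            private int next = 0; // this view's own read position

            @Override
            public boolean tryAdvance(Consumer<? super T> action) {
                // At the end of the buffer: try to pull one more element
                // from the source into the shared buffer.
                if (next == buffer.size() && !source.tryAdvance(buffer::add)) {
                    return false; // source exhausted
                }
                action.accept(buffer.get(next++));
                return true;
            }
        };
        return StreamSupport.stream(view, false);
    }

    public static void main(String[] args) {
        SharedSeqBuffer<String> b = new SharedSeqBuffer<>(Stream.of("a", "b", "c"));
        Iterator<String> x = b.seq().iterator();
        Iterator<String> y = b.seq().iterator();
        // Interleaved consumption: both views see every element.
        System.out.println(x.next() + y.next() + y.next() + x.next() + x.next() + y.next());
    }
}
```

Because the buffer is read by index rather than through its own iterator or spliterator, appending to it while another view is mid-traversal does not trigger a ConcurrentModificationException.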

@lukaseder
Member Author

Hmm, I'm happy to look at a concrete code implementation :)

tlinkowski pushed a commit to tlinkowski/jOOL that referenced this issue Apr 27, 2017
tlinkowski pushed a commit to tlinkowski/jOOL that referenced this issue May 11, 2017
[jOOQ#195] Applied SeqBuffer to certain Seq method implementations
[jOOQ#122] Thread-safe Seq.duplicate()
2 participants