Return a Sequence rather than an in-memory array? #13

rfdickerson · 2016-11-19T17:50:46Z

I was wondering if we could return a Sequence<Record>, where Record is the result set that we define in #4. That way you could access the results one at a time, using some of the work we are doing in #11 as the mechanism.

The user would get Sequence-like behavior, even though the results are lazily evaluated. For example:

let results = connection.execute("SELECT * FROM books")

let totalBookPrice = results.reduce(0) { $0 + $1["price"] }

You can imagine how expensive that would be if it were all in main memory (thousands of books, each with dozens of fields), but here we only deal with price. Haha, I realize you can do that in SQL with the SUM function, so maybe my example isn't a good one, but you can imagine doing some complex business-side logic this way.
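To make the idea concrete, here is a minimal sketch of a lazily evaluated row sequence. The `Row` typealias, the `LazyRows` type, and the in-memory `remaining` array are all hypothetical stand-ins invented for illustration; they are not part of Swift-Kuery.

```swift
// Hypothetical row type: column name -> numeric value. Not Swift-Kuery's API.
typealias Row = [String: Double]

/// A single-pass sequence that asks `fetch` for one row at a time,
/// so only the current row needs to be held by the consumer.
struct LazyRows: Sequence, IteratorProtocol {
    let fetch: () -> Row?   // stand-in for a database fetch call

    mutating func next() -> Row? {
        return fetch()
    }
}

// Simulated table of three books; a real fetcher would read from the server.
var remaining: [Row] = [
    ["price": 10.0, "pages": 200],
    ["price": 12.5, "pages": 340],
    ["price": 7.5, "pages": 120],
]
let results = LazyRows(fetch: { remaining.isEmpty ? nil : remaining.removeFirst() })

// In reduce, $0 is the running total and $1 is the current row.
let totalBookPrice = results.reduce(0.0) { $0 + ($1["price"] ?? 0) }
print(totalBookPrice)   // prints 30.0
```

Only the `price` column of each row is touched, and each row can be discarded as soon as the closure returns.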


shmuelk · 2016-11-22T09:45:00Z

I assume that by record you mean a database record, i.e. a row in the result set.

If so, then this is a good idea and solves the pagination issue as well (at least on PostgreSQL, where in any event the result set is fetched from the server one row at a time under the covers).

irar2 · 2016-11-22T12:32:09Z

Here is what we are planning to implement in order to enable the proposed Sequence API:

Replace QueryResult.rows(titles: [String], rows: [[Any?]]) with
QueryResult.resultSet(resultSet: ResultSet)

and replace QueryResult.asRows with

/// Data received from the query execution represented as a `ResultSet`.
public var asResultSet: ResultSet

Define a protocol for fetching query results to be used in ResultSet:

/// A protocol for retrieving query results. All database plugins must implement this protocol.
public protocol ResultFetcher {

   	/// Fetch the next row of the query result. This function is blocking.
  	///
	/// - Returns: An array of values of type Any? representing the next row from the query result. 
	func fetchNext() -> [Any?]? 

	/// Fetch the next row of the query result. This function is non-blocking.
  	///
	/// - Parameter callback: A callback to call when the next row of the query result is ready.
	func fetchNext(callback: ([Any?]?) ->())

	/// Fetch the titles of the query result. This function is blocking.
  	///
	/// - Returns: An array of column titles of type String.	
	func fetchTitles() -> [String]
}

ResultSet will have both blocking (a Sequence) and non-blocking (nextRow() function) APIs:

/// Query result representation as either a blocking `RowSequence` or as a non-blocking nextRow() function.
public struct ResultSet {
    private var resultFetcher: ResultFetcher
    
    /// The query result as a Sequence of rows. This API is blocking.
    public private (set) var rows: RowSequence
    
    /// Instantiate an instance of ResultSet.
    ///
    /// - Parameter resultFetcher: An implementation of `ResultFetcher` protocol to fetch the query results.
    public init(_ resultFetcher: ResultFetcher) {
	self.resultFetcher = resultFetcher
	rows = RowSequence(resultFetcher)
    }

    /// Fetch the next row of the query result. This function is non-blocking.
    ///
    /// - Parameter callback: A callback to call when the next row of the query result is ready.
    public func nextRow(callback: ([Any?]?) ->()) { 
       resultFetcher.fetchNext { row in
	   callback(row)
       }
    }

    /// The column titles of the query result. Blocking.
    public var titles: [String] {
	return resultFetcher.fetchTitles()
    }
}

/// A query result as a Sequence of rows. 
public struct RowSequence : Sequence, IteratorProtocol  {
   private var resultFetcher: ResultFetcher

   init(_ resultFetcher: ResultFetcher) {
	self.resultFetcher = resultFetcher
   }
 
   /// Get the next row. This function is blocking.
   ///
   /// - Returns: An array of values of type Any? representing the next row from the query result. 
   public mutating func next() -> [Any?]? {
       return resultFetcher.fetchNext()
   }
}

The plugins will have to implement ResultFetcher. PostgreSQL plugin will (at least at first) just fetch all the rows and keep them in an array, and return them via fetchNext().
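As a rough illustration of that "fetch everything, then hand rows out one by one" approach, a plugin-side fetcher could look something like the sketch below. `ArrayResultFetcher` is a hypothetical name, and the protocol is copied from the proposal above only so the example compiles on its own.

```swift
/// Copy of the protocol proposed above, repeated so the sketch is self-contained.
protocol ResultFetcher {
    func fetchNext() -> [Any?]?
    func fetchNext(callback: ([Any?]?) -> ())
    func fetchTitles() -> [String]
}

/// Hypothetical fetcher that keeps every row in memory and returns them in order.
/// It is a class because the protocol methods are non-mutating, yet the
/// read position has to advance on every call.
final class ArrayResultFetcher: ResultFetcher {
    private let titles: [String]
    private let rows: [[Any?]]
    private var index = 0

    init(titles: [String], rows: [[Any?]]) {
        self.titles = titles
        self.rows = rows
    }

    func fetchNext() -> [Any?]? {
        guard index < rows.count else { return nil }
        defer { index += 1 }
        return rows[index]
    }

    func fetchNext(callback: ([Any?]?) -> ()) {
        // The rows are already in memory, so the callback runs immediately.
        callback(fetchNext())
    }

    func fetchTitles() -> [String] {
        return titles
    }
}

let fetcher = ArrayResultFetcher(titles: ["title", "price"],
                                 rows: [["Dune", 9.99], ["Emma", nil]])
var fetched = 0
while let row = fetcher.fetchNext() {
    fetched += 1
    print(row)
}
```

This keeps the memory cost of the old array-based API, but it lets the plugin switch to true row-by-row fetching later without changing the caller.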

irar2 · 2016-12-06T12:23:42Z

What should happen if we get an error in the middle of a sequence, i.e. we fetched some rows, and then something goes wrong? If we just return nil, it's equivalent to saying that there are no more rows. Is throwing an error a good solution?

rfdickerson · 2016-12-06T12:29:38Z

I would normally say throw an exception. However, from looking at how to implement Sequence, the next function just returns an Optional. It is not a throwing method at all. I don't think we can throw, right?

struct Countdown: Sequence, IteratorProtocol {
    var count: Int

    mutating func next() -> Int? {
        if count == 0 {
            return nil
        } else {
            defer { count -= 1 }
            return count
        }
    }
}
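To show the ambiguity in running code, here is a small sketch in which the underlying fetch fails partway through. Everything in it (`FetchError`, `FlakySource`, `RowIterator`, `lastError`) is hypothetical, and parking the error in a property is just one possible workaround, not something proposed in this thread.

```swift
struct FetchError: Error {}   // hypothetical error type

/// Hypothetical row source that fails on the third fetch.
final class FlakySource {
    private var calls = 0

    func fetch() throws -> Int? {
        calls += 1
        if calls == 3 { throw FetchError() }
        return calls
    }
}

/// `next()` cannot be declared `throws`, so an error has to be swallowed
/// or parked somewhere the caller must remember to check.
final class RowIterator: Sequence, IteratorProtocol {
    private let source = FlakySource()
    private(set) var lastError: Error?

    func next() -> Int? {
        do {
            return try source.fetch()
        } catch {
            lastError = error   // the loop itself only ever sees nil
            return nil
        }
    }
}

let rows = RowIterator()
let seen = Array(rows)
print(seen)                    // prints [1, 2]
print(rows.lastError != nil)   // prints true
```

From the caller's side, iteration ends exactly as it would for a two-row result, which is the problem being discussed.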

irar2 · 2016-12-06T12:37:19Z

Right :(

groue · 2016-12-06T12:55:08Z

You're not the first database library that has difficulties with Swift's non-throwable sequences, and you may appreciate Cursors. Basically they look like lazy sequences, but designed for wrapping external resources that may throw errors when iterated.

The link above provides a general Cursor protocol that you can adopt, and an implementation for many expected methods like contains, enumerated, filter, flatMap, map, reduce, etc.

The protocol:

/// A type that supplies the values of some external resource, one at a time.
public protocol Cursor : class {
    /// The type of element traversed by the cursor.
    associatedtype Element
    
    /// Advances to the next element and returns it, or nil if no next element
    /// exists. Once nil has been returned, all subsequent calls return nil.
    func next() throws -> Element?
}

Usage:

let cursor = ...
while let elem = try cursor.next() { ... }
try cursor.forEach { elem in ... }
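For readers who want to try this out, here is a self-contained sketch of a concrete cursor that can fail mid-iteration. The protocol is restated with `AnyObject` so it compiles on current Swift, the `forEach` extension is a minimal stand-in for the helper mentioned above, and `PriceCursor` and `ConnectionLost` are hypothetical names.

```swift
/// The protocol from above, restated so the sketch compiles on its own.
protocol Cursor: AnyObject {
    associatedtype Element
    func next() throws -> Element?
}

extension Cursor {
    /// Minimal stand-in for the forEach helper mentioned above.
    func forEach(_ body: (Element) throws -> Void) throws {
        while let element = try next() {
            try body(element)
        }
    }
}

struct ConnectionLost: Error {}   // hypothetical error type

/// Hypothetical cursor over prices that can fail after `failAfter` elements.
final class PriceCursor: Cursor {
    private let prices: [Double]
    private let failAfter: Int?
    private var served = 0

    init(prices: [Double], failAfter: Int?) {
        self.prices = prices
        self.failAfter = failAfter
    }

    func next() throws -> Double? {
        if let limit = failAfter, served == limit { throw ConnectionLost() }
        guard served < prices.count else { return nil }
        defer { served += 1 }
        return prices[served]
    }
}

var total = 0.0
var partial = 0.0
var failed = false
do {
    // Completes normally: every price is visited.
    try PriceCursor(prices: [10, 12.5, 7.5], failAfter: nil).forEach { total += $0 }
    // Fails after two rows: the error reaches the caller instead of looking like end-of-data.
    try PriceCursor(prices: [10, 12.5, 7.5], failAfter: 2).forEach { partial += $0 }
} catch {
    failed = true
}
print(total, partial, failed)   // prints 30.0 22.5 true
```

Unlike a Sequence, the failure is distinguishable from "no more rows" because `next()` throws instead of returning nil.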

If this looks right to you, you may support the proposal on the Swift Evolution mailing list.

irar2 · 2016-12-08T09:51:34Z

@groue Thank you, your Cursor works great and solves our problem of not being able to throw an error with Sequence. We'd love to see you creating a PR against Swift-Kuery with this class. (We will be able to accept it within a week or so).

groue · 2016-12-08T10:51:10Z

Thanks @irar2. I'm not sure which kind of PR you'd like to see.

It's easy to add the Cursor.swift file. Since it defines the Cursor protocol and the bunch of built-in concrete cursors (MapCursor, FilterCursor, FlattenCursor, etc.), it does not contain any concrete cursors from GRDB itself. It's just the protocol and its helper functions, ready to welcome future Kuery cursors.

But you don't need me for that: just copy that file into Kuery. You can embed the GRDB license in it (or not, should the MIT license of this file conflict with Kuery's Apache license).

However, changing Kuery so that it actually uses cursors is another beast: it heavily depends on the design decisions of your team. Should Kuery use cursors, all drivers should return cursors as well (if you want errors to propagate correctly, from the initial production of rows by drivers up to the high-level Kuery APIs). I'm not the correct person for such a redesign.

groue · 2016-12-08T11:09:30Z

For example, the SQLite driver is still fundamentally array-based (1, 2), and does not support cursors at all.

ianpartridge · 2016-12-08T11:27:07Z

Thanks @groue. The SQLite driver is certainly primitive at the moment - plenty to do!

groue · 2016-12-08T11:35:02Z

You're right, @ianpartridge. The ResultFetcher protocol, which looks like it is the fundamental fetcher type for drivers, is designed to have both blocking and non-blocking fetchNext() methods. Not really identical to my simple cursors, whose next() method is blocking :-)

See? I can't really send a pull request. I'm just not entitled to do it.

ianpartridge · 2016-12-08T11:55:09Z

Well, it's open source - anyone who is interested and motivated is encouraged to open PRs :) @irar2 was just being friendly.

Your thoughts/comments are very welcome too, of course.

groue · 2016-12-08T12:08:53Z

@ianpartridge Thanks :-) @irar2 did push the first revision of ResultFetcher, so maybe she's the most able to understand the consequences of making this type able to throw errors at each single step. Those consequences, I believe, are not light. Such a decision requires deep involvement in Kuery's features and future, which I don't have yet - GRDB.swift is already quite an involving side-job. Have a look at its query generation features, when you have time. Its goals are not exactly the same as Kuery's, but they quite overlap.

irar2 · 2016-12-08T12:19:30Z

@groue I didn't mean to ask you to fix our plugins. We just need your Cursor.swift to be in Swift-Kuery. The easiest way for us would be if you could create a PR with it (just to avoid all the bureaucracy issues). No need to change any Swift-Kuery code.

Just to clarify, ResultSet has two parts: the blocking RowSequence (which we would like to change to RowCursor) and the non-blocking fetchNext(), so that users can choose how they want to work.
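A rough sketch of those two ways of consuming results, using the fetch signatures from the ResultFetcher proposal earlier in this thread. `StubFetcher` and `drain` are hypothetical helpers written only for this example, and the stub calls back immediately, whereas a real driver would call back once the row arrives.

```swift
/// The two fetch styles from the protocol proposed earlier in this thread.
protocol ResultFetcher {
    func fetchNext() -> [Any?]?
    func fetchNext(callback: ([Any?]?) -> ())
}

/// Hypothetical in-memory fetcher, used only to make the sketch runnable.
final class StubFetcher: ResultFetcher {
    private var pending: [[Any?]]

    init(_ rows: [[Any?]]) {
        pending = rows
    }

    func fetchNext() -> [Any?]? {
        return pending.isEmpty ? nil : pending.removeFirst()
    }

    func fetchNext(callback: ([Any?]?) -> ()) {
        callback(fetchNext())
    }
}

// Style 1: blocking. The caller pulls rows in a plain loop.
let blocking = StubFetcher([[1], [2], [3]])
var blockingCount = 0
while blocking.fetchNext() != nil {
    blockingCount += 1
}

// Style 2: callback-driven. Each callback requests the next row until nil arrives.
func drain(_ fetcher: ResultFetcher, onRow: ([Any?]) -> ()) {
    fetcher.fetchNext { row in
        guard let row = row else { return }
        onRow(row)
        drain(fetcher, onRow: onRow)
    }
}

let nonBlocking = StubFetcher([[1], [2], [3]])
var callbackCount = 0
drain(nonBlocking) { _ in callbackCount += 1 }
print(blockingCount, callbackCount)   // prints 3 3
```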

rfdickerson added the question label Nov 19, 2016

irar2 added a commit that referenced this issue Nov 24, 2016

#13 Return query result as Sequence

7757ca9

irar2 added a commit to Kitura/Swift-Kuery-PostgreSQL that referenced this issue Nov 24, 2016

Kitura/Swift-Kuery#13 Implemented query fetcher and updated the tests

76ed790

irar2 added a commit to Kitura/Swift-Kuery-PostgreSQL that referenced this issue Nov 29, 2016

Kitura/Swift-Kuery#13 Fixed typo

71b4527

irar2 added a commit to Kitura/Swift-Kuery-PostgreSQL that referenced this issue Dec 6, 2016

Kitura/Swift-Kuery#13 Retrieve query result row-by-row

b3f02d0

groue mentioned this issue Dec 8, 2016

Introduce the Cursor protocol #32

Closed
