Return a Sequence rather than an in-memory array? #13

Open
rfdickerson opened this issue Nov 19, 2016 · 14 comments
@rfdickerson
Contributor

rfdickerson commented Nov 19, 2016

I was wondering if we could return a Sequence<Record>, where Record is the result set type that we define in #4. That would give you access to the results one at a time, using some of the work we are doing in #11 as the mechanism.

The user would get Sequence-like behavior, even though the results are lazily evaluated, e.g.:

let results = connection.execute("SELECT * FROM books")

let totalBookPrice = results.reduce(0, { $0 + $1["price"] })

You can imagine how expensive that would be if everything were held in main memory (thousands of books, each with dozens of fields), but here we only deal with price. Haha, I realize you can do that in SQL with the SUM function, so maybe my example isn't a good one, but you can imagine doing some complex business-side logic this way.
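To make the lazy evaluation concrete, here is a minimal sketch. `Record` and `LazyRecords` are hypothetical stand-ins, not Swift-Kuery types; the point is only that `reduce` pulls one row at a time through `next()` and never holds the full result set.

```swift
// Hypothetical stand-in for a database row: column name -> numeric value.
typealias Record = [String: Double]

// A lazily evaluated sequence: each row is produced only when next() is called.
struct LazyRecords: Sequence, IteratorProtocol {
    private var remaining: Int

    init(count: Int) {
        remaining = count
    }

    mutating func next() -> Record? {
        guard remaining > 0 else { return nil }
        remaining -= 1
        // Assumption: a real driver would read one row from the server here.
        return ["price": 10.0, "pages": 300.0]
    }
}

let results = LazyRecords(count: 1_000)

// Only the running total is kept in memory, never all 1,000 rows.
let totalBookPrice = results.reduce(0.0) { $0 + ($1["price"] ?? 0) }
```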

@shmuelk
Collaborator

shmuelk commented Nov 22, 2016

I assume that by record you mean a database record, i.e. a row in the result set.

If so, then this is a good idea and solves the pagination issue as well (at least on PostgreSQL, where in any event the result set is fetched from the server one row at a time under the covers).

@irar2
Contributor

irar2 commented Nov 22, 2016

Here is what we are planning to implement in order to enable the proposed Sequence API:

Replace QueryResult.rows(titles: [String], rows: [[Any?]]) with
QueryResult.resultSet(resultSet: ResultSet)

and replace QueryResult.asRows with

/// Data received from the query execution represented as a `ResultSet`.
public var asResultSet: ResultSet

Define a protocol for fetching query results to be used in ResultSet:

/// A protocol for retrieving query results. All database plugins must implement this protocol.
public protocol ResultFetcher {

    /// Fetch the next row of the query result. This function is blocking.
    ///
    /// - Returns: An array of values of type Any? representing the next row from the query result.
    func fetchNext() -> [Any?]?

    /// Fetch the next row of the query result. This function is non-blocking.
    ///
    /// - Parameter callback: A callback to call when the next row of the query result is ready.
    func fetchNext(callback: ([Any?]?) -> ())

    /// Fetch the titles of the query result. This function is blocking.
    ///
    /// - Returns: An array of column titles of type String.
    func fetchTitles() -> [String]
}

ResultSet will have both blocking (a Sequence) and non-blocking (nextRow() function) APIs:

/// Query result representation as either a blocking `RowSequence` or as a non-blocking nextRow() function.
public struct ResultSet {
    private var resultFetcher: ResultFetcher
    
    /// The query result as a Sequence of rows. This API is blocking.
    public private (set) var rows: RowSequence
    
    /// Instantiate an instance of ResultSet.
    ///
    /// - Parameter resultFetcher: An implementation of `ResultFetcher` protocol to fetch the query results.
    public init(_ resultFetcher: ResultFetcher) {
        self.resultFetcher = resultFetcher
        rows = RowSequence(resultFetcher)
    }

    /// Fetch the next row of the query result. This function is non-blocking.
    ///
    /// - Parameter callback: A callback to call when the next row of the query result is ready.
    public func nextRow(callback: ([Any?]?) -> ()) {
        resultFetcher.fetchNext { row in
            callback(row)
        }
    }

    /// The column titles of the query result. Blocking.
    public var titles: [String] {
        return resultFetcher.fetchTitles()
    }
}

/// A query result as a Sequence of rows. 
public struct RowSequence : Sequence, IteratorProtocol  {
    private var resultFetcher: ResultFetcher

    init(_ resultFetcher: ResultFetcher) {
        self.resultFetcher = resultFetcher
    }

    /// Get the next row. This function is blocking.
    ///
    /// - Returns: An array of values of type Any? representing the next row from the query result.
    public mutating func next() -> [Any?]? {
        return resultFetcher.fetchNext()
    }
}

The plugins will have to implement ResultFetcher. PostgreSQL plugin will (at least at first) just fetch all the rows and keep them in an array, and return them via fetchNext().
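As an illustration of the "fetch everything, then hand rows out one by one" approach, here is a minimal sketch. `ArrayResultFetcher` is a hypothetical name and not the actual PostgreSQL plugin code; the protocol is restated without access modifiers so the example is self-contained.

```swift
// The proposed protocol, restated so this sketch compiles on its own.
protocol ResultFetcher {
    func fetchNext() -> [Any?]?
    func fetchNext(callback: ([Any?]?) -> ())
    func fetchTitles() -> [String]
}

// Hypothetical fetcher that already holds all rows in an array.
// A class is used so the position can advance inside non-mutating methods.
final class ArrayResultFetcher: ResultFetcher {
    private let titles: [String]
    private let rows: [[Any?]]
    private var index = 0

    init(titles: [String], rows: [[Any?]]) {
        self.titles = titles
        self.rows = rows
    }

    // Blocking variant: return the next stored row, or nil when exhausted.
    func fetchNext() -> [Any?]? {
        guard index < rows.count else { return nil }
        defer { index += 1 }
        return rows[index]
    }

    // Callback variant: the rows are already in memory, so the callback
    // is invoked immediately with the same result as the blocking call.
    func fetchNext(callback: ([Any?]?) -> ()) {
        callback(fetchNext())
    }

    func fetchTitles() -> [String] {
        return titles
    }
}

let fetcher = ArrayResultFetcher(
    titles: ["title", "price"],
    rows: [["Dune", 9.99], ["Emma", nil]]
)

var fetched = 0
while fetcher.fetchNext() != nil {
    fetched += 1
}

// Once the rows are exhausted, the callback variant reports nil as well.
var sawEnd = false
fetcher.fetchNext { row in
    sawEnd = (row == nil)
}
```

With this design nothing is streamed from the server, so memory use matches the old array-based API; the benefit is that callers already code against the row-at-a-time interface.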

irar2 added a commit that referenced this issue Nov 24, 2016
irar2 added a commit to Kitura/Swift-Kuery-PostgreSQL that referenced this issue Nov 24, 2016
irar2 added a commit to Kitura/Swift-Kuery-PostgreSQL that referenced this issue Nov 29, 2016
irar2 added a commit to Kitura/Swift-Kuery-PostgreSQL that referenced this issue Dec 6, 2016
@irar2
Contributor

irar2 commented Dec 6, 2016

What should happen if we get an error in the middle of a sequence, i.e. we have fetched some rows and then something goes wrong? If we just return nil, that is equivalent to saying there are no more rows. Is throwing an error a good solution?
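The ambiguity can be shown with a small sketch. `TruncatingRows` and its `failAfter` parameter are hypothetical and only simulate a mid-stream failure; the consumer sees a shorter result with no sign that anything went wrong.

```swift
// Hypothetical row sequence that signals a simulated failure by returning nil.
struct TruncatingRows: Sequence, IteratorProtocol {
    private let pending: [Int]
    private let failAfter: Int?   // number of rows served before the simulated error
    private var served = 0

    init(_ rows: [Int], failAfter: Int? = nil) {
        self.pending = rows
        self.failAfter = failAfter
    }

    mutating func next() -> Int? {
        // Simulated error: all we can do inside next() is return nil.
        if served == failAfter { return nil }
        // Genuine end of data: also nil.
        guard served < pending.count else { return nil }
        defer { served += 1 }
        return pending[served]
    }
}

let complete = Array(TruncatingRows([1, 2, 3]))
let truncated = Array(TruncatingRows([1, 2, 3], failAfter: 2))
// `truncated` is simply shorter; the caller cannot tell an error from the end.
```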

@rfdickerson
Contributor Author

I would normally say throw an exception. However, from looking at how to implement Sequence, the next function just returns an Optional. It is not a throwing method at all. I don't think we can throw, right?

struct Countdown: Sequence, IteratorProtocol {
    var count: Int

    mutating func next() -> Int? {
        if count == 0 {
            return nil
        } else {
            defer { count -= 1 }
            return count
        }
    }
}

@irar2
Contributor

irar2 commented Dec 6, 2016

Right :(

@groue

groue commented Dec 6, 2016

You're not the first database library that has difficulties with Swift's non-throwable sequences, and you may appreciate Cursors. Basically they look like lazy sequences, but designed for wrapping external resources that may throw errors when iterated.

The link above provides a general Cursor protocol that you can adopt, and an implementation for many expected methods like contains, enumerated, filter, flatMap, map, reduce, etc.

The protocol:

/// A type that supplies the values of some external resource, one at a time.
public protocol Cursor : class {
    /// The type of element traversed by the cursor.
    associatedtype Element
    
    /// Advances to the next element and returns it, or nil if no next element
    /// exists. Once nil has been returned, all subsequent calls return nil.
    func next() throws -> Element?
}

Usage:

let cursor = ...
while let elem = try cursor.next() { ... }
try cursor.forEach { elem in ... }

If this looks right to you, you may support the proposal on the Swift Evolution mailing list.
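Here is a minimal sketch of how such a cursor could surface a mid-stream failure. The protocol is restated with `AnyObject` (the current spelling of the `class` constraint); `FetchError`, `FailingCursor`, and this `reduce` helper are hypothetical illustrations, not GRDB's actual implementation.

```swift
// The Cursor protocol from above, restated so this sketch compiles on its own.
protocol Cursor: AnyObject {
    associatedtype Element
    func next() throws -> Element?
}

extension Cursor {
    // Hypothetical helper: like Sequence.reduce, but errors from next() propagate.
    func reduce<T>(_ initial: T, _ combine: (T, Element) throws -> T) throws -> T {
        var accumulated = initial
        while let element = try next() {
            accumulated = try combine(accumulated, element)
        }
        return accumulated
    }
}

struct FetchError: Error {}

// Hypothetical cursor over in-memory values that can fail at a chosen position.
final class FailingCursor: Cursor {
    private let values: [Int]
    private let failAt: Int?
    private var position = 0

    init(values: [Int], failAt: Int?) {
        self.values = values
        self.failAt = failAt
    }

    func next() throws -> Int? {
        // Simulated driver failure: thrown, not folded into nil.
        if position == failAt { throw FetchError() }
        guard position < values.count else { return nil }
        defer { position += 1 }
        return values[position]
    }
}

var sum = 0
var sawError = false
do {
    // Completes normally: every element is consumed.
    sum = try FailingCursor(values: [1, 2, 3], failAt: nil).reduce(0) { $0 + $1 }
    // Fails on the third call to next(); the error reaches the caller.
    _ = try FailingCursor(values: [1, 2, 3], failAt: 2).reduce(0) { $0 + $1 }
} catch {
    sawError = true
}
```

Unlike a Sequence, the end of data (nil) and a failure (a thrown error) stay distinguishable at the call site.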

@irar2
Contributor

irar2 commented Dec 8, 2016

@groue Thank you, your Cursor works great and solves our problem of not being able to throw an error with Sequence. We'd love for you to create a PR against Swift-Kuery with this class. (We will be able to accept it within a week or so.)

@groue

groue commented Dec 8, 2016

Thanks @irar2. I'm not sure which kind of PR you'd like to see.

It's easy to add the Cursor.swift file. Since it defines the Cursor protocol and the bunch of built-in concrete cursors (MapCursor, FilterCursor, FlattenCursor, etc.), it does not contain any concrete cursors from GRDB itself. It's just the protocol and its helper functions, ready to welcome future Kuery cursors.

But you don't need me for that: just copy that file into Kuery. You can embed the GRDB license in it (or not, should the MIT license of this file conflict with Kuery's Apache license).

However, changing Kuery so that it actually uses cursors is another beast: it heavily depends on the design decisions of your team. Should Query use cursors, all drivers should return cursors as well (if you want errors to propagate correctly, from the initial production of rows (by drivers) to the high-level Kuery APIs). I'm not the correct person for such a redesign.

@groue

groue commented Dec 8, 2016

For example, the SQLite driver is still fundamentally array-based (1, 2), and does not support cursors at all.

@ianpartridge
Contributor

Thanks @groue. The SQLite driver is certainly primitive at the moment - plenty to do!

@groue

groue commented Dec 8, 2016

You're right, @ianpartridge. The ResultFetcher protocol, which looks like it is the fundamental fetcher type for drivers, is designed to have both blocking and non-blocking fetchNext() methods. Not really identical to my simple cursors, whose next() method is blocking :-)

See? I can't really send a pull request. I'm just not entitled to do it.

@ianpartridge
Contributor

Well, it's open source - anyone who is interested and motivated is encouraged to open PRs :) @irar2 was just being friendly.

Your thoughts/comments are very welcome too, of course.

@groue

groue commented Dec 8, 2016

@ianpartridge Thanks :-) @irar2 did push the first revision of ResultFetcher, so maybe she's the most able to understand the consequences of making this type able to throw errors at each single step. Those consequences, I believe, are not light. Such a decision requires deep involvement in Kuery's features and future, which I don't have yet - GRDB.swift is already quite an involving side-job. Have a look at its query generation features, when you have time. Its goals are not exactly the same as Kuery's, but they quite overlap.

@irar2
Contributor

irar2 commented Dec 8, 2016

@groue I didn't mean to ask you to fix our plugins. We just need your Cursor.swift to be in Swift-Kuery. The easiest way for us would be if you could create a PR with it (just to avoid all the bureaucracy issues). No need to change any Swift-Kuery code.

Just to clarify, ResultSet has two parts: the blocking RowSequence (which we would like to change to RowCursor) and the non-blocking fetchNext(), to enable the user to choose how he/she wants to work.
