
Transaction Support #109

Closed
ritch opened this issue Dec 20, 2013 · 24 comments

@ritch
Member

ritch commented Dec 20, 2013

/to @raymondfeng @altsang
/cc @bajtos

In the interest of LoopBack reliability, I'd like to start a discussion on making LoopBack ACID compliant. After dusting off earlier attempts to look into the issue, I had an idea.

I think Transactions + Multiversioning will get us 80% or more to ACID.

Transactions

var domain = require('domain');

function Transaction(dataSource) {
  this.domain = domain.create();
  this.dataSource = dataSource;
  this.id = this.getHashCode(); // unique, ideally sortable id (not shown)
  this.snapshots = [];

  // expose the transaction on its domain so Transaction.getCurrent() can find it
  this.domain.transaction = this;
  this.domain.on('error', this.onError.bind(this));
}

Transaction.prototype.commit = function(callback) {
  this.domain.run(function run() {
    this.dataSource.startTransaction(this, callback);
  }.bind(this));
};

Transaction.prototype.snapshot = function(name, callback) {
  this.snapshots.push(this.dataSource.snapshot(name, this, callback));
};

Transaction.prototype.rollback = function(name, callback) {
  this.currentSnapshot = name ? this.getSnapshot(name) : this.snapshots.pop();
  if (this.currentSnapshot) {
    this.dataSource.applySnapshot(this.currentSnapshot, callback);
  }
};

Transaction.prototype.onError = function(err) {
  this.rollback(null, function rollback(rollbackErr) {
    this.dispose();
    throw rollbackErr || err;
  }.bind(this));
};

Transaction.prototype.dispose = function(callback) {
  this.domain.dispose();
  this.dataSource.endTransaction(this, callback);
};

Transaction.getCurrent = function() {
  return domain.active && domain.active.transaction;
};

// strong-remoting hooking
remotes.before('**', function(ctx, next, method) {
  var t = ctx.transaction = new Transaction(method.constructor.dataSource);
  t.domain.add(ctx.req);
  t.domain.add(ctx.res);
  t.commit(next);
});

remotes.after('**', function(ctx, next) {
  ctx.transaction.dispose(next);
});

The above requires the dataSource / connector to implement methods that delegate to driver transactions, or to multiversioning. To support multiversioning, an outer wrapper of the connector could be devised that tracks each method call and model instance and changes the behavior slightly.
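A minimal sketch of what such connector-level hooks could look like. The method names (startTransaction, snapshot, applySnapshot, endTransaction) come from the proposal above; the wrapper, its bookkeeping, and the method bodies are assumptions for illustration, not an existing LoopBack API.

// Hypothetical connector wrapper; bodies are placeholders showing where a
// real connector would plug in.
function TransactionalConnector(connector) {
  this.connector = connector;    // the real connector (oracle, mongodb, ...)
  this.active = {};              // transaction.id -> per-transaction state
}

TransactionalConnector.prototype.startTransaction = function(tx, cb) {
  // SQL connectors: acquire a dedicated connection and issue BEGIN;
  // NoSQL connectors: just record the transaction id for multiversioning.
  this.active[tx.id] = { ops: [] };
  process.nextTick(cb);
};

TransactionalConnector.prototype.snapshot = function(name, tx, cb) {
  // capture enough state to undo the operations performed so far
  var snap = { name: name, ops: this.active[tx.id].ops.slice() };
  process.nextTick(cb);
  return snap;
};

TransactionalConnector.prototype.applySnapshot = function(snap, cb) {
  // undo (or replay) operations until the state matches the snapshot
  process.nextTick(cb);
};

TransactionalConnector.prototype.endTransaction = function(tx, cb) {
  // release the dedicated connection / drop the bookkeeping
  delete this.active[tx.id];
  process.nextTick(cb);
};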

Multiversioning

By default, all update operations during a transaction would create copies of the given model instance and tag them with the transaction.id. Models created during the transaction would also be tagged with the transaction.id.

By default, all queries would exclude instances tagged with transaction.ids.

Failure / Cleanup

If the transaction fails, all models tagged with a transaction.id are removed. This could also be ensured on startup / by a job of some sort that essentially garbage collects tagged instances.

If the transaction completes without error, all tagged model instances would replace the matching untagged models.
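A rough sketch of how the tagging and cleanup could work, assuming a document-style store. The _txId field and the collection API (findOne/insert/find/remove) are illustrative assumptions only:

// write path: instead of updating in place, store a tagged copy
function taggedUpdate(collection, tx, where, data, cb) {
  collection.findOne(where, function(err, original) {
    if (err) return cb(err);
    var copy = Object.assign({}, original, data, { _txId: tx.id });
    collection.insert(copy, cb);
  });
}

// read path: regular queries never see uncommitted (tagged) copies
function defaultFind(collection, where, cb) {
  collection.find(Object.assign({ _txId: null }, where), cb);
}

// failure / startup garbage collection: drop everything tagged with tx.id
function rollbackTransaction(collection, tx, cb) {
  collection.remove({ _txId: tx.id }, cb);
}

// success: promote tagged copies over their untagged originals
function commitTransaction(collection, tx, cb) {
  collection.find({ _txId: tx.id }, function(err, copies) {
    if (err) return cb(err);
    // for each copy: overwrite the matching untagged record and drop the tag
    // (ordering matters here; see the concern about sortable ids below)
    cb();
  });
}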

Concerns
During the final phase of a successful transaction, since transactions aren't isolated by locking, there is a potential for cleanup to happen out of order. If the transaction.id is sortable, we might be able to avoid this.
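If a sortable id is needed, a timestamp-plus-counter scheme is one option, so that lexicographic order matches creation order and cleanup can be applied in commit order. A minimal sketch; the format is an assumption, not part of the proposal:

// hypothetical sortable transaction id: zero-padded timestamp + counter
var txCounter = 0;

function nextTransactionId() {
  var ts = String(Date.now()).padStart(14, '0');
  var seq = String(txCounter++ % 1e6).padStart(6, '0');
  return ts + '-' + seq; // e.g. "01734700000000-000042"
}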

@bajtos
Member

bajtos commented Dec 20, 2013

Adding @piscisaureus, he might want to comment on the way domains are used here.

First of all, I think adding some sort of transactions would be great, as it will allow us to implement transaction-level caching and save database roundtrips when the same object is queried again.

Could you add a code sample showing how to do a transaction with commit? How to rollback the transaction on error?

Getting transactions right is a very tricky business. Imagine this scenario (Tx1, Tx2 and Tx3 are transactions started by different HTTP requests):

  1. Tx1 reads Account A1 balance ($10).
  2. Tx1 updates A1.balance - adds $1.
  3. Tx2 reads A1.balance. If my understanding is correct, your proposal says that Tx2 will always read the original A1.balance ($10) due to multiversioning.
  4. Tx2 updates A1.balance - adds $2.
  5. Tx1 is committed; A1.balance is $11.
  6. Tx2 is committed. What happens now? Is A1.balance $12? Does the commit fail due to a versioning conflict?
  7. Tx3 reads A1.balance. What is the result?

If we changed step 3 so that Tx2 gets the updated value ($11), consider an alternative scenario where Tx1 is rolled back in step 5. Swapping the order of steps 2 and 3 makes the situation even more interesting.

Things get much more complicated when we want to have a single abstraction for SQL and multiple NoSQL stores, because each database uses a very different approach to achieve consistency. To name a few:

  • RDBMSs (SQL) have full ACID transactions and have ways to solve the problem described above, usually using locks. E.g. Tx1 could lock the whole Account1, so that Tx2 starts only after Tx1 is finished.
  • Riak uses vector clocks; writes are always successful and reads report consistency conflicts [1]. Tx3 would have to resolve the conflict created by Tx2's write.
  • MongoDB seems to use the "last-write-wins" and "commutative/foldable operations" strategies [2]. Since LoopBack converts "add $1" to "set $11", Tx2 creates inconsistent data (see the sketch after the references below).
  • CouchDB returns 409 Conflict when you attempt to save an update that is based on an outdated version [3]. Commit of Tx2 would fail with 409.

[1] https://docs.basho.com/riak/1.3.0rc1/references/appendices/comparisons/Riak-Compared-to-MongoDB/ (row Data Versioning and Consistency)
[2] https://blog.mongodb.org/post/520888030/on-distributed-consistency-part-5-many-writer
[3] https://wiki.apache.org/couchdb/HTTP_Document_API#PUT
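To illustrate the MongoDB point: a foldable operator like $inc stays consistent under concurrent writers, while the read-modify-write with $set that LoopBack issues today is last-write-wins. A sketch using the native mongodb driver (2.x-style callback API; database, collection and field names are made up):

var MongoClient = require('mongodb').MongoClient;

MongoClient.connect('mongodb://localhost/bank', function(err, db) {
  if (err) throw err;
  var accounts = db.collection('accounts');

  // foldable: concurrent "+$1" and "+$2" increments always end up +$3
  accounts.update({ _id: 42 }, { $inc: { balance: 1 } }, function(err) {
    // ...
  });

  // last-write-wins: the read-modify-write below loses one of the two
  // updates when transactions interleave (the scenario in steps 1-6 above)
  accounts.findOne({ _id: 42 }, function(err, acc) {
    if (err) throw err;
    accounts.update(
      { _id: 42 },
      { $set: { balance: acc.balance + 1 } },
      function(err) { /* ... */ }
    );
  });
});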

@ritch
Member Author

ritch commented Dec 20, 2013

Getting transactions right is a very tricky business.

Agreed.

When you have concurrent transactions against the same data, isolation can only be maintained by ensuring order. Locking makes this reliable but slow. Snapshot isolation is how I think we would be able to handle this generically. The idea is that a transaction would fail if any of its updates conflict. The upside is performance and a relatively simple implementation. The downside is that users have to handle conflict errors for all transactions.

  • In step 4, Tx2 updating a model that is tagged with a transaction.id will cause a conflict error in step 6.
  • Tx2 will fail and all of its operations will be rolled back.
  • Tx3 would see a balance of $11.
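A rough sketch of the conflict check this implies: at commit time, the transaction is rejected if any record it wrote was modified after it read it. The record/store shape (readVersion, currentVersion, discard, promote) is an assumption for illustration:

// hypothetical write-conflict detection for snapshot isolation
function commitWithConflictCheck(store, tx, cb) {
  var conflict = tx.writes.some(function(w) {
    // w.readVersion is the version the transaction saw when it read the record
    return store.currentVersion(w.id) !== w.readVersion;
  });

  if (conflict) {
    // roll back the transaction's tagged copies and surface the error
    return store.discard(tx.id, function() {
      cb(new Error('Transaction ' + tx.id + ' aborted: write conflict'));
    });
  }

  // no conflict: promote tagged copies and bump the record versions
  store.promote(tx.id, cb);
}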

@raymondfeng
Member

Let's take advanced features (such as 2PC, nested TXs, and NoSQL) out of the discussion and focus on the simple 1PC case.

  1. 1PC transactions for RDBs need to use the same connection
  2. We need to have a way to control the transaction boundary (begin --> commit/rollback)
  3. We need to have a way to set and access the current transaction and associate the transaction with exactly one connection
  4. I assume as long as the TX is active, all SQL statements run against the connection will be part of the TX, no matter whether they are executed in parallel or serially from Node's perspective.

@bajtos
Member

bajtos commented Dec 20, 2013

@raymondfeng Before taking NoSQL out of the discussion, please consider this: because we don't support MongoDB's foldable operators and instead always use $set, which works in a "last-writer-wins" manner, it's impossible to achieve data consistency in a vanilla LoopBack application at the moment.

Unless the plan is to have a set of transaction APIs, a different API for every database type (SQL, Mongo, Couch, etc.), in which case it's OK to focus on SQL only. (But I don't think such a set of APIs could be called "easy to work with".)

@ritch Failing transactions with a conflict is a possible solution. We will probably have to associate transaction.id with Model.id, not with a Model instance, because there may be multiple instances (JS objects) representing the same database record (Model.id), but that's an implementation detail.

@bajtos
Member

bajtos commented Dec 26, 2013

I did a bit of thinking about this topic in the past days and would like to make a few more points.

  1. LoopBack is database agnostic; it makes it easy to use existing data in any database. Applying this approach to read/write consistency, LoopBack should make it easy to use the tools provided by the database, not invent a new one. This means transactions for SQL, foldable operators for Mongo, etc. (I do realise this is sort of a 180 degree turn from my previous comment.)
  2. Making LoopBack transactions a very thin layer on top of the native SQL transactions will give us several benefits. Most importantly, we can provide more features with less effort. Secondly, LoopBack users can re-use the skills and expertise they already have in their company, instead of learning how to work with yet another mechanism for maintaining data consistency.

I am proposing the following incremental steps:

1. Expose native low-level API

The first step is to allow LB users to re-use a single connection for multiple commands/queries and allow them to call the native API (send SQL commands, call Mongo's update with foldable operators, etc.).

This addresses the following points from Raymond's comment:

(1) 1PC transactions for RDBs need to use the same connection
(2) We need to have a way to control the transaction boundary (begin --> commit/rollback)

/*** Oracle ***/
var loopback = require('loopback');
var oracle = loopback.createDataSource({ /* Oracle config */ });

// to avoid an extra callback, the connection is initialised lazily
// as part of sending the first request
var conn = oracle.getConnection();
conn.execute('START TRANSACTION', function(err) {
  if (err) { conn.close(); return fail(); }
  conn.execute(
    'SELECT balance FROM accounts WHERE id = 42',
    function(err, res) {
      if (err) { /* rollback, close connection, return fail */ }
      conn.execute(
        'UPDATE accounts SET balance = :bal WHERE id = 42',
        { bal: res[0].balance + 1 },
        function(err) {
          if (err) { /* rollback, close connection, return fail */ }
          conn.execute(
            'COMMIT',
            function(err) {
              conn.close();
              if (err) return fail();
              done();
            }
          );
        }
      );
    }
  );
});

/*** MongoDB ***/
var mongo = loopback.createDataSource({ /* MongoDB config */ });

var conn = mongo.getConnection();
conn.accounts.update({ _id: 42 }, { $inc: { balance: 1 } }, function(err) {
  conn.close();
  if (err) return fail();
  done();
});

2. Models

Once we have the low-level mechanisms available, we can integrate them with Models. As Raymond wrote:

(3) We need to have a way to set and access the current transaction and associate the transaction with exactly one connection
(4) I assume as long as the TX is active, all SQL statements run against the connection will be part of the TX, no matter whether they are executed in parallel or serially from Node's perspective.

I see two possible approaches here:

  1. Keep using singletons: make a singleton-like Transaction object that will be used by all DAO objects (Models), probably as described by Ritchie at the beginning of this thread. Due to the async nature of Node, we have to come up with a mechanism for saving the Transaction instance when the flow enters an async call and restoring the correct Transaction object when a callback is called later (a sketch of one such mechanism appears after this list).

    (time-line view)
    [A] HTTP request comes in
    [A] Tx1 is created
    [A] (static DAO functions like findById use Tx1)  
    [A] Tx1 calls async fn
    [B] another HTTP request comes in
    [B] Tx2 is created
    [B] (static DAO functions use Tx2)
    [B] Tx2 calls async fn
    [A] async fn callback is called
    [A] (static DAO functions use Tx1)
    etc.
    
  2. Session & repository pattern

    Instead of using the ActiveRecord pattern, switch to the Repository pattern when running inside transactions. Because all static functions like findById are accessed via member properties, we don't need to worry about restoring a global/singleton transaction on async boundaries.

    var tx = db.createTransaction();
    // tx.accounts.findById is equivalent to
    // Account.findById within the transaction tx
    tx.accounts.findById(10, function(err, res) {
      // etc.
      tx.commit();
    });
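For approach 1, domains (as in Ritchie's original sketch) are the usual Node mechanism for carrying such context across async boundaries. A hedged illustration; runInTransaction and currentTransaction are hypothetical helpers:

var domain = require('domain');

// run `fn` with `tx` as the ambient transaction
function runInTransaction(tx, fn) {
  var d = domain.create();
  d.transaction = tx;   // stash the transaction on the domain
  d.run(fn);            // async callbacks created inside stay bound to `d`
}

// lookup used by static DAO functions such as findById
function currentTransaction() {
  return domain.active && domain.active.transaction;
}

// usage: callbacks scheduled inside the domain still see the right transaction
var tx1 = { id: 'tx-1' };  // stand-in for a real Transaction object
runInTransaction(tx1, function() {
  setTimeout(function() {
    console.log(currentTransaction() === tx1); // true
  }, 10);
});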

3. REST

Provide convenient helpers to automatically wrap certain remotable methods in a transaction. Note that this should be an optional detail built on top of existing APIs, because for more complex operations involving multiple external services, users should limit the duration of the DB transaction to only the relevant part of the operation (i.e. don't hold a DB transaction open while waiting for a result from a SOAP call).

This can be implemented via strong-remoting hooks as described by Ritchie, or even by extending LDL and adding a new configuration property.

fn.shared = true;
fn.wrapInTransaction = true;
fn.accepts = [ /* etc. */ ];

@ritch
Member Author

ritch commented Dec 30, 2013

Expose native low-level API

+1

@ritch
Member Author

ritch commented Dec 30, 2013

... example above

var tx = db.createTransaction();
// tx.accounts.findById is equivalent to
// Account.findById within the transaction tx
tx.accounts.findById(10, function(err, res) {
  // etc.
  tx.commit();
});

This is a great idea.

@ritch
Member Author

ritch commented Dec 30, 2013

LoopBack should make it easy to use the tools provided by the database, not invent a new one.

After contemplating this a bit more, this is where I landed as well. We don't have to hold back features for the sake of compatibility / abstraction. We should expose features as they exist and clearly explain the discrepancies.

@raymondfeng raymondfeng self-assigned this Feb 28, 2014
@raymondfeng raymondfeng added this to the 2.0.0 milestone Feb 28, 2014
@bajtos bajtos mentioned this issue Sep 30, 2014
@bajtos bajtos modified the milestone: #Rel lb 2.0.0 Sep 30, 2014
@FelipeAguayo

Have you implemented this (transactions)? I really need this in my project.

@bajtos
Member

bajtos commented Jan 19, 2015

Have you implemented this?

Not yet, it hasn't been a priority for us.

@FelipeAguayo

Uhmmm, data consistency should be a priority, especially if the framework works with databases.

@altsang
Contributor

altsang commented Jan 19, 2015

If the DB driver supports transactions, you should be able to call it natively from the LB connector. We're not implementing transactions at the Juggler level at this time. 2PC is difficult enough when multiple instances of a data source support it, let alone across multiple disparate data sources.
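For example, with the MySQL connector you can reach the underlying driver and use its native transaction API directly. A hedged sketch, assuming the connector exposes its connection pool as dataSource.connector.client (an implementation detail; check your connector version); the app and data source names are made up:

var app = require('./server/server');  // hypothetical LoopBack app
var ds = app.dataSources.mysqlDs;      // hypothetical data source name
var pool = ds.connector.client;        // assumption: the underlying mysql pool

pool.getConnection(function(err, conn) {
  if (err) return console.error(err);
  conn.beginTransaction(function(err) {
    if (err) { conn.release(); return console.error(err); }
    conn.query(
      'UPDATE accounts SET balance = balance - 10 WHERE id = ?', [1],
      function(err) {
        if (err) return conn.rollback(function() { conn.release(); });
        conn.commit(function(err) {
          if (err) return conn.rollback(function() { conn.release(); });
          conn.release();
        });
      }
    );
  });
});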

Our priorities for @bajtos and the rest of the very small team are driven by our customer obligations and community feedback such as yours. We appreciate the +1 in this area. If there's an urgent commercial need for support, please feel free to contact our sales team.

@FelipeAguayo

OK, thanks for the answers. I'll wait for later versions.

@vladmiller

Absence of transactions is a pain.

@violet-day

👍

@javiboo

javiboo commented Apr 15, 2015

+1

@melochale

+1

@SnehalSatardekar

Hello,

Have you implemented "Transactions" yet?

@raymondfeng
Member

Not yet. I'm in the process of refactoring the RDBMS connector implementations, which will make it easier to add transaction support.

@antoineverger

+1

@raymondfeng raymondfeng added #tob and removed #tob labels May 12, 2015
@richardaum

In Strongloop, is it possible to attach a transaction only to a single Model? Is it possible to use operations on multiple models inside one transaction block without chaining transactions, as in SQL?

@raymondfeng
Member

There is no reliable way in Node to implicitly propagate the transaction context over the async invocation chain. As a result, we chose to use the options arg to control the transaction scope.
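A hedged sketch of the options-arg approach, based on the transaction API documented for loopback-datasource-juggler with SQL connectors (check the docs for your version); the Post/Comment models and the done callback are made up:

// begin a transaction on a SQL-backed model...
Post.beginTransaction({ isolationLevel: 'READ COMMITTED' }, function(err, tx) {
  if (err) return done(err);
  // ...and pass {transaction: tx} in the options arg of every operation
  // that should run inside it
  Post.create({ title: 'first' }, { transaction: tx }, function(err, post) {
    if (err) return tx.rollback(function() { done(err); });
    Comment.create({ postId: post.id, body: 'hello' }, { transaction: tx }, function(err) {
      if (err) return tx.rollback(function() { done(err); });
      tx.commit(done);
    });
  });
});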

@visusharma

Is there any way to pass these options to a native SQL query as well?

@bajtos
Member

bajtos commented Apr 28, 2017

Is there any way to pass these options to a native SQL query as well?

@visusharma please open a new issue to discuss that use case.

@strongloop strongloop locked and limited conversation to collaborators Apr 28, 2017