This repository has been archived by the owner on Sep 9, 2023. It is now read-only.

Discussion: Sequence ID validation race condition when requesting snapshot via REST #252

Closed
evan-coygo opened this issue Jan 14, 2021 · 6 comments

Comments

@evan-coygo
Contributor

evan-coygo commented Jan 14, 2021

For exchanges that request the l2 snapshot over REST there is a pretty common race condition issue that can lead to missing order book data.

The problem w/example

I'll use Bittrex as an example since the issue happens often for me with Bittrex. The culprit code looks like this:

  _sendSubLevel2Updates(remote_id, market) {
    this._requestLevel2Snapshot(market);
    this._wss.send(
      JSON.stringify({
        H: "c3",
        M: "Subscribe",
        A: [[`orderbook_${remote_id}_${this.orderBookDepth}`]],
        I: ++this._messageId,
      })
    );
  }

The issue is as follows:

  1. I call subscribeLevel2Updates()
  2. First it makes a REST request to get the snapshot via this._requestLevel2Snapshot(market)
  3. While that REST request is still in flight, the ws "Subscribe" message is sent to subscribe to l2 updates.
  4. The REST call gets its response first, and emits an l2snapshot event with sequenceId of "100".
  5. The ws stream for l2 updates starts arriving, with the first event having a sequenceId of "105".

The snapshot has sequenceId of "100" while the first update has a sequenceId of "105". This means you missed four updates and your order book will never be valid.
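To make the arithmetic concrete, here is a minimal sketch of a gap check. The helper names are hypothetical (nothing here is part of ccxws), and it assumes sequence IDs increase by exactly 1 per update, as in the example above.

```javascript
// Hypothetical helper: counts the updates missing between a snapshot and the
// first streamed update. Assumes sequence IDs increase by exactly 1.
function countMissedUpdates(snapshotSeq, firstUpdateSeq) {
  const snap = Number(snapshotSeq);
  const first = Number(firstUpdateSeq);
  // An update at or before the snapshot is already reflected in it, and
  // snap + 1 is exactly the next expected update: neither leaves a gap.
  return Math.max(0, first - snap - 1);
}

// Hypothetical helper: true when at least one update was lost.
function hasSequenceGap(snapshotSeq, firstUpdateSeq) {
  return countMissedUpdates(snapshotSeq, firstUpdateSeq) > 0;
}
```

With the numbers above, countMissedUpdates("100", "105") gives 4 (updates 101 through 104).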

Solutions

  • Subscribe to l2 updates first, and wait for the first l2 update before requesting the snapshot over REST. This would work, and I believe it may be the best backwards-compatible fix. You would have to ignore any early l2 updates whose sequenceId precedes the snapshot's sequenceId, but that is already necessary with REST snapshots in the current implementation. I imagine something like the code below (not run or tested):
  _sendSubLevel2Updates(remote_id, market) {
    // When the first l2update for this market arrives, request the snapshot so that
    // it is taken at a point in time >= the first l2 update. This avoids a gap
    // between the snapshot and the first l2 update.
    const onFirstUpdate = (data, updateMarket) => {
      if (updateMarket.id !== market.id) return;
      // Detach the listener so the snapshot is requested once and the handler does not leak.
      this.removeListener("l2update", onFirstUpdate);
      this._requestLevel2Snapshot(market);
    };
    this.on("l2update", onFirstUpdate);
    this._wss.send(
      JSON.stringify({
        H: "c3",
        M: "Subscribe",
        A: [[`orderbook_${remote_id}_${this.orderBookDepth}`]],
        I: ++this._messageId,
      })
    );
  }
  • As a solution outside of this library, the code using ccxws could detect this scenario and call client._requestLevel2Snapshot(market) again to get a newer snapshot, then ignore the updates that precede the newer snapshot's sequenceId. This works and is what I'm doing right now, but I'm wondering whether there's a better, more generic way to handle it.
  • If there is no solution then I think this scenario should at least be documented prominently somewhere.
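For reference, the outside-the-library workaround could be sketched roughly as follows. This is an illustration under assumptions, not ccxws code: the BookSynchronizer class and its three callbacks are hypothetical, sequenceId is assumed to be a number that increases by exactly 1 per update, and there is no retry limit or delay, which a real version would want.

```javascript
// Hypothetical consumer-side helper; not part of ccxws.
// Assumes numeric sequence IDs that increase by exactly 1 per update.
class BookSynchronizer {
  constructor(requestSnapshot, applySnapshot, applyUpdate) {
    this.requestSnapshot = requestSnapshot; // e.g. re-requests the REST snapshot
    this.applySnapshot = applySnapshot;
    this.applyUpdate = applyUpdate;
    this.buffer = [];    // updates held until a usable snapshot arrives
    this.lastSeq = null; // last applied sequenceId; null means "not in sync"
  }

  onUpdate(update) {
    if (this.lastSeq === null) {
      this.buffer.push(update);
      return;
    }
    if (update.sequenceId <= this.lastSeq) return; // already covered by the book
    if (update.sequenceId > this.lastSeq + 1) {
      // Gap detected: drop out of sync, keep this update, ask for a newer snapshot.
      this.lastSeq = null;
      this.buffer = [update];
      this.requestSnapshot();
      return;
    }
    this.applyUpdate(update);
    this.lastSeq = update.sequenceId;
  }

  onSnapshot(snapshot) {
    if (this.lastSeq !== null) return; // already in sync; ignore a late snapshot
    const first = this.buffer[0];
    if (first && first.sequenceId > snapshot.sequenceId + 1) {
      // Snapshot is older than the start of the stream: try again.
      this.requestSnapshot();
      return;
    }
    this.applySnapshot(snapshot);
    this.lastSeq = snapshot.sequenceId;
    const pending = this.buffer;
    this.buffer = [];
    for (const update of pending) this.onUpdate(update); // skips stale ones
  }
}
```

In practice the callbacks would be wired to the client's l2snapshot and l2update events for a single market, with requestSnapshot wrapping client._requestLevel2Snapshot(market).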
@bmancini55
Member

@evan-coygo, thanks for submitting the issue! Yeah, this is sort of a known issue that rears its ugly head when you try to build certain books.

To rehash, the issue is that the socket stream starts after the sequenceId returned by the REST API. In many cases I think this is due to request caching in the REST API, so you end up with a snapshot that is slightly stale (lower sequenceId) compared with what you have obtained via the socket stream. It could also just be that establishing socket subscriptions takes longer than the REST request for some exchanges. But I digress on the cause.

From a code perspective, to avoid this issue, you need to perform the snapshot after the subscription has been confirmed and you've queued a sufficient number of messages (intentionally ambiguous as the sufficient depth is likely exchange and latency specific).
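As a rough illustration of that ordering, the snapshot request can be gated on the number of updates already queued. The createSnapshotGate helper and its minQueued parameter are hypothetical names for this sketch; the right threshold is left open because, as noted, it depends on the exchange and on latency.

```javascript
// Hypothetical helper; not part of ccxws.
// Calls requestSnapshot exactly once, after minQueued updates have been seen.
function createSnapshotGate(minQueued, requestSnapshot) {
  let seen = 0;
  let requested = false;
  // Call the returned function once per queued update.
  // It returns true only on the call that triggers the snapshot request.
  return function onUpdateQueued() {
    if (requested) return false;
    seen += 1;
    if (seen < minQueued) return false;
    requested = true;
    requestSnapshot();
    return true;
  };
}
```

The gate only decides when to ask for the snapshot; the queued updates still have to be reconciled against the snapshot's sequenceId once it arrives.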

In the past I've recommended monkey-patching a timed delay over _requestLevel2Snapshot(market) for the client:

const REST_DELAY_MS = 500;
client._originalRequestLevel2Snapshot = client._requestLevel2Snapshot;
client._requestLevel2Snapshot = market => setTimeout(() => client._originalRequestLevel2Snapshot(market), REST_DELAY_MS);

However, this solution isn't foolproof and may introduce some funkiness on reconnections.

Ideally (as usual), the best fix would come after the refactor (#149), which will include subscription success/failure (#103): the snapshot request could then be fired after a delay following subscription success.

As a fallback though, I think you need to be prepared to call _requestLevel2Snapshot again if you receive a snapshot older than your update stream. That should certainly be documented!

@evan-coygo
Contributor Author

Alright, thanks for the monkey-patch recommendation. I've added that as a temp fix and it seems to work okay-ish, albeit without a guarantee of success.

@evan-coygo
Contributor Author

This seems like it'll be around for a while, so I've added it to the README in PR #253.

@bmancini55
Member

Awesome thanks! Will take a look shortly.

Aside: this would be a good issue to convert to a Discussion. Not exactly sure how though... the "Convert to Discussion" link seems to have disappeared. Way to go GitHub haha.

@evan-coygo
Contributor Author

I'll be honest, I didn't know that was a thing. Great idea though, to avoid clutter in Issues. Discussion created at #255; I'm closing this issue.

@bmancini55
Member

Cool thanks! In some weird confluence of events, the "Convert to Discussion" button is back. Weird.
