Fast-forward replication through transitive checkpoint analysis #3675

Open · kocolosk opened this issue Jul 21, 2021 · 1 comment

@kocolosk (Member)

Summary

I'd like to be able to choose the starting sequence for a replication between a given source and target using more information than just the replication history between those two databases. Specifically, I'd like to be able to use other replication checkpoint histories to discover transitive relationships that could be used to accelerate the first replication between CouchDB databases that share a common peer.

Desired Behaviour

It might be simplest to provide an example. Consider a system where you have a pair of cloud sites (call them us-east and us-west) and a series of edge locations (e.g. store1):

  • us-east and us-west are replicating with each other
  • store1 is pulling data from us-east
  • us-east experiences an outage, so we respond by initiating us-west -> store1

In the current version of CouchDB, the us-west -> store1 replication will start from 0 because those peers have no replication history between them. Going forward, it would be useful for us to recognize that us-west -> us-east has a history, and us-east -> store1 has a history, so we can fast-forward us-west -> store1 by analyzing the pair of those checkpoint histories to discover the maximum sequence on us-west guaranteed to have been observed on store1 (by way of us-east).
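
To make the analysis concrete, here's a minimal sketch of that pairwise computation, assuming totally ordered integer sequences and a checkpoint schema that records both the source and target sequences; today's checkpoints satisfy neither assumption (see the comment below), and all names here are illustrative rather than existing CouchDB APIs.

# Hypothetical sketch: find the highest us-west sequence guaranteed to be
# on store1 by combining the us-west -> us-east and us-east -> store1
# checkpoint histories.

def fast_forward_seq(src_to_via, via_to_tgt):
    """src_to_via: checkpoints for source -> intermediary (us-west -> us-east);
    via_to_tgt: checkpoints for intermediary -> target (us-east -> store1)."""
    # Highest intermediary (us-east) sequence the target has consumed.
    max_via_seq = max((c["source_seq"] for c in via_to_tgt), default=0)

    # A source -> intermediary checkpoint is transitively covered when the
    # intermediary sequence recorded alongside it has since been replicated
    # on to the target.
    covered = (c["source_seq"] for c in src_to_via
               if c["target_seq"] <= max_via_seq)
    return max(covered, default=0)

# us-west seq 500 reached us-east while us-east stood at seq 800 ...
src_to_via = [{"source_seq": 500, "target_seq": 800}]
# ... and store1 has since pulled us-east through seq 900 (>= 800).
via_to_tgt = [{"source_seq": 900, "target_seq": 1200}]
print(fast_forward_seq(src_to_via, via_to_tgt))  # 500

The us-west -> store1 replication could then start its changes feed at sequence 500 instead of 0.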

Possible Solution

I believe we actually already employ this transitive analysis for fast-forwarding internal replications between shard copies in a cluster, so we may be able to refactor some of that code to apply it more generally.

I'm not sure if we track the target sequence in the current external replication checkpoint schema. That's essential for this analysis to work.

There's nothing fundamental that limits the analysis to first-order transitive relationships. One could build out an entire graph. I'm not sure the extra complexity that would bring is worth it in a first pass.
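
For a sense of what the full-graph version could look like, here's a hedged sketch under the same simplifying assumptions as above (integer sequences, checkpoints that record both endpoints' sequences, illustrative names throughout):

# Hypothetical graph generalization: `checkpoints` maps (source, target)
# peer pairs to lists of {"source_seq": ..., "target_seq": ...} entries.

INF = float("inf")

def guaranteed_seq(checkpoints, source, target, seen=frozenset()):
    """Max sequence on `source` known to have reached `target` via any
    chain of recorded replications (simple paths only, to avoid cycles)."""
    if source == target:
        return INF  # a database trivially holds all of its own sequences
    seen = seen | {source}
    best = 0
    for (src, via), hist in checkpoints.items():
        if src != source or via in seen:
            continue
        downstream = guaranteed_seq(checkpoints, via, target, seen)
        # Entries whose intermediary sequence is covered downstream are
        # transitively guaranteed to be on the target.
        best = max([best] + [c["source_seq"] for c in hist
                             if c["target_seq"] <= downstream])
    return best

ckpts = {
    ("us-west", "us-east"): [{"source_seq": 500, "target_seq": 800}],
    ("us-east", "store1"):  [{"source_seq": 900, "target_seq": 40}],
}
print(guaranteed_seq(ckpts, "us-west", "store1"))  # 500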

Additional context

Proposing this enhancement after chatting with a user who is planning this kind of deployment and would benefit from it.

@nickva (Contributor) commented Jul 21, 2021

I think it might be doable if we record a few more bits in the checkpoint documents and change their shape a bit.

  • It's true that we don't record target sequences in checkpoints [1]. We could add that and bump the checkpoint version format.

  • I think we'd need to uniquely identify endpoint instances by a UUID. Then, we could add a new type of checkpoint, in addition to the current ones, which might look like _local/rep_checkpoint_$fromid_$toid. In main, we recently added a per-db instance "uuid" field [2] already! Now I'm wondering whether it would be possible to add it to 3.x dbs too. A new field in the _dbs docs, perhaps?

  • To have second-order transitive checks, we may have to add a step during checkpointing where the replication jobs copy all the _local/rep_checkpoint_$fromid_$toid checkpoint docs from the source to the target. Then, when us-west -> store1 is initiated, we'd look for common $fromid -...-> store1 and $fromid -...-> us-west checkpoints and find us-east -> store1 (on the store1 target) and us-east -> us-west (on the us-west source); see the sketch at the end of this comment.

  • One thing I am not 100% sure about is whether we could always compare sequences to pick the max start sequence. On main, update_seqs are FDB versionstamps; on 3.x they have shard nodes and uuids in there. We could end up with an arbitrary graph mixing PouchDB, main, and 3.x clustered CouchDB endpoints.

[1] example checkpoint:

{
    "_id": "_local/d99e532c1129e9cacbf7ed085deca509",
    "_rev": "0-17",
    "history": [
        {
            "doc_write_failures": 0,
            "docs_read": 249,
            "docs_written": 249,
            "end_last_seq": "249-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE5tygQLsZiaGqWlpxpjKcRqRxwIkGRqA1H-oSeVgk5JMkkxNkg0xdWUBAJ5nJWc",
            "end_time": "Wed, 21 Jul 2021 17:10:06 GMT",
            "missing_checked": 253,
            "missing_found": 249,
            "recorded_seq": "249-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE5tygQLsZiaGqWlpxpjKcRqRxwIkGRqA1H-oSeVgk5JMkkxNkg0xdWUBAJ5nJWc",
            "session_id": "dc645ae85a7c3fe6c3ac5da8e73077ce",
            "start_last_seq": "228-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE0tzgQLsZiaGqWlpxpjKcRqRxwIkGRqA1H-oSflgk5JMkkxNkg0xdWUBAJgFJVI",
            "start_time": "Wed, 21 Jul 2021 17:01:21 GMT"
        },
        ...
    ],
    "replication_id_version": 4,
    "session_id": "dc645ae85a7c3fe6c3ac5da8e73077ce",
    "source_last_seq": "249-g1AAAACTeJzLYWBgYMpgTmHgz8tPSTV0MDQy1zMAQsMckEQiQ1L9____szKYE5tygQLsZiaGqWlpxpjKcRqRxwIkGRqA1H-oSeVgk5JMkkxNkg0xdWUBAJ5nJWc"
}

[2] Unique, per-db-instance UUID on main:

http $DB/mydb1

{
    ...
    "instance_start_time": "0",
    "sizes": {
        "external": 34,
        "views": 0
    },
    "update_seq": "00000008d5c93d5a00000000",
    "uuid": "ce0279e40045b4f7cd6cd4f60ffd3b3c"
}
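
For illustration, here's a small sketch of the discovery step under the hypothetical _local/rep_checkpoint_$fromid_$toid naming above (again, not an existing CouchDB schema):

def parse_checkpoint_id(doc_id):
    # "_local/rep_checkpoint_<fromid>_<toid>" -> (fromid, toid);
    # assumes the ids are UUIDs, so they contain no underscores.
    _prefix, from_id, to_id = doc_id.rsplit("_", 2)
    return from_id, to_id

def common_upstream_peers(source_doc_ids, target_doc_ids):
    """Peers with a recorded replication into both endpoints of a new job."""
    into_source = {parse_checkpoint_id(i)[0] for i in source_doc_ids}
    into_target = {parse_checkpoint_id(i)[0] for i in target_doc_ids}
    return into_source & into_target

# For a new us-west -> store1 job: us-east -> us-west was checkpointed on
# us-west, and us-east -> store1 on store1, so us-east is the common peer.
on_us_west = ["_local/rep_checkpoint_useast_uswest"]
on_store1 = ["_local/rep_checkpoint_useast_store1"]
print(common_upstream_peers(on_us_west, on_store1))  # {'useast'}

A match would hand the pair of histories over to the transitive analysis from the issue description to pick the start sequence.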
