[RFC] Implement Partitions api #126

Open
garrensmith opened this issue Oct 3, 2018 · 4 comments

@garrensmith
Member

garrensmith commented Oct 3, 2018

We are currently adding partition support to CouchDB (apache/couchdb#1605).
This document details how I think the API for partitions should work in nano.

Partitions allow a user to store related documents together in a partition within CouchDB. The new partition endpoints then allow a user to query only the documents in a specific partition. This leads to much faster queries, as CouchDB only needs to fetch documents from a subset of the database's shards.

To store a document in a partition, a user prefixes its id with the partition name, e.g. {_id: "partition1:my-doc", "field": "one"} and {_id: "partition2:my-doc", "field": "one"}.

Then, to query the documents in a partition, you would use these endpoints.
For a traditional Map/Reduce view:
/my-db/_partition/partition1/_design/my-view

And for Mango:
/my-db/_partition/partition1/_find
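
As a rough sketch of how a client could assemble these paths, the helpers below build the partition-scoped prefix and the Mango path shown above. The names partitionPath and partitionFindPath are hypothetical and are not part of nano.

```javascript
// Hypothetical helpers for illustration; not part of nano's API.

// Builds the partition-scoped path prefix: /<db>/_partition/<partition>
function partitionPath(dbName, partitionName) {
  return '/' + encodeURIComponent(dbName) +
    '/_partition/' + encodeURIComponent(partitionName);
}

// Path of the Mango query endpoint for a single partition.
function partitionFindPath(dbName, partitionName) {
  return partitionPath(dbName, partitionName) + '/_find';
}

console.log(partitionFindPath('my-db', 'partition1'));
// prints "/my-db/_partition/partition1/_find"
```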

The idea behind partitions, which I've hopefully conveyed above, is that the data in each partition is quite separate, so a user of a partitioned database would work with each partition separately. I would like to reflect that thinking in the API. I propose adding a new function called partition, which accepts a partition name and returns an object for querying that specific partition. Hopefully the example below explains it.

await nano.db.create('db1', {partition: true});
const db = nano.use('db1');
// Design document with a view keyed on [some, group]
await db.insert({
  views: {
    aview: {
      map: "function(doc) {\n  if (doc.group) {\n    emit([doc.some, doc.group], 1);\n  }\n}",
      reduce: "_count"
    }
  }
}, '_design/example-query');


// These go in partition 1
await db.insert({some: "field"}, 'partition1:doc1');
await db.insert({some: "field2"}, 'partition1:doc2');
// This goes in partition 2
await db.insert({some: "field2"}, 'partition2:doc1');

const partition1 = db.partition('partition1');

// This will only return doc with id `partition1:doc1`
const docs = await partition1.find({
   selector: {
     some: "field"
   }
});

const docs2 = await partition1.view("example-query", "aview", {include_docs: true});

const partition2 = db.partition('partition2');
// A view can be used for each partition
const docsFromPartition2 = await partition2.view("example-query", "aview", {include_docs: true});

The new partition object would support all the .find and .view query options and would internally remember the name of the partition to use when querying.
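
One way the partition object could remember its partition is a small closure, sketched below. This is only an illustration: makePartition and the request function (assumed here to accept method, path, body and qs) are hypothetical stand-ins rather than nano internals, and the view path assumes the partition endpoint follows CouchDB's usual _design/<ddoc>/_view/<view> layout.

```javascript
// Hypothetical sketch of what db.partition() might return.
// `request` stands in for whatever function sends the HTTP call;
// its {method, path, body, qs} shape is an assumption.
function makePartition(dbName, partitionName, request) {
  // The partition name is captured once and reused for every query.
  const base = '/' + encodeURIComponent(dbName) +
    '/_partition/' + encodeURIComponent(partitionName);

  return {
    // Mango query scoped to this partition.
    find(query) {
      return request({ method: 'POST', path: base + '/_find', body: query });
    },
    // Map/Reduce view query scoped to this partition.
    view(ddoc, viewName, params) {
      const path = base + '/_design/' + ddoc + '/_view/' + viewName;
      return request({ method: 'GET', path: path, qs: params || {} });
    }
  };
}

// Usage with a fake request function that only records what it was asked to do.
const recordedCalls = [];
const fakeRequest = (opts) => {
  recordedCalls.push(opts);
  return Promise.resolve({ docs: [] });
};

const examplePartition = makePartition('db1', 'partition1', fakeRequest);
examplePartition.find({ selector: { some: 'field' } });
examplePartition.view('example-query', 'aview', { include_docs: true });
console.log(recordedCalls.map((call) => call.path));
```

Keeping the partition name inside the closure means callers never repeat it, which matches the idea of working with each partition separately.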

Currently we don't support _all_docs or _changes.

@glynnbird
Contributor

If a user has a "partition object" e.g.

const partition1 = db.partition('partition1');

then it might make sense for them to be able to do all CRUD operations:

  • partition1.insert(doc, [params]) - insert and update
  • partition1.get(docname) - fetch single doc
  • partition1.destroy(docname, rev) - delete single doc
  • partition1.bulk(docs, [params]) - bulk C/U/D

This mechanism allows the partition to be expanded in future to support _all_docs and _changes endpoints if they were to be implemented on the partition level.

@glynnbird
Contributor

It's also worth noting that the Nano library includes the search endpoint which models the Cloudant-specific Lucene search API. It might be worth allowing partition1.search(...) too.

@garrensmith
Member Author

@glynnbird good point. I think we should add search, and I like the idea of using insert, get, destroy and bulk. I'm guessing that when we do that, users would not supply the partition themselves and we would automatically insert it for them?

@glynnbird
Contributor

I think so. If someone does partition1.insert({ _id: 'bob', x: 45 }), the document _id would be manipulated to add the partition prefix. Same story for the other operations. The "partition" object in Nano "knows" the partition you are working with so it knows what prefix to add to each document id.
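
A minimal sketch of that id manipulation, assuming the prefix is always added and the caller never includes it. The helper names prefixId, prefixDoc and prefixDocs are hypothetical and not part of nano.

```javascript
// Hypothetical helpers for illustration; not part of nano's API.

// Adds the partition prefix to a bare document id.
function prefixId(partitionName, id) {
  return partitionName + ':' + id;
}

// Returns a copy of the document with a partition-prefixed _id,
// leaving the caller's object untouched.
function prefixDoc(partitionName, doc) {
  return Object.assign({}, doc, { _id: prefixId(partitionName, doc._id) });
}

// The same manipulation applied to every document in a bulk request.
function prefixDocs(partitionName, docs) {
  return docs.map((doc) => prefixDoc(partitionName, doc));
}

const stored = prefixDoc('partition1', { _id: 'bob', x: 45 });
console.log(stored._id);
// prints "partition1:bob"
```

The same prefixId helper could serve get and destroy, which take a document id rather than a document.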

@glynnbird glynnbird self-assigned this Nov 12, 2018