
Fdb b+tree reduce #3018

Closed
wants to merge 1 commit

Conversation

garrensmith
Member

Overview

Reduce on FDB using ebtree. This is a new reduce implementation that uses ebtree to provide the built-in reduce functions.
This requires that #3017 is merged first.

Testing recommendations

Related Issues or Pull Requests

Checklist

@garrensmith garrensmith changed the base branch from master to prototype/fdb-layer July 21, 2020 11:56
@garrensmith garrensmith force-pushed the fdb-btree-reduce branch 3 times, most recently from 736b4c6 to a0df7ba on July 22, 2020 14:32


create_val(Key, Val) ->
    KeySize = erlang:external_size(Key),
Member

If this is intended to be the billing size, it should use the couch_ejson_size:encoded_size/1 function, as the current code does. Check with Eric, who moved the size calculation away from external_size last time.
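For illustration, the suggested change would look something like the sketch below; the exact shape of the value returned is an assumption, not the actual create_val/2 from this PR.

create_val(Key, Val) ->
    % Hypothetical sketch: bill on the JSON-encoded size via
    % couch_ejson_size:encoded_size/1 instead of erlang:external_size/1.
    KeySize = couch_ejson_size:encoded_size(Key),
    ValSize = couch_ejson_size:encoded_size(Val),
    {KeySize + ValSize, Val}.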

Member Author

I've followed how the sizing is calculated in couch_views_indexer. There is a note there to change to couch_ejson_size:encoded_size/1, so we should probably change both in a separate PR.

@garrensmith garrensmith marked this pull request as ready for review July 23, 2020 07:41
@garrensmith
Member Author

This PR is ready for a first review. Some notes about the design:

  1. I don't store the map values in the b-tree; I only store the reduce results. That means the leaf nodes contain the reduce values and the size of the reduce values. I didn't want to store duplicates of the map k/vs in the b-tree, as that doesn't make any sense to me.

  2. I have a separate b-tree for each reduce function. So if we have a map function with multiple reduce functions, each reduce is in its own b-tree. I've done this to limit the size of the k/vs and nodes in the b-tree, so that we are less likely to exceed any of FDB's k/v size limits when we use a higher order for a b-tree. (A rough sketch of how 1 and 2 fit together follows after these notes.)

  3. I don't have very good tests yet for the size calculation. I'm still trying to decide the best approach for that.
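A rough sketch of how notes 1 and 2 could fit together, assuming the ebtree API from #3017 (ebtree:open/4 with a reduce_fun option and ebtree:insert/4); the helper names, subspace layout, and tree order used here are illustrative, not the actual couch_views code:

% Hypothetical: one ebtree per (view, reduce function) pair, identified
% by some TreeId derived from the view signature and reduce function.
open_reduce_tree(Tx, DbPrefix, TreeId, ReduceFun) ->
    Prefix = erlfdb_tuple:pack({TreeId}, DbPrefix),
    ebtree:open(Tx, Prefix, 10, [{reduce_fun, ReduceFun}]).

% Hypothetical: store only the per-document reduce result, keyed on
% {EmittedKey, DocId}, so the map k/vs are never duplicated in the b-tree.
update_reduce_idx(Tx, Tree, DocId, Key, ReduceVal) ->
    KVSize = erlang:external_size(Key) + erlang:external_size(ReduceVal),
    ebtree:insert(Tx, Tree, {Key, DocId}, {KVSize, ReduceVal}).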

Implements built in reduces using ebtree.
Member

@rnewson rnewson left a comment


I've left a few comments on the code itself but my main comments are on the design.

Your 1 and 2 strike me as the wrong approach, though I think I see your reasoning. The first item's claim to "store the reduce results" is misleading to the casual reader as, without something like ebtree and its storage of data on inner nodes, it's not possible to calculate useful intermediate reductions. What I think you're doing is reducing the k-vs emitted by a single document? If so, that is not something that CouchDB has done to date and it seems to have limited value, certainly in the common case where a map function emits one row per document.

A simple design that uses less space overall would be to insert the emitted keys and values directly into ebtree and pass in a reducer function that calculates each of the desired reductions specified, storing those in a tuple or a map. couch_views can then call ebtree's functions for lookups (?key=), ranges (for ?startkey=X&endkey=Y&reduce=false) and the various reduce functions as needed (group=true, group_level=X, reduce=true). This ensures we only store each distinct thing once, and the logic gets much simpler.
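As a sketch of what such a combined reducer could look like, assuming ebtree's reduce callback receives the key/value list plus a rereduce flag (as in #3017); the _count/_sum handling is illustrative rather than the real couch_views reduce code:

% Hypothetical combined reducer: the emitted map rows are the tree's
% k/vs, and every reduction the view defines is computed into one map.
combined_reduce_fun(KVs, false) ->
    Vals = [V || {_K, V} <- KVs],
    #{count => length(Vals), sum => lists:sum(Vals)};
combined_reduce_fun(Reductions, true) ->
    #{
        count => lists:sum([maps:get(count, R) || R <- Reductions]),
        sum => lists:sum([maps:get(sum, R) || R <- Reductions])
    }.

couch_views would then pick the entry for the requested reduce function out of the reduction returned by, for example, ebtree:full_reduce/2.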

As for exceeding the fdb size limits, that is a valid concern and we must tackle it head-on. Once we're able to test this more easily (i.e., after this PR) we'll need to figure out the useful ranges for Order. I suspect we will also add the "chunking" that we do for doc bodies, that is, splitting a #node across multiple k-v entries if they get excessively large. I note that Adam has posted on the dev list with some thoughts about moving some of the node state out into their own rows. This occurred to me throughout the development of ebtree and is certainly worth attempting. The tree itself should be small; it's a design choice in ebtree that the emitted keys, values and intermediate reductions are stored inside the fdb value of the inner node. That can be changed fairly easily if warranted.

Finally, on insertion generally: if it's possible to do even a small amount of batching, we'll reap considerable performance rewards. For the case where we update the view atomically with the document, obviously that can't happen, but for new indexes it would be good if we updated 10 or 20 documents per fdb txn that involves an ebtree. Insert performance would be approximately 10x / 20x faster.
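For illustration, that batching might look roughly like the sketch below; fabric2_fdb:transactional/2 is assumed to be the usual fdb-layer transaction wrapper, and the chunking helper and batch size are placeholders:

-define(BATCH_SIZE, 20).

% Hypothetical: insert the rows for ?BATCH_SIZE documents in one fdb txn
% rather than opening a transaction per document. Reusing the Tree handle
% across transactions is an assumption of this sketch.
update_tree_in_batches(Db, Tree, Rows) ->
    lists:foreach(fun(Batch) ->
        fabric2_fdb:transactional(Db, fun(#{tx := Tx}) ->
            lists:foldl(fun({Key, Val}, TreeAcc) ->
                ebtree:insert(Tx, TreeAcc, Key, Val)
            end, Tree, Batch)
        end)
    end, chunk(Rows, ?BATCH_SIZE)).

% Split a list into chunks of at most N elements.
chunk([], _N) ->
    [];
chunk(List, N) when length(List) =< N ->
    [List];
chunk(List, N) ->
    {Batch, Rest} = lists:split(N, List),
    [Batch | chunk(Rest, N)].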

tx := Tx
} = TxDb,

[View] = lists:filter(fun(View) -> View#mrview.id_num == ViewId end, Views),
Member

View = lists:keyfind(ViewId, #mrview.id_num, Views), is the idiomatic way to do this.

end.


% The reduce values are stored as keys in the b-tree
Member

Can you clarify here? I think the reduce value you've calculated outside of ebtree is the reduction over the k-v's emitted by a single document?

@garrensmith
Member Author

Thanks for taking a look @rnewson

> Your 1 and 2 strike me as the wrong approach, though I think I see your reasoning. The first item's claim to "store the reduce results" is misleading to the casual reader as, without something like ebtree and its storage of data on inner nodes, it's not possible to calculate useful intermediate reductions. What I think you're doing is reducing the k-vs emitted by a single document? If so, that is not something that CouchDB has done to date and it seems to have limited value, certainly in the common case where a map function emits one row per document.

> A simple design that uses less space overall would be to insert the emitted keys and values directly into ebtree and pass in a reducer function that calculates each of the desired reductions specified, storing those in a tuple or a map. couch_views can then call ebtree's functions for lookups (?key=), ranges (for ?startkey=X&endkey=Y&reduce=false) and the various reduce functions as needed (group=true, group_level=X, reduce=true). This ensures we only store each distinct thing once, and the logic gets much simpler.

Here is what I'm currently doing. If a document emits the following from a map function:

([1, 1], 1)
([1, 1], 2)
([3, 3], 1)

Then the reduce results for a _sum would be:

([1, 1], 3)
([3, 3], 1)

I would then store those values like this in ebtree

(([1, 1], doc_id), (KVSize, 3))
(([3, 3], doc_id), (KVSize, 1))
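
For illustration, that per-document pre-reduce for _sum could be computed like this (a sketch only; the key/value shapes mirror the example above rather than the actual implementation):

% Hypothetical: group the rows a single document emitted by key and sum
% the values, producing the {Key, DocId} -> {KVSize, Sum} entries above.
per_doc_sum(DocId, MappedRows) ->
    Grouped = lists:foldl(fun({Key, Val}, Acc) ->
        maps:update_with(Key, fun(Sum) -> Sum + Val end, Val, Acc)
    end, #{}, MappedRows),
    [{{Key, DocId}, {erlang:external_size({Key, Sum}), Sum}}
        || {Key, Sum} <- maps:to_list(Grouped)].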

I know this is different from how CouchDB has done it previously, but I don't see what we would gain from storing the map index a second time. Querying the current map index should be faster than reading from the b-tree, since we can do a range scan of the map index. I'm really not comfortable using ebtree for the map index; we would need to determine how performant that would be and what advantages we would get. I would really like to keep ebtree just for the reduce part. I'm very cautious about the idea of using it outside of that; I think we would start losing a lot of the functionality we get with FDB.

> Finally, on insertion generally: if it's possible to do even a small amount of batching, we'll reap considerable performance rewards. For the case where we update the view atomically with the document, obviously that can't happen, but for new indexes it would be good if we updated 10 or 20 documents per fdb txn that involves an ebtree. Insert performance would be approximately 10x / 20x faster.

This is already done in couch_views_indexer.

@davisp davisp mentioned this pull request Aug 12, 2020
@davisp davisp mentioned this pull request Sep 18, 2020
@wohali wohali changed the base branch from prototype/fdb-layer to main October 21, 2020 18:09
@garrensmith
Member Author

No longer needed
