Couchdb 3287 pluggable storage engines #496
Conversation
Also, for documentation, I'm not sure what we want to do there. Should I open a PR for the docs now, or should we wait and let this soak on master and just let the more adventurous folk take cracks at running and writing new storage engines, to make sure there's nothing major that needs reworking?
Also, I forgot to link out to the first complete alternative storage engines.

https://github.com/cloudant/couch_ngen

The couch_ngen engine (it's a super terrible name, I know) was one that I wrote fairly quickly just to try to prove that the API wasn't artificially restrictive. The biggest thing here is that I'm using multiple files for a single database to try to make things like compaction faster. This is the same general shape as the existing legacy engine, with a few tweaks that we've considered over the years but never actually tried to implement due to upgrade concerns. This also uses a NIF for file I/O with dirty schedulers (thus your Erlang VM needs to have dirty schedulers enabled to run this engine). Initial benchmarks show this engine to be significantly faster than the legacy engine at the local node interface. Unfortunately those gains don't show up nearly as well at the clustered interface (so we've got more work to do there :)

https://github.com/cloudant-labs/lethe/tree/initial-branch

Lethe is the first engine written by !me. This work is by @chewbranca, investigating the storage engine API by actually using it to write a storage engine. Lethe is an ephemeral/in-memory storage engine; think ramdisk database. It's not been benchmarked to the best of my knowledge, but last I heard it was passing the storage engine test suite and could run most of the existing test suite (it doesn't support attachments, which are optional in the API).

There are some other obvious candidates for storage engines as well that I've not tried (on purpose, to let others play with something new and interesting). LevelDB/RocksDB, LMDB, PostgreSQL, SQLite, Bitcask, etc. all come to mind as obvious things to play with.
Also, something I've been contemplating is writing a custom storage engine in C that is behaviorally similar to the existing code but hopefully significantly faster (though remember, if speed is your goal, we've got more work to do in the clustering layer :)
@davisp IMO write the docs and have that PR land at the same time as this one, or they might get left behind.
@@ -14,10 +14,11 @@

-include_lib("couch/include/couch_eunit.hrl").

-define(ENGINE, {couch_bt_engine_stream, {Fd, []}}).
The variable name is hard-coded in a define. Could we do something about it? For example:
-define(ENGINE(Var), {couch_bt_engine_stream, {Var, []}}).
Yeap, that's better.
src/couch/src/couch_att.erl
Outdated
@@ -233,6 +234,14 @@ transform(Field, Fun, Att) ->
    store(Field, NewValue, Att).

copy(Att, DstStream) ->
I couldn't find any calls to this function.
Strange. Looks like I wrote it and then never used it. Will remove it.
@@ -62,6 +63,37 @@ normparts(["." | RestParts], Acc) ->
normparts([Part | RestParts], Acc) ->
    normparts(RestParts, [Part | Acc]).
This looks very similar to https://github.com/erlang/otp/blob/master/lib/stdlib/src/filelib.erl#L103. Is there a reason to write our own implementation? filelib's implementation is considered slow (based on info from the Erlang mailing list) due to its use of regexps; however, this implementation uses them as well.
I'll read through the implementation some more tomorrow (maybe there is a reason).
Answering my own question: this implementation doesn't make a useless call to file:read_file_info(File).
Hah. So, it looks like I'm adding a comment to explain this, because it took me at least ten minutes of staring at this and the original to remember why I did it. While it does avoid the extra read_file_info call, that's ancillary to the main change, which is to accept both directories and files when matching the regular expression. This way a pluggable storage engine can be a directory or a single file (or a group of files not utilizing a directory structure, for some odd reason).
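A rough Python sketch of that matching behavior (illustrative only; the real implementation is Erlang and lives in the couch utility code): any directory entry whose name is the database name plus an extension counts as a candidate, whether the entry is a file or a directory. The function and file names here are hypothetical.

```python
import os
import re
import tempfile

def find_db_candidates(root_dir, db_name):
    # Accept any entry (file OR directory) named "<db_name>.<extension>".
    # A file-only check would reject directory-based engines.
    pattern = re.compile("^" + re.escape(db_name) + r"\.[0-9a-z]+$")
    return sorted(e for e in os.listdir(root_dir) if pattern.match(e))

root = tempfile.mkdtemp()
open(os.path.join(root, "mydb.couch"), "w").close()  # legacy single-file engine
os.mkdir(os.path.join(root, "mydb.ngen"))            # directory-based engine
print(find_db_candidates(root, "mydb"))              # ['mydb.couch', 'mydb.ngen']
```

The key point is that the match happens on names alone, with no file:read_file_info-style stat call needed to decide candidacy.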
Here are the steps to use this branch with couch_ngen (this requires an Erlang VM with dirty schedulers).

Setup dependencies
In the CouchDB repo add these two lines to the

Then run

Edit local.ini
Edit `rel/overlay/etc/local.ini` and add these settings in:

This sets up the list of supported engines and which is the default engine. If you change
Important to note: don't add these settings to the bottom of the file if you are using

Build CouchDB

Run CouchDB

Create database with couch_ngen
If

Boom! You now have multiple storage engines running.
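The settings themselves aren't shown above; as a hypothetical illustration of the shape being described elsewhere in this thread (each file extension mapped to an engine module, plus a default engine), such a local.ini fragment might look like:

```ini
; Illustrative only -- the actual settings from the comment above were not
; preserved. Shape inferred from the engine/extension discussion in this PR.
[couchdb_engines]
couch = couch_bt_engine
ngen = couch_ngen

[couchdb]
default_engine = couch
```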
Awesome, thanks for the notes @garrensmith!
DbFile = filename:join([DbDir, ?b2l(DbName) ++ ".couch"]),
{ok, _} = file:copy(DbFile, DbFile ++ ".backup"),
ok.
{ok, Db} = couch_db:open_int(DbName, []),
Should we abstract it away into get_filepath(DbName)? In this case it could be used from both backup_db_file and restore_backup_db_files.
I'm not seeing what you're suggesting to abstract away here.
@@ -414,37 +390,34 @@ get_before_doc_update_fun(#db{before_doc_update = Fun}) ->
get_committed_update_seq(#db{committed_update_seq=Seq}) ->
    Seq.

get_update_seq(#db{update_seq=Seq})->
    Seq.
get_update_seq(#db{} = Db)->
Minor nitpick: some of the get_ functions return {ok, Result} and some just Result.
This is a nit for me too, but it's historical, and if I were to change it here the diff would spider out to all the places that call these functions, so I left them all as they were.
@@ -531,17 +541,25 @@ map_fold({{Key, Id}, Val}, _Offset, Acc) ->
        user_acc=UAcc1,
        last_go=Go
    }};
map_fold({<<"_local/",_/binary>> = DocId, {Rev0, Body}}, _Offset, #mracc{} = Acc) ->
map_fold(#doc{id = <<"_local/", _/binary>>} = Doc, _Offset, #mracc{} = Acc) ->
I had to stop here for quite some time, so I'm leaving a comment for future reviewers. This change works because we now open the local btree differently: we pass split and join callbacks to couch_btree:open for the local_tree.
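For readers unfamiliar with the split/join idea, here is a loose Python analogy (illustrative only; the real callbacks are Erlang funs passed to couch_btree:open, and the field names below are hypothetical): split flattens a doc into the key/value pair the btree actually stores, and join rebuilds the doc shape on read, which is why map_fold can now receive #doc{} terms from the local tree.

```python
def local_split(doc):
    # On write: a doc-like structure is flattened into the {Key, Value}
    # pair that the local btree stores.
    return doc["id"], (doc["rev"], doc["body"])

def local_join(doc_id, value):
    # On read: the stored {Key, Value} pair is rebuilt into the doc-like
    # structure handed to callers such as map_fold.
    rev, body = value
    return {"id": doc_id, "rev": rev, "body": body}

doc = {"id": "_local/peer", "rev": "0-1", "body": {"seq": 42}}
assert local_join(*local_split(doc)) == doc  # round-trips losslessly
```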
@@ -236,7 +237,9 @@ init_state(Db, Fd, State, Header) ->
        view_states=ViewStates
    } = Header,

IdBtOpts = [{compression, couch_db:compression(Db)}],
IdBtOpts = [
    {compression, couch_compress:get_compression_method()}
This would work, since the old implementation calls couch_compress:get_compression_method() from couch_db_updater:init_db. However, it means we wouldn't be able to support different compression methods for different databases. It also wouldn't be possible to delegate the decision on a compression algorithm to the storage engine (if we are not dispatching through couch_engine).
I'm a bit confused by this comment but I think you have it right. To make sure we agree this was the intent:
The pluggable storage engine API avoids dictating a compression algorithm so that each storage engine can choose its own approach to compression. For instance, a LevelDB engine could just rely on LevelDB applying Snappy under the covers.
However, since we don't have a notion of "database compression", we now have to change the scheme for view compression. Given that the existing approach is to just set it to the result of couch_compress:get_compression_method() (via what is returned when a database is opened, and, more to the point, not when it was created), this is effectively not a behavior change.
@@ -840,6 +843,8 @@ maybe_convert_reductions({KVs0, UserReductions}) ->

maybe_convert_kv({<<"_local/", _/binary>> = DocId, _}) ->
Shouldn't we convert this clause to maybe_convert_kv(#doc{id = <<"_local/", _/binary>>} = Doc) -> ?
It's an upgrade clause for the time being. The #doc version will already be caught on line 846.
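The "upgrade clause" pattern can be sketched in Python for illustration (the real code is Erlang function-clause dispatch; the shapes below are hypothetical stand-ins): older callers may still pass the raw key/value tuple, while newer callers pass the doc form, and both must be accepted during a rolling upgrade.

```python
def maybe_convert_kv(item):
    # Upgrade clause: older code may still hand us the raw {DocId, Value}
    # tuple shape; newer code passes a doc mapping, which falls through
    # to the final branch (mirroring the later #doc{} clause in Erlang).
    if isinstance(item, tuple):
        doc_id, _value = item
        return {"id": doc_id}  # normalize the legacy shape
    return item                # already the new doc shape

print(maybe_convert_kv(("_local/peer", "ignored")))
print(maybe_convert_kv({"id": "_local/peer"}))
```

Once no pre-upgrade nodes remain in the cluster, the tuple branch can be deleted, which is what "for the time being" refers to.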
src/couch/src/couch_server.erl
Outdated
exists(DbName) ->
    RootDir = config:get("couchdb", "database_dir", "."),
    Engines = get_configured_engines(),
    Possible = lists:foldl(fun({Extension, Engine}, Acc) ->
We have exactly the same lines in the get_engine(Server, DbName) function. We could create a common function such as select_candidate_engines(DbName, RootDir, Engines) or select_possible_engines(DbName, RootDir, Engines).
Done.
Let's rebase it on master when you have a chance.
Rebased. Forgot to push earlier. There's just that one function I'm not sure about, namely where it went; I'm guessing the couch_server revert, but I will have to look harder on Monday. This branch is rebased on master, but only loosely: I haven't done a full make check yet, so there might be a lingering issue or two.
I ran CouchDyno's replication tests against PSE and they all passed.
Ninja edit from @davisp: CouchDyno can be found here: https://github.com/cloudant-labs/couchdyno
@davisp: I followed @garrensmith's setup and created a db using the ngen engine. It's probably documented somewhere, but how do I verify that the db created is using another engine, or are we hiding that abstraction? For instance, I can do
@tonysun83 each storage engine should have a different file extension. So if you do
From what I understand of this code (couchdb/src/couch/src/couch_server.erl, lines 629 to 642 in 6375cd4), if you do ?engine=blahgen and that engine is not supported, it will fall back to the configured default engine.
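A rough sketch of that fallback behavior (hypothetical names and module strings; the real dispatch lives in couch_server and is clarified by @davisp further down):

```python
def choose_engine(requested, configured_engines, default="couch"):
    # configured_engines maps file extension -> engine module, e.g.
    # {"couch": "couch_bt_engine", "ngen": "couch_ngen"} (illustrative).
    # An unknown ?engine= value falls back to the configured default.
    if requested is not None and requested in configured_engines:
        return configured_engines[requested]
    return configured_engines[default]

engines = {"couch": "couch_bt_engine", "ngen": "couch_ngen"}
print(choose_engine("ngen", engines))     # couch_ngen
print(choose_engine("blahgen", engines))  # unsupported: couch_bt_engine
print(choose_engine(None, engines))       # no param: couch_bt_engine
```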
@tonysun83 The shard's db info blob should include the engine; however, this wasn't included in the clustered db info because I originally had plans to support upgrading the storage engine for a database. That's not been written yet, but if it were, a clustered db could have multiple engines involved for different shards, and we don't currently surface shard-specific information.

@garrensmith That's correct. It currently just defaults to the configured default engine. And if nothing is configured, the default is the legacy couch engine. Good call on the features list. That was added after the majority of the PSE dev, but that'd be a good thing to support, and as I recall it was fairly simple. I'll take a look at adding that today.
@davisp: given the negative scenario above with
@davisp @nickva: I'm not sure if there has been work to rebase the latest scheduling replicator on PSE, but I'm seeing some function clause errors and weird behavior in my (admittedly weird) scenario.
So after I did this, the odd behavior was that my original mydb had 7 documents, but repngen keeps increasing its document count; currently it's at 20, and the original mydb has not changed at all. It must be something to do with the update_seq. A log is provided below: https://gist.github.com/tonysun83/b600e855ea0d3ae0e6ef92e388cf05c0
@tonysun83 Hrm, I'll add a test for that case and try to track down what's causing the 500s with the bad engine. For that, and for the replicator function clause, it's probably a bad rebase or refactor somewhere.

For the replication you're trying, I'd expect that not to work. Adding query strings to replication URLs seems like it'd not work so hot. Can you check what documents have been added? My guess is that the URL string is screwing things up and the replicator ends up posting random data that gets a UUID, since the target database would interpret everything after the ? as a query parameter. Once it's created, any POST would probably be interpreted as a request to create a new document. And that's not PSE specific: a similar test on not-PSE would be to try to specify a Q or N parameter in the target URL. I'd expect similar results there.
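The point about everything after the ? being a query parameter, not part of the database name, can be seen with any standard URL parser; for example, in Python (the URL here is illustrative, matching the repngen scenario above):

```python
from urllib.parse import urlsplit, parse_qs

# The target URL from a replication doc with an engine query string.
url = "http://localhost:5984/repngen?engine=ngen"
parts = urlsplit(url)

# The database name comes only from the path; the engine parameter is
# query data that only matters on the initial PUT that creates the db.
print(parts.path)             # /repngen
print(parse_qs(parts.query))  # {'engine': ['ngen']}
```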
src/couch/src/couch_db_updater.erl
Outdated
MSecSinceLastRead = couch_file:msec_since_last_read(Fd),
case MSecSinceLastRead > IdleLimitMSec of
LastActivity = couch_db_engine:last_activity(Db),
DtMSec = time:now_diff(os:timestamp(), LastActivity) div 1000,
Should we use timer:now_diff/2 instead of time:now_diff/2?
src/couch/src/couch_bt_engine.erl
Outdated
last_activity(#st{fd = Fd}) ->
    couch_file:get_last_read(Fd).
There is no couch_file:get_last_read/1. Should we change it to couch_file:last_read/1?
Has anyone got any reason for me not to merge this in 72h, or whenever? I've rebased it against master and it's passing CI checks. As far as I remember, everyone was happy with the current state for the V1 pluggable storage engines.
Passes for me locally as well.
+1 to merge from me.
+1 from my side. All tests passed locally.
This relied on having introspection into the state of a #doc{} record that is no longer valid with PSE. This is a harmless change, as it was only necessary when performing a rolling reboot to a version that included the new get_meta_body_size function. We can remove it now because anyone wanting to do a rolling reboot with PSE will have to step through a version that includes commit aee57bf, which includes the previous upgrade logic as well.
This is the primary API for pluggable storage engines. This module serves as both a behavior and a call dispatch module for handling the engine state updates. COUCHDB-3287
This is the legacy storage engine code. I've kept it as part of the core couch application because we'll always need to have at least one storage engine available. COUCHDB-3287
This change moves the main work of storage engines to run through the new couch_db_engine behavior. This allows us to replace the storage engine with different implementations that can be tailored to specific workloads and environments. COUCHDB-3287
This allows other storage engine implementations to reuse the same exact test suite without having to resort to shenanigans like keeping vendored copies up to date. COUCHDB-3287
This re-fixes a corner case when recreating a document with an attachment in a single multipart request. Since we don't detect that we need a new revision until after the document has been serialized, we need to be able to deserialize the body so that we can generate the same revisions regardless of the contents of the database. If we don't do this, then we end up including information from the position of the attachment on disk in the revision calculation, which can introduce branches in the revision tree. I've left this as a separate commit from the pluggable storage engine work so that it's called out clearly for us to revisit. COUCHDB-3255
Last rebase. Will merge once CI comes back green.
Overview
Pluggable Storage Engines (Oh my!)
I've finally covered enough bases to start asking for reviews on the pluggable storage engine work. I know this is a fairly large change, so I don't expect it to actually be merged for a number of weeks. Everyone who wants to review any part of this should feel free to take their time and be thorough.
I highly suggest reviewing this work a commit at a time as there are a few fairly large commits. A few signposts for the intrepid reader:
This is a single-file addition that just introduces the new pluggable storage engine API. It's fairly thoroughly documented and hopefully makes sense as to the level that this API is geared at. When I initially started this work I spent a lot of time trying to figure out what level of the API would be the best place to separate the storage engine from database logic. Go too high and every engine reimplements core bits of logic. Go too low and there's not enough room for interesting changes to the actual storage engine algorithms. I believe this is a happy middle ground that gives storage engines the room to play and invent alternative implementations, while also not requiring an extremely large amount of reimplementation of various required behaviors.
This commit is quite big, but it's important to note that all it's doing is copying the existing parts of couch_db.erl and couch_db_updater.erl into a new set of modules named couch_bt_engine_*.erl. Nothing uses this code yet. I have this in its own commit since it's fairly large. However, if you mostly just want to check it, you should be able to see that the implementation is just pulled from existing functions. The behavior of this engine is identical to the "pre-PSE" engine because that's what it is; it's just been reformatted a bit and had its name changed.
Also, I've kept this and couch_db_engine.erl in the main couch application, since we'll always want to have at least one storage engine. That decision was pre-monorepo, when adding another git repo to our builds seemed awfully heavy; now that we're a monorepo, we're just creating a folder, which is easy enough. I'd like reviewers to keep this in mind, as one thing I expect to discuss is whether we should split this out into its own application.
This is the doozy. There are two main bits of work going on in this commit: first, the removal of all the code that was in the previous commit, and second, the addition of all the code to have couch_db and couch_db_updater start using the couch_db_engine APIs. It's long and big, but really, if you take your time, there's nothing magical going on here. For the core bits to study, I'd recommend spending some extra time on couch_server to see how engines are configured and chosen at runtime.
This is a fairly complete test suite for the entire storage engine API. I don't remember what the coverage of couch_bt_engine was, but I know it's fairly high. The useful bit is that when devs are creating new storage engines, they can just pull this in with a single eunit case to test their implementations. I'll show a few examples of this below.
I may end up squashing this into the implementation commit. This was a fix while I was developing PSE that I ended up re-fixing after the fact, slightly differently. The original goal was to make sure everything compiled and all test suites ran for each commit. I think I've just convinced myself to squash this. So if it's missing, it's because I did that and then forgot to edit this description to remove this paragraph...
Testing recommendations
$ make check
JIRA issue number
https://issues.apache.org/jira/browse/COUCHDB-3287
Related Pull Requests
This PR is based on the previous mixed cluster upgrade PR. However when merge time comes around we'll merge #495 first and then this will be merged into master so that users can deploy the first changes before these changes. (Uptime is fun time).
#495
Checklist
(We should probably remove the todo item for updating rebar.config since we're doing almost all work on the monorepo now)