Couchdb 3288 mixed cluster upgrade #495

davisp · 2017-04-27T14:40:19Z

Overview

This is preparatory work for adding pluggable storage engines. The motivation for this ground work is to enable users to perform rolling reboots of clusters. As such there are a few things that need to change to make that happen.

Remove the public definition of #db{}
Add a new fabric_rpc:create_db/2 that accepts an options list
Create a template of the new #db{} for forwards compatibility

The removal of the #db{} definition is what's responsible for 99% of these changes. And while there is a decent amount of code churn here its nearly all quite straightforward changes from Db#db.name to couch_db:name(Db) type changes so theoretically this should be a pretty quick review.

The only somewhat difficult bits are when we were matching a #db{} record in a function head. For the most part this was just to extract fields from the record so those are easily reviewed. Though there are a few cases in the replicator that were using it to tell the different between #db{} and #httpd{}. For those we have to switch the order of clauses to pull out the #httpd{} logic first.

Testing recommendations

There's no behavior changes so the existing test suite covers the existing logic.

JIRA issue number

https://issues.apache.org/jira/browse/COUCHDB-3288

Related Pull Requests

I'm going to open a PR for COUCHDB-3287 which is the main PSE work that will be asking to merge into this PR. This is purely for the sake of review-ability. When we go to merge these we'll merge this PR first, and then merge 3287 into master. I expect the review of this work to take at least a couple weeks since it is quite a large change to the core bit of the database.

Checklist

Code is written and works correctly;
Changes are covered by tests;
Documentation reflects the changes;

iilyak · 2017-04-28T15:49:23Z

src/couch/src/couch_db_int.hrl

+
+
+-record(new_pse_db, {
+ name,


I would recommend adding vsn field and follow the convention that vsn is always a first field in the record.

Thinking about this I can see how it could be good for upgrades, I'll add it.

iilyak · 2017-04-28T15:49:33Z

src/couch/src/couch_db_int.hrl

+}).
+
+
+-define(NEW_PSE_DB, {


I would recommend adding vsn field and follow the convention that vsn is always a first field in the record.

What convention is that? I know we pondered it once but I don't think I've ever actually seen that used anywhere.

We don't have this convention yet. I am just proposing to fix the index of a vsn field so we can write a generic get_record_version which would work with any versioned record.

Ah, so its not a convention but you think its a good idea ;P I'm not against it but I wonder if we shouldn't make that change separate across a bunch of records rather than do it piece meal?

iilyak · 2017-04-28T16:17:13Z

src/couch_index/src/couch_index_server.erl

- false -> get_index(Module, DbName, DDoc)
- end;
+get_index(Module, <<"shards/", _/binary>> = DbName, DDoc)
+ when is_binary(DbName), is_record(DDoc, doc) ->


What is the reason to add is_binary(DbName) guard?

None. Will remove it. Probably got pulled in from a different clause or before I added the shards/ prefix check.

iilyak · 2017-04-28T18:06:53Z

src/couch_mrview/src/couch_mrview_compactor.erl

@@ -53,8 +53,7 @@ compact(State) ->
 {ok, Fd} = couch_mrview_util:open_file(CompactFName),
 ESt = couch_mrview_util:reset_index(Db, Fd, State),

- {ok, DbReduce} = couch_btree:full_reduce(Db#db.id_tree),
- Count = element(1, DbReduce),
+ {ok, Count} = couch_db:get_doc_count(Db),


This could be a problem. The reduce result could be a tuple of either size 2 or 3 (depending on age of view file). get_doc_count uses match so it would crash for tuples of size 2. See https://github.com/cloudant/dbcore/commit/a452a4ab3a46f88a203d0eaac69317e5069f7b1f#diff-0568f6a6952a18b42d6a57157a930333R255. We've found such views in production IRC (@eiri knows more details).

Ah. We'll want to patch couch_db and then couch_bt_engine for PSE for 3287

iilyak · 2017-04-28T18:16:45Z

src/couch_replicator/src/couch_replicator_api_wrap.erl

@@ -59,11 +59,11 @@
 db_uri(#httpdb{url = Url}) ->
 couch_util:url_strip_password(Url);

-db_uri(#db{name = Name}) ->
- db_uri(Name);
+db_uri(DbName) when is_binary(DbName) ->


should we add a clause with when is_list(DbName) guard?
Is there any possibility that couch_db:name would return list instead of binary (for example for fake_db or something).
In such case the current implementation would crash with badmatch.

It seems like there are a lot of places like this. Please ignore it.

@iilyak For clarity on these sorts of clause ordering changes I was trying to keep fairly consistent with the existing logic. Previously this either accepted a #db{} or a binary. Though the binary function clause/badmatch would have come from binary_to_list rather the db_uri. So there are subtle changes like that but the domain of acceptable values should be identical.

nickva · 2017-05-09T21:28:32Z

src/couch/src/couch_db.erl

+ clustered_db(DbName, UserCtx, []).
+
+clustered_db(DbName, UserCtx, SecProps) ->
+ {ok, #db{name = DbName, user_ctx = UserCtx, security = SecProps}}.


Would it be simpler just return the Db instead of {ok, Db} since it can never be anything but ok? Or can this return error at some point.

It will simplify some places where we have to use a separate line for {ok, Db} = couch_db:clustered_db(...)

I did that to mimic the couch_db:open/2 format. I've not got a huge preference either way though I'd lean towards keeping the same pattern for consistency.

Makes sense

nickva · 2017-05-09T21:30:03Z

src/couch/src/couch_db.erl

+
+is_db(#db{}) ->
+ true;
+is_db(_) ->


Do we need a special clause for the PSE db record here? Something like is_tuple() then element(1, Db) =:= db

Narp. There are very few operations that we want to support on shared database records (and eventually I'd like to remove them but didn't want to mix PRs). But its limited on purpose to be name, user_ctx, and security objects that can be read.

nickva · 2017-05-11T05:44:38Z

src/couch/src/couch_server.erl

 StartTime ->
- true = ets:insert(couch_dbs, Db),
+ true = ets:update_element(couch_dbs, DbName, {#entry.db, Db}),


ets:update_element will return false if it could not find the entry and it would crash on a bad match, while previously ets:insert always returned true and inserted the entry in the table (even if it wasn't in the table already).

Is there any change the entry could be deleted between the call to ets:lookup and ets:update_element?

Nope. We just asserted it was in the table when we extracted StartTime and this is the only process that modifies the table so it should be there. Or, rather, if its not there then we should crash hard cause ets has lost its mind.

Got it. Makes sense

davisp · 2017-05-12T16:07:47Z

Rebased onto master for the new replicator and couch_server work. Also squashed the few requested changes to their respective commits.

nickva · 2017-05-12T22:50:15Z

This is a nice cleanup of code even outside the context of PSE. More functions instead of records everywhere, couch_server looks more sane, some new helpers in couch_db module.

I ran couchdyno's replicator tests https://github.com/cloudant-labs/couchdyno/blob/master/README_tests.md

Then a few custom scripts which do concurrent db and doc creations. Here is one of them:

https://gist.github.com/nickva/d0e0813c7039ba824489fb017159f1be

It turns out 500 conflict errors is something we return on db creation sometimes, that was a surprise for me. But master behaves the same, so it's not an issue with this PR

Then with ./dev/run setup killed node3 and re-spawned it with master code and ran some tests on the mixed cluster.

Didn't find any issues during testing

+1

Nice work @davisp

davisp · 2017-09-13T16:15:28Z

This has been rebased onto master as of now. Is anyone against me merging this to master in preparation for merging PSE in a week or two? Note that this PR is totes not scary as its mostly just moving code and removing the public definition of the #db{} record.

iilyak

+1

wohali · 2017-09-20T08:10:42Z

@davisp have at it, while we wait for Fauxton to be remediated, now is the perfect time to merge it.

Since we're getting ready to add API functions to couch_db.erl now is a good time to clean up the exports list so that changes are more easily tracked. COUCHDB-3288

These functions were originally implemented in fabric_rpc.erl where they really didn't belong. Moving them to couch_db.erl allows us to keep the unit tests intact rather than just removing them now that the #db record is being made private. COUCHDB-3288

This removes introspection of the #db record by couch_server. While its required for the pluggable storage engine upgrade, its also nice to remove the hacky overloading of #db record fields for couch_server logic. COUCHDB-3288

COUCHDB-3288

This completes the removal of public access to the db record from the couch application. The large majority of which is removing direct access to the #db.name, #db.main_pid, and #db.update_seq fields. COUCHDB-3288

A mixed cluster (i.e., during a rolling reboot) will want to include this commit in a release before deploying PSE code to avoid spurious erros during the upgrade. COUCHDB-3288

This change is to account for differences in the #db record when a cluster is operating in a mixed version state (i.e., when running a rolling reboot to upgrade). There are only a few operations that are valid on #db records that are shared between nodes so rather than attempt to map the entire API between the old and new records we're limiting to just the required API calls. COUCHDB-3288

Previously attachment uploading from a PSE to non-PSE node would fail as the attachment streaming API changed between version. This commit handles downgrading attachment streams from PSE nodes so that non-PSE nodes can write them. COUCHDB-3288

* minor bump to release notes * Document Clouseau install process Co-authored-by: Joan Touzet <[email protected]>

davisp mentioned this pull request Apr 27, 2017

Couchdb 3287 pluggable storage engines #496

Merged

3 tasks

iilyak reviewed Apr 28, 2017

View reviewed changes

wohali added dbcore enhancement labels Apr 30, 2017

nickva reviewed May 9, 2017

View reviewed changes

nickva reviewed May 11, 2017

View reviewed changes

davisp force-pushed the COUCHDB-3288-mixed-cluster-upgrade branch 2 times, most recently from 1558cb7 to 18f5226 Compare May 12, 2017 15:33

davisp force-pushed the COUCHDB-3288-mixed-cluster-upgrade branch 6 times, most recently from fc3939c to 56ce55a Compare September 13, 2017 16:13

iilyak approved these changes Sep 13, 2017

View reviewed changes

davisp added 6 commits September 27, 2017 14:57

Reorganize exports from couch_db.erl

bd13ca2

Since we're getting ready to add API functions to couch_db.erl now is a good time to clean up the exports list so that changes are more easily tracked. COUCHDB-3288

Update couch_server to not use the db record

81705fb

This removes introspection of the #db record by couch_server. While its required for the pluggable storage engine upgrade, its also nice to remove the hacky overloading of #db record fields for couch_server logic. COUCHDB-3288

Add a test helper for creating fake db records

fe13871

COUCHDB-3288

Avoid bad match on really old databases

1c5e136

COUCHDB-3288

Remove public access to the db record

7b329b4

This completes the removal of public access to the db record from the couch application. The large majority of which is removing direct access to the #db.name, #db.main_pid, and #db.update_seq fields. COUCHDB-3288

davisp and others added 3 commits September 27, 2017 14:57

Add clause for mixed cluster upgrades

af06128

A mixed cluster (i.e., during a rolling reboot) will want to include this commit in a release before deploying PSE code to avoid spurious erros during the upgrade. COUCHDB-3288

davisp force-pushed the COUCHDB-3288-mixed-cluster-upgrade branch from 56ce55a to 49b05fe Compare September 27, 2017 19:57

davisp merged commit c07f3f5 into master Sep 27, 2017

nickva deleted the COUCHDB-3288-mixed-cluster-upgrade branch December 18, 2019 19:46

nickva pushed a commit to nickva/couchdb that referenced this pull request Sep 7, 2022

Document Clouseau install process (apache#495)

dcffd3c

* minor bump to release notes * Document Clouseau install process Co-authored-by: Joan Touzet <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Couchdb 3288 mixed cluster upgrade #495

Couchdb 3288 mixed cluster upgrade #495

davisp commented Apr 27, 2017 •

edited

Loading

iilyak Apr 28, 2017

davisp May 10, 2017

iilyak Apr 28, 2017

davisp Apr 28, 2017

iilyak Apr 28, 2017

davisp Apr 28, 2017

iilyak Apr 28, 2017

davisp Apr 28, 2017

iilyak Apr 28, 2017

davisp Apr 28, 2017

iilyak Apr 28, 2017

iilyak Apr 28, 2017

davisp Apr 28, 2017

nickva May 9, 2017 •

edited

Loading

davisp May 10, 2017

nickva May 10, 2017

nickva May 9, 2017 •

edited

Loading

davisp May 10, 2017

nickva May 11, 2017

davisp May 12, 2017

nickva May 12, 2017

davisp commented May 12, 2017

nickva commented May 12, 2017

davisp commented Sep 13, 2017

iilyak left a comment

wohali commented Sep 20, 2017



		-record(new_pse_db, {
		name,

		}).


		-define(NEW_PSE_DB, {

Couchdb 3288 mixed cluster upgrade #495

Couchdb 3288 mixed cluster upgrade #495

Conversation

davisp commented Apr 27, 2017 • edited Loading

Overview

Testing recommendations

JIRA issue number

Related Pull Requests

Checklist

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nickva May 9, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nickva May 9, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

davisp commented May 12, 2017

nickva commented May 12, 2017

davisp commented Sep 13, 2017

iilyak left a comment

Choose a reason for hiding this comment

wohali commented Sep 20, 2017

davisp commented Apr 27, 2017 •

edited

Loading

nickva May 9, 2017 •

edited

Loading

nickva May 9, 2017 •

edited

Loading