-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Couchdb 3288 mixed cluster upgrade #495
Conversation
src/couch/src/couch_db_int.hrl
Outdated
|
||
|
||
-record(new_pse_db, { | ||
name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend adding vsn
field and follow the convention that vsn is always a first field in the record.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this I can see how it could be good for upgrades, I'll add it.
}). | ||
|
||
|
||
-define(NEW_PSE_DB, { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend adding vsn
field and follow the convention that vsn is always a first field in the record.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What convention is that? I know we pondered it once but I don't think I've ever actually seen that used anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have this convention yet. I am just proposing to fix the index of a vsn field so we can write a generic get_record_version which would work with any versioned record.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so its not a convention but you think its a good idea ;P I'm not against it but I wonder if we shouldn't make that change separate across a bunch of records rather than do it piece meal?
false -> get_index(Module, DbName, DDoc) | ||
end; | ||
get_index(Module, <<"shards/", _/binary>> = DbName, DDoc) | ||
when is_binary(DbName), is_record(DDoc, doc) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason to add is_binary(DbName)
guard?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None. Will remove it. Probably got pulled in from a different clause or before I added the shards/ prefix check.
@@ -53,8 +53,7 @@ compact(State) -> | |||
{ok, Fd} = couch_mrview_util:open_file(CompactFName), | |||
ESt = couch_mrview_util:reset_index(Db, Fd, State), | |||
|
|||
{ok, DbReduce} = couch_btree:full_reduce(Db#db.id_tree), | |||
Count = element(1, DbReduce), | |||
{ok, Count} = couch_db:get_doc_count(Db), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a problem. The reduce result could be a tuple of either size 2 or 3 (depending on age of view file). get_doc_count uses match so it would crash for tuples of size 2. See https://github.com/cloudant/dbcore/commit/a452a4ab3a46f88a203d0eaac69317e5069f7b1f#diff-0568f6a6952a18b42d6a57157a930333R255. We've found such views in production IRC (@eiri knows more details).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. We'll want to patch couch_db and then couch_bt_engine for PSE for 3287
@@ -59,11 +59,11 @@ | |||
db_uri(#httpdb{url = Url}) -> | |||
couch_util:url_strip_password(Url); | |||
|
|||
db_uri(#db{name = Name}) -> | |||
db_uri(Name); | |||
db_uri(DbName) when is_binary(DbName) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we add a clause with when is_list(DbName)
guard?
Is there any possibility that couch_db:name
would return list instead of binary (for example for fake_db or something).
In such case the current implementation would crash with badmatch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like there are a lot of places like this. Please ignore it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@iilyak For clarity on these sorts of clause ordering changes I was trying to keep fairly consistent with the existing logic. Previously this either accepted a #db{}
or a binary. Though the binary function clause/badmatch would have come from binary_to_list rather the db_uri. So there are subtle changes like that but the domain of acceptable values should be identical.
clustered_db(DbName, UserCtx, []). | ||
|
||
clustered_db(DbName, UserCtx, SecProps) -> | ||
{ok, #db{name = DbName, user_ctx = UserCtx, security = SecProps}}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be simpler just return the Db
instead of {ok, Db}
since it can never be anything but ok? Or can this return error at some point.
It will simplify some places where we have to use a separate line for {ok, Db} = couch_db:clustered_db(...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did that to mimic the couch_db:open/2 format. I've not got a huge preference either way though I'd lean towards keeping the same pattern for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense
|
||
is_db(#db{}) -> | ||
true; | ||
is_db(_) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a special clause for the PSE db record here? Something like is_tuple()
then element(1, Db) =:= db
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Narp. There are very few operations that we want to support on shared database records (and eventually I'd like to remove them but didn't want to mix PRs). But its limited on purpose to be name, user_ctx, and security objects that can be read.
StartTime -> | ||
true = ets:insert(couch_dbs, Db), | ||
true = ets:update_element(couch_dbs, DbName, {#entry.db, Db}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ets:update_element
will return false if it could not find the entry and it would crash on a bad match, while previously ets:insert
always returned true
and inserted the entry in the table (even if it wasn't in the table already).
Is there any change the entry could be deleted between the call to ets:lookup and ets:update_element?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope. We just asserted it was in the table when we extracted StartTime and this is the only process that modifies the table so it should be there. Or, rather, if its not there then we should crash hard cause ets has lost its mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Makes sense
1558cb7
to
18f5226
Compare
Rebased onto master for the new replicator and couch_server work. Also squashed the few requested changes to their respective commits. |
This is a nice cleanup of code even outside the context of PSE. More functions instead of records everywhere, I ran couchdyno's replicator tests https://github.com/cloudant-labs/couchdyno/blob/master/README_tests.md Then a few custom scripts which do concurrent db and doc creations. Here is one of them: https://gist.github.com/nickva/d0e0813c7039ba824489fb017159f1be It turns out 500 conflict errors is something we return on db creation sometimes, that was a surprise for me. But master behaves the same, so it's not an issue with this PR Then with Didn't find any issues during testing +1 Nice work @davisp |
fc3939c
to
56ce55a
Compare
This has been rebased onto master as of now. Is anyone against me merging this to master in preparation for merging PSE in a week or two? Note that this PR is totes not scary as its mostly just moving code and removing the public definition of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@davisp have at it, while we wait for Fauxton to be remediated, now is the perfect time to merge it. |
Since we're getting ready to add API functions to couch_db.erl now is a good time to clean up the exports list so that changes are more easily tracked. COUCHDB-3288
These functions were originally implemented in fabric_rpc.erl where they really didn't belong. Moving them to couch_db.erl allows us to keep the unit tests intact rather than just removing them now that the #db record is being made private. COUCHDB-3288
This removes introspection of the #db record by couch_server. While its required for the pluggable storage engine upgrade, its also nice to remove the hacky overloading of #db record fields for couch_server logic. COUCHDB-3288
COUCHDB-3288
COUCHDB-3288
This completes the removal of public access to the db record from the couch application. The large majority of which is removing direct access to the #db.name, #db.main_pid, and #db.update_seq fields. COUCHDB-3288
A mixed cluster (i.e., during a rolling reboot) will want to include this commit in a release before deploying PSE code to avoid spurious erros during the upgrade. COUCHDB-3288
This change is to account for differences in the #db record when a cluster is operating in a mixed version state (i.e., when running a rolling reboot to upgrade). There are only a few operations that are valid on #db records that are shared between nodes so rather than attempt to map the entire API between the old and new records we're limiting to just the required API calls. COUCHDB-3288
Previously attachment uploading from a PSE to non-PSE node would fail as the attachment streaming API changed between version. This commit handles downgrading attachment streams from PSE nodes so that non-PSE nodes can write them. COUCHDB-3288
56ce55a
to
49b05fe
Compare
* minor bump to release notes * Document Clouseau install process Co-authored-by: Joan Touzet <[email protected]>
Overview
This is preparatory work for adding pluggable storage engines. The motivation for this ground work is to enable users to perform rolling reboots of clusters. As such there are a few things that need to change to make that happen.
#db{}
#db{}
for forwards compatibilityThe removal of the
#db{}
definition is what's responsible for 99% of these changes. And while there is a decent amount of code churn here its nearly all quite straightforward changes fromDb#db.name
tocouch_db:name(Db)
type changes so theoretically this should be a pretty quick review.The only somewhat difficult bits are when we were matching a
#db{}
record in a function head. For the most part this was just to extract fields from the record so those are easily reviewed. Though there are a few cases in the replicator that were using it to tell the different between#db{}
and#httpd{}
. For those we have to switch the order of clauses to pull out the#httpd{}
logic first.Testing recommendations
There's no behavior changes so the existing test suite covers the existing logic.
JIRA issue number
https://issues.apache.org/jira/browse/COUCHDB-3288
Related Pull Requests
I'm going to open a PR for COUCHDB-3287 which is the main PSE work that will be asking to merge into this PR. This is purely for the sake of review-ability. When we go to merge these we'll merge this PR first, and then merge 3287 into master. I expect the review of this work to take at least a couple weeks since it is quite a large change to the core bit of the database.
Checklist