Bad Security Object Error After Moving Shards #1611

arifcse019 · 2018-09-19T20:37:44Z

Security objects for some databases fail to sync properly in a new node after all shards are moved from an old node.

Expected Behavior

Security objects for all databases should sync when shards are moved to a new node.

Current Behavior

Security objects for some databases fail to sync properly in a new node after all shards are moved from an old node. The log says things like:

" [error] 2018-09-19T17:24:05.388202Z [email protected] <0.19944.2> -------- Bad security object in <<"db-name">>: [{{[{<<"_id">>,<<"_security">>},{<<"admins">>,{[{<<"names">>,[]},{<<"roles">>,[]}]}},{<<"members">>,{[{<<"names">>,[<<"user-name">>]},{<<"roles">>,[]}]}}]},13},{{[]},7}] "

Steps to Reproduce (for bugs)

1. Add a new node to the cluster.
2. Move all shards from an old node to the new one.
3. Shut down and delete the old node.
4. Verify the security objects on all databases.

Context

We are replacing CouchDB cluster instances with new ones to prepare for a scenario in which an instance goes away abruptly.

Your Environment

Version used: CouchDB 2.1.2, 3-node cluster

wohali · 2018-09-19T23:52:51Z

@arifcse019 Can you provide a minimal example for us with a script that uses cURL? I've personally performed the steps you mention above (1-4) many times and have never run across this.

What technique are you using to move the shards? There is newly updated documentation on the approved approach online. Can you follow that?

https://docs.couchdb.org/en/stable/cluster/sharding.html

arifcse019 · 2018-09-20T15:52:16Z

@wohali I am using the following two Ruby scripts to move shards: the first updates the cluster metadata to add the shards to the new node, and the second updates the cluster metadata to stop looking for those shards on the old one.

https://gist.github.com/arifcse019/43a638e4ce837b029d62d59fd0b9a20f (move_shards_in.rb)
https://gist.github.com/arifcse019/c8a7096275e16d344f6c53ad884716ea (move_shards_out.rb)

The steps are as I described in my issue. These two scripts are run as part of step 2.
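
The two metadata updates described in this comment can be sketched in shell roughly as follows. This is not the content of the linked Ruby scripts. It is a minimal sketch that assumes the shard map document is reachable at /_node/_local/_dbs/{db} (on CouchDB 2.x it is served from the node-local port instead), that jq is installed, and that COUCH_URL, the database name, and the node names are hypothetical placeholders. All helper names are illustrative only.

```shell
#!/bin/sh
# Sketch only: COUCH_URL and every name passed in are hypothetical placeholders.
COUCH_URL="${COUCH_URL:-http://127.0.0.1:5984}"

# Print the URL of the shard map document for database $1.
shard_map_url() {
	echo "${COUCH_URL}/_node/_local/_dbs/$1"
}

# Read a shard map on stdin and print it with node $1 added to every range,
# including the matching "add" changelog entries.
add_node_to_shard_map() {
	jq --arg node "$1" '
		(.by_range | keys) as $ranges
		| .by_node[$node] = $ranges
		| .by_range |= map_values(. + [$node] | unique)
		| .changelog += [ $ranges[] | ["add", ., $node] ]'
}

# Read a shard map on stdin and print it with node $1 removed everywhere,
# including the matching "remove" changelog entries.
remove_node_from_shard_map() {
	jq --arg node "$1" '
		(.by_node[$node] // []) as $ranges
		| del(.by_node[$node])
		| .by_range |= map_values(. - [$node])
		| .changelog += [ $ranges[] | ["remove", ., $node] ]'
}

# Fetch the shard map of database $1, pass it through filter function $2
# with node name $3, and write the result back (the fetched document
# already carries the _rev needed for the PUT).
update_shard_map() {
	curl -s "$(shard_map_url "$1")" \
		| "$2" "$3" \
		| curl -s -X PUT -H 'Content-Type: application/json' \
			--data-binary @- "$(shard_map_url "$1")"
}

# Example (not executed here):
#   update_shard_map mydb add_node_to_shard_map couchdb@new-node.example.com
#   update_shard_map mydb remove_node_from_shard_map couchdb@old-node.example.com
```

Keeping the JSON transformation separate from the HTTP calls makes it possible to inspect the edited shard map before writing it back.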

skeyby · 2020-07-30T11:05:36Z

Hello, I'm facing the same issue on CouchDB 3.1:

[error] 2020-07-30T10:47:23.718632Z [email protected] <0.23319.995> -------- Bad security object in <<"_users">>: [{{[{<<"members">>,{[{<<"roles">>,[<<"_admin">>]}]}},{<<"admins">>,{[{<<"roles">>,[<<"_admin">>]}]}}]},8},{{[{<<"admins">>,{[{<<"roles">>,[<<"_admin">>]}]}}]},8}]

This is the path I followed:

I added a new node to the cluster
I added all the shards/nodes to the metadata for the db
I invoked _sync_shards for the database

The shards correctly appeared on the other node but the security object got lost and the error started to appear in the log.

I opened the db in Fauxton and the permissions were back to the basic _admin/_admin.

I modified them and the error has now gone away.

So my guess is that there's something missing in the _sync_shards code when it comes to copying database permissions.
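
The sequence described in this comment, together with the manual fix of re-saving the permissions, can be sketched as below. The sketch assumes the POST /{db}/_sync_shards endpoint and a hypothetical admin URL in COUCH_URL, and the helper names are made up for illustration. It only re-applies a security object captured before the shard map was edited, which mirrors the workaround above and does not address the underlying cause.

```shell
#!/bin/sh
# Sketch only: COUCH_URL is a hypothetical admin URL.
COUCH_URL="${COUCH_URL:-http://127.0.0.1:5984}"

# Print the security object of database $1 as returned by the cluster.
get_security() {
	curl -s "${COUCH_URL}/$1/_security"
}

# Ask CouchDB to synchronize the shards of database $1.
sync_shards() {
	curl -s -X POST "${COUCH_URL}/$1/_sync_shards"
}

# Write security object $2 (a JSON string) to database $1.
put_security() {
	curl -s -X PUT -H 'Content-Type: application/json' \
		-d "$2" "${COUCH_URL}/$1/_security"
}

# Compare the current security object of database $1 with the saved copy $2
# and write the saved copy back if they differ. The comparison is a plain
# string comparison, so it assumes the server returns the object with the
# same key order and formatting both times.
restore_security_if_changed() {
	current=$(get_security "$1")
	if [ "$current" = "$2" ]; then
		echo "unchanged: $1"
	else
		echo "restoring: $1"
		put_security "$1" "$2" > /dev/null
	fi
}

# Example (not executed here):
#   saved=$(get_security mydb)      # before editing the shard map
#   ... add the new node to the shard map ...
#   sync_shards mydb
#   restore_security_if_changed mydb "$saved"
```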

kripper · 2020-10-17T05:44:17Z

Same problem here.

Steps to reproduce:

Create a DB on a single node.
Add permissions to the DB.
Add a second node to the cluster.
Add the second node to all shards.

Current Behavior:

The DB is visible on the second node, but its security object is not synchronized/copied from the first node.

Expected Behaviour:

The DB on the second node should have the same permissions as on the first node. When you change permissions on the first node, they are copied to the second node. The same is expected when applying the steps above.

Version:

couchdb-3.1.1-1.el7.x86_64 running on CentOS 7
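
The first three reproduction steps in this comment can be sketched with curl as follows. The sketch assumes an admin URL in COUCH_URL and a node name such as couchdb@node2.example.com, both hypothetical, and uses the /_node/_local/_nodes endpoint to register the second node. Adding the second node to the shards (the fourth step) is a shard map edit and is not shown here.

```shell
#!/bin/sh
# Sketch only: the URL, credentials, and node name are hypothetical.
COUCH_URL="${COUCH_URL:-http://admin:password@127.0.0.1:5984}"

# Step 1: create database $1.
create_db() {
	curl -s -X PUT "${COUCH_URL}/$1"
}

# Step 2: write security object $2 (a JSON string) to database $1.
set_security() {
	curl -s -X PUT -H 'Content-Type: application/json' \
		-d "$2" "${COUCH_URL}/$1/_security"
}

# Step 3: register node $1 (for example couchdb@node2.example.com)
# in the cluster's node list.
add_node() {
	curl -s -X PUT -H 'Content-Type: application/json' \
		-d '{}' "${COUCH_URL}/_node/_local/_nodes/$1"
}

# Print the security object of database $1, to compare between nodes.
get_security() {
	curl -s "${COUCH_URL}/$1/_security"
}

# Example (not executed here):
#   create_db testdb
#   set_security testdb '{"admins":{"names":[],"roles":[]},"members":{"names":["user-name"],"roles":[]}}'
#   add_node couchdb@node2.example.com
#   get_security testdb
```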

sergey-safarov · 2024-03-09T17:01:44Z

More information about the error message.
A normal database security object should look like this:

[root@esrp-0a ~]# curl -s http://login:[email protected]:5984/db_name/_security | jq
{
  "members": {
    "roles": [
      "_admin"
    ]
  },
  "admins": {
    "roles": [
      "_admin"
    ]
  }
}

For a database where the error is present, the same command produces:

[root@esrp-0a ~]# curl -s http://login:[email protected]:5984/db_name/_security | jq
{}

sergey-safarov · 2024-03-09T17:50:12Z

I have created a script to display the status of the security objects on my server. You need to edit the login and password in the script before running it.

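#!/bin/sh
# Report databases whose _security object is empty and optionally reset it.
# Requires curl and jq.

# Edit the login, password, and host before running.
db_url="http://login:[email protected]:5984"
# Set to "true" to write the default security object to affected databases.
fix_db=false

# Percent-encode every "/" and "+" in a database name for use in a URL.
escape_dbname() {
	echo "$1" | sed -e 's:/:%2f:g' -e 's:+:%2B:g'
}

# Security object written to databases that need fixing.
security_json() {
cat << EOF
{"members":{"roles":["_admin"]},"admins":{"roles":["_admin"]}}
EOF
}

# Print one database name per line.
get_db_list() {
	curl -s "${db_url}/_all_dbs" | jq -r '.[]'
}

# Print "false" if the security object is empty ({}), "true" otherwise.
check_db_security() {
	local dbname=$1
	local esc_dbname
	esc_dbname=$(escape_dbname "${dbname}")
	curl -s "${db_url}/${esc_dbname}/_security" | jq 'if . == {} then false else true end'
}

# Report the database, and rewrite its security object if fix_db is "true".
maybe_fix_db_security() {
	local dbname=$1
	local esc_dbname
	esc_dbname=$(escape_dbname "${dbname}")
	# Use "=" rather than "==": the latter is not supported by every /bin/sh.
	if [ "${fix_db}" = "false" ]; then
		echo "need to fix database: ${dbname}"
		return
	fi
	echo "fixing database: ${dbname}"
	security_json | curl -X PUT -H 'content-type: application/json' -H 'accept: application/json' -d@- -s "${db_url}/${esc_dbname}/_security"
}

for i in $(get_db_list)
do
	sec_status=$(check_db_security "$i")
	if [ "${sec_status}" = "false" ]; then
		maybe_fix_db_security "$i"
	fi
done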

To actually fix the security objects, set the "fix_db" variable to "true".

wohali added bug and removed bug labels Sep 19, 2018
