Response of /_up is Not valid Json on one Node #5009

Closed
Sliosh opened this issue Mar 22, 2024 · 5 comments · Fixed by #5025

Comments

@Sliosh
Contributor

Sliosh commented Mar 22, 2024

Description

We have a CouchDB cluster with 3 nodes. Querying /_up returns valid responses on nodes 2 and 3, but node 1 returns invalid JSON.

{"status":"ok","seeds":{"[email protected]":{"timestamp":"2024-03-22T14:06:46.891827Z","last_replication_status":"error"},"[email protected]":{"timestamp":"2024-03-22T14:06:47.909114Z","last_replication_status":"ok","pending_updates":{"_nodes":0,"_dbs":0,"_users":0}},"[email protected]":{}}}

The key [email protected] is duplicated in the output, and because of that JSON parsers error out.
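
For anyone who wants to check a response for this problem, here is a minimal Python sketch. It is not part of the original report: the payload and node names (`n1@a`, `n2@a`) are made up to mirror the shape of the /_up output, and `reject_duplicate_keys` is a hypothetical helper. Note that RFC 8259 only says object names SHOULD be unique, so parsers differ: some reject a repeated key, while others (like Python's `json` by default) silently keep the last value.

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook that raises if a JSON object repeats a key."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


# Hypothetical payload shaped like the /_up response (node names are made up).
body = (
    '{"status":"ok","seeds":{'
    '"n1@a":{"last_replication_status":"error"},'
    '"n2@a":{},'
    '"n1@a":{}}}'
)

# Default behaviour: no error, the last value for the repeated key wins.
lenient = json.loads(body)

# Strict behaviour: the hook turns the repeated key into an error.
try:
    json.loads(body, object_pairs_hook=reject_duplicate_keys)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
```

With the default parser the status of the repeated node is silently replaced by the empty object, which is arguably worse than a hard failure for a health check.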

Steps to Reproduce

I don't know. I can reproduce this on our non-prod cluster and can provide more logs or test things if needed.

Expected Behaviour

I expect the JSON to be valid, just like the one below, returned by node3:
{"status":"ok","seeds":{"node1.cust.local":{"timestamp":"2024-03-22T09:39:10.048682Z","last_replication_status":"ok","pending_updates":{"_nodes":0,"_dbs":0,"_users":0}},"[email protected]":{}}}

Your Environment

Three nodes in one cluster. The config of the affected node is below. The config is the same on all nodes, except for the node name, UUIDs and so on.

vm.args

# use this file except in compliance with the License. You may obtain a copy of
# the License at
#
#   https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations under
# the License.

# Each node in the system must have a unique name. These are specified through
# the Erlang -name flag, which takes the form:
#
#    -name nodename@<FQDN>
#
# or
#
#    -name nodename@<IP-ADDRESS>
#
# CouchDB recommends the following values for this flag:
#
# 1. If this is a single node, not in a cluster, use:
#    -name [email protected]
#
# 2. If DNS is configured for this host, use the FQDN, such as:
#    -name [email protected]
#
# 3. If DNS isn't configured for this host, use IP addresses only, such as:
#    -name [email protected]
#
# Do not rely on tricks with /etc/hosts or libresolv to handle anything
# other than the above 3 approaches correctly. They will not work reliably.
#
# Multiple CouchDBs running on the same machine can use couchdb1@, couchdb2@,
# etc.
-name [email protected]

# All nodes must share the same magic cookie for distributed Erlang to work.
# Uncomment the following line and append a securely generated random value.
-setcookie 'asdfgh'

# Which interfaces should the node listen on?
-kernel inet_dist_use_interface {0,0,0,0}

# Tell kernel and SASL not to log anything
-kernel error_logger silent
-sasl sasl_error_logger false

# This will toggle to true in Erlang 25+. However since we don't use global
# any longer, and have our own auto-connection module, we can keep the
# existing global behavior to avoid surprises. See
# https://github.com/erlang/otp/issues/6470#issuecomment-1337421210 for more
# information about possible increased coordination and messages being sent on
# disconnections when this setting is enabled.
#
-kernel prevent_overlapping_partitions false

# Increase the pool of dirty IO schedulers from 10 to 16
# Dirty IO schedulers are used for file IO.
+SDio 16

# Increase distribution buffer size from default of 1MB to 32MB. The default is
# usually a bit low on busy clusters. Has no effect for single-node setups.
# The unit is in kilobytes.
+zdbbl 32768

# When running on Docker, Kubernetes or an OS using CFS (Completely Fair
# Scheduler) with CPU quota limits set, disable busy waiting for schedulers to
# avoid busy waiting consuming too much of Erlang VM's CPU time-slice shares.
#+sbwt none
#+sbwtdcpu none
#+sbwtdio none

# Comment this line out to enable the interactive Erlang shell on startup
+Bd -noinput

# Set maximum SSL session lifetime to reap terminated replication readers
-ssl session_lifetime 300

## TLS Distribution
## Use TLS for connections between Erlang cluster members.
## https://erlang.org/doc/apps/ssl/ssl_distribution.html
##
## Generate Cert(PEM) File
## This is just an example command to generate a certfile (PEM).
## This is not an endorsement of specific expiration limits, key sizes, or algorithms.
##    $ openssl req -newkey rsa:2048 -new -nodes -x509 -days 3650 -keyout key.pem -out cert.pem
##    $ cat key.pem cert.pem > dev/erlserver.pem && rm key.pem cert.pem
##
## Generate a Config File (couch_ssl_dist.conf)
##    [{server,
##      [{certfile, "</path/to/erlserver.pem>"},
##       {secure_renegotiate, true}]},
##     {client,
##      [{secure_renegotiate, true}]}].
##
## CouchDB recommends the following values for no_tls flag:
## 1. Use TCP only, set to true, such as
-couch_dist no_tls true
## 2. Use TLS only, set to false, such as:
##      -couch_dist no_tls false
## 3. Specify which node to use TCP, such as:
##      -couch_dist no_tls \"*@127.0.0.1\"
##
## To ensure search works, make sure to set 'no_tls' option for the clouseau node.
## By default that would be "[email protected]".
## Don't forget to override the paths to point to your certificate(s) and key(s)!
##
#-proto_dist couch
#-couch_dist no_tls '"[email protected]"'
#-ssl_dist_optfile <path/to/couch_ssl_dist.conf>

# Enable FIPS mode
#   https://www.erlang.org/doc/apps/crypto/fips.html
#   Ensure that:
#    - Erlang is built with --enable-fips configuration option
#    - Crypto library (e.g. OpenSSL) supports this mode
#
# When the mode is successfully enabled "Welcome" message should show `fips`
# in the features list.
#
#-crypto fips_mode true

# OS Mon Settings

# only start disksup
-os_mon start_cpu_sup false
-os_mon start_memsup false

# Check disk space every 5 minutes
-os_mon disk_space_check_interval 5

# don't let disksup send alerts
-os_mon disk_almost_full_threshold 1.0

local.ini

; Custom settings should be made in this file. They will override settings
; in default.ini, but unlike changes made to default.ini, this file won't be
; overwritten on server upgrade.

[couchdb]
;max_document_size = 4294967296 ; bytes
;os_process_timeout = 5000
uuid = b68e360edfb3ca7a105f8d9d61785848

[couch_peruser]
; If enabled, couch_peruser ensures that a private per-user database
; exists for each document in _users. These databases are writable only
; by the corresponding user. Databases are in the following form:
; userdb-{hex encoded username}
;enable = true

; If set to true and a user is deleted, the respective database gets
; deleted as well.
;delete_dbs = true

; Set a default q value for peruser-created databases that is different from
; cluster / q
;q = 1

[chttpd]
;port = 5984
bind_address = 0.0.0.0

; Options for the MochiWeb HTTP server.
;server_options = [{backlog, 128}, {acceptor_pool_size, 16}]

; For more socket options, consult Erlang's module 'inet' man page.
;socket_options = [{sndbuf, 262144}, {nodelay, true}]

[httpd]
; NOTE that this only configures the "backend" node-local port, not the
; "frontend" clustered port. You probably don't want to change anything in
; this section.
; Uncomment next line to trigger basic-auth popup on unauthorized requests.
;WWW-Authenticate = Basic realm="administrator"

; Uncomment next line to set the configuration modification whitelist. Only
; whitelisted values may be changed via the /_config URLs. To allow the admin
; to change this value over HTTP, remember to include {httpd,config_whitelist}
; itself. Excluding it from the list would require editing this file to update
; the whitelist.
;config_whitelist = [{httpd,config_whitelist}, {log,level}, {etc,etc}]

[ssl]
;enable = true
;cert_file = /full/path/to/server_cert.pem
;key_file = /full/path/to/server_key.pem
;password = somepassword

; set to true to validate peer certificates
;verify_ssl_certificates = false

; Set to true to fail if the client does not send a certificate. Only used if verify_ssl_certificates is true.
;fail_if_no_peer_cert = false

; Path to file containing PEM encoded CA certificates (trusted
; certificates used for verifying a peer certificate). May be omitted if
; you do not want to verify the peer.
;cacert_file = /full/path/to/cacertf

; The verification fun (optional) if not specified, the default
; verification fun will be used.
;verify_fun = {Module, VerifyFun}

; maximum peer certificate depth
;ssl_certificate_max_depth = 1

; Reject renegotiations that do not live up to RFC 5746.
;secure_renegotiate = true

; The cipher suites that should be supported.
; Can be specified in erlang format "{ecdhe_ecdsa,aes_128_cbc,sha256}"
; or in OpenSSL format "ECDHE-ECDSA-AES128-SHA256".
;ciphers = ["ECDHE-ECDSA-AES128-SHA256", "ECDHE-ECDSA-AES128-SHA"]

; The SSL/TLS versions to support
;tls_versions = [tlsv1, 'tlsv1.1', 'tlsv1.2']

; To enable Virtual Hosts in CouchDB, add a vhost = path directive. All requests to
; the Virtual Host will be redirected to the path. In the example below all requests
; to https://example.com/ are redirected to /database.
; If you run CouchDB on a specific port, include the port number in the vhost:
; example.com:5984 = /database
[vhosts]
;example.com = /database/

; To create an admin account uncomment the '[admins]' section below and add a
; line in the format 'username = password'. When you next start CouchDB, it
; will change the password to a hash (so that your passwords don't linger
; around in plain-text files). You can add more admin accounts with more
; 'username = password' lines. Don't forget to restart CouchDB after
; changing this.
[admins]
admin = XXXX

[cluster]
q=1
n=3
w=1
r=1
seedlist = [email protected],[email protected],[email protected]

[chttpd_auth]
secret = XXX

/:

{
    "couchdb": "Welcome",
    "version": "3.3.3-2d12ab0",
    "git_sha": "2d12ab0",
    "uuid": "714ae23fd1419d4331251a79c94f01d5",
    "features": [
        "nouveau",
        "access-ready",
        "partitioned",
        "pluggable-storage-engines",
        "reshard",
        "scheduler"
    ],
    "vendor": {
        "name": "The Apache Software Foundation"
    }
}

/_node/_local/_versions:

{
    "javascript_engine": {
        "version": "78",
        "name": "spidermonkey"
    },
    "erlang": {
        "version": "25.2.3",
        "supported_hashes": [
            "blake2s",
            "blake2b",
            "sha3_512",
            "sha3_384",
            "sha3_256",
            "sha3_224",
            "sha512",
            "sha384",
            "sha256",
            "sha224",
            "sha",
            "ripemd160",
            "md5",
            "md4"
        ]
    },
    "collation_driver": {
        "name": "libicu",
        "library_version": "72.1",
        "collator_version": "153.120",
        "collation_algorithm_version": "15"
    }
}
  • CouchDB version used: 3.3.3-2d12ab0
  • Browser name and version: (Not relevant) Curl, Firefox, OkHttp
  • Operating system and version: Debian 12
@nickva
Contributor

nickva commented Mar 22, 2024

Thank you for your report. Yeah that looks like a bug we should fix.

@rnewson
Member

rnewson commented Apr 2, 2024

looks simple. we use lists:ukeymerge but did not ensure the original list was in key order. probably change this to a map now, though.
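
To make the failure mode concrete, here is a small Python sketch of a two-pointer keyed merge. It is not CouchDB's code and not the Erlang implementation of `lists:ukeymerge`; the function names and data are hypothetical. It only illustrates why a merge that assumes key-sorted inputs can emit the same key twice when one input is out of order, and why a map-based merge does not have that problem.

```python
def merge_assuming_sorted(left, right):
    """Two-pointer merge of (key, value) lists; on equal keys the left entry wins.

    Only correct if both inputs are already sorted by key.
    """
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            out.append(left[i])
            i += 1
        elif left[i][0] > right[j][0]:
            out.append(right[j])
            j += 1
        else:
            out.append(left[i])
            i += 1
            j += 1
    return out + left[i:] + right[j:]


def merge_with_map(left, right):
    """Order-independent alternative: build a dict, letting left override right."""
    merged = dict(right)
    merged.update(left)
    return merged


# Hypothetical data: per-seed status (NOT key-sorted) and empty defaults (sorted).
status = [
    ("b", {"last_replication_status": "ok"}),
    ("a", {"last_replication_status": "error"}),
]
defaults = [("a", {}), ("b", {})]

bad = merge_assuming_sorted(status, defaults)   # key "a" ends up in the result twice
good = merge_with_map(status, defaults)         # one entry per key
```

Serialising `bad` as a JSON object would produce a repeated key, which matches the symptom in the report.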

@nickva
Contributor

nickva commented Apr 2, 2024

@rnewson good idea to use a map. I had actually started on a PR during the weekend but didn't add tests yet. I'll add some tests and push it later today for review.

@rnewson
Member

rnewson commented Apr 2, 2024

why it also uses rotate_list at the start is a mystery to me, given the later assumption that the list is in key order.
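
A tiny Python sketch of why rotation and a key-order assumption do not mix. The `rotate` helper here is hypothetical and is not CouchDB's `rotate_list`; the seed names are made up. The only point is that rotating a sorted list by anything other than a multiple of its length leaves it unsorted, which could also explain why only some nodes would be affected if each rotates by a different amount (an assumption, not something confirmed in this thread).

```python
def rotate(items, n):
    """Hypothetical helper: move the first n items to the end of the list."""
    if not items:
        return []
    n %= len(items)
    return items[n:] + items[:n]


def is_sorted(items):
    """True if every element is <= its successor."""
    return all(a <= b for a, b in zip(items, items[1:]))


seeds = ["a", "b", "c"]          # made-up, key-sorted seed names
unrotated = rotate(seeds, 0)     # still sorted
rotated = rotate(seeds, 1)       # no longer sorted
```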

@nickva
Contributor

nickva commented Apr 2, 2024

Gave it a try here #5025

We didn't have any tests for _up and a few other "misc" chttpd handlers like _uuid, so I added a few of those.
