
Scheduler-rotated replication jobs are incorrectly closing sockets #1417

Open
wohali opened this issue Jun 27, 2018 · 0 comments


wohali commented Jun 27, 2018

Description

This is a bit convoluted, so bear with me.

One deployment of CouchDB in a db-per-user type scenario has many more filtered continuous replication jobs than max_jobs, and increasing max_jobs to match is impractical (CPU limits). As the scheduler rotates through jobs, we are seeing the number of TIME_WAIT sockets on the nodes drastically increase.

Expected behaviour

When the replication scheduler rotates off (i.e. kills) a continuous replication job to round-robin to another waiting replication job, the socket should be closed on the client side before killing replication, triggering proper socket cleanup.

Current behaviour

When the replication scheduler rotates off (i.e. kills) a continuous replication job to round-robin to another waiting replication job, the socket is not closed correctly, leaving a client socket in TIME_WAIT to expire.

Steps to reproduce

First, set up a test environment. In one window:

dev/run -n 1 --with-admin-party-please

In another window, set up a short script to monitor the number of TIME_WAIT TCP connections:

watch -n 1 "netstat -an | grep TIME_WAIT | grep 15984 | wc -l"

Leave off the | wc -l if you want to see a full list.

Now, in a third window, prep 6 test databases:

curl -X PUT localhost:15984/abc
curl -X PUT localhost:15984/one
curl -X PUT localhost:15984/two
curl -X PUT localhost:15984/three
curl -X PUT localhost:15984/four
curl -X PUT localhost:15984/five
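Equivalently, the six PUTs can be scripted in one loop (same database names as above):

```shell
# Create the shared source database plus the five "db-per-user" targets.
for db in abc one two three four five; do
  curl -s -X PUT "localhost:15984/$db"
done
```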

Create continuous replication documents to replicate from the shared abc database to each of the "db-per-user" databases:

curl -X PUT localhost:15984/_replicator/one -d '{"source": "http://localhost:15984/abc", "target": "http://localhost:15984/one", "continuous": true}'
curl -X PUT localhost:15984/_replicator/two -d '{"source": "http://localhost:15984/abc", "target": "http://localhost:15984/two", "continuous": true}'
curl -X PUT localhost:15984/_replicator/three -d '{"source": "http://localhost:15984/abc", "target": "http://localhost:15984/three", "continuous": true}'
curl -X PUT localhost:15984/_replicator/four -d '{"source": "http://localhost:15984/abc", "target": "http://localhost:15984/four", "continuous": true}'
curl -X PUT localhost:15984/_replicator/five -d '{"source": "http://localhost:15984/abc", "target": "http://localhost:15984/five", "continuous": true}'

Finally, force the replicator to churn by adjusting the replicator max jobs, interval, and startup jitter to minimal values:

curl -X PUT localhost:15984/_node/_local/_config/replicator/interval -d '"1000"'
curl -X PUT localhost:15984/_node/_local/_config/replicator/max_jobs -d '"1"'
curl -X PUT localhost:15984/_node/_local/_config/replicator/startup_jitter -d '"1"'

Now the replication scheduler will run only a single job at a time, rotating through the waiting jobs every second, with 1 ms of jitter when starting each job.
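As a sanity check that the scheduler really is churning, the _scheduler/jobs endpoint (available since CouchDB 2.1) lists each job along with its recent event history; with max_jobs = 1 you should see frequent started/stopped entries in the histories:

```shell
# Inspect the scheduler's view of the replication jobs; repeated
# started/stopped events in each job's history confirm rotation.
curl -s localhost:15984/_scheduler/jobs
```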

Sitting at idle, with 0 documents in the database, this is currently showing a steady state of ~50-60 TIME_WAIT sockets. Looking at the output of netstat -an | grep 15984, all of these sockets show port 15984 as the destination. Example:

tcp        0      0 127.0.0.1:46031         127.0.0.1:15984         TIME_WAIT

All of the TIME_WAIT sockets are client-side; that is, I never see 127.0.0.1:15984 in the source column of the netstat output unless I kill CouchDB (obviously).
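That steady state is consistent with simple arithmetic: the scheduler abandons roughly one connection per second (interval = 1000 ms), and each closed socket lingers in TIME_WAIT for the kernel's fixed period, 60 s on Linux (TCP_TIMEWAIT_LEN; an assumption about the test host). A sketch of the expected plateau:

```shell
# Expected TIME_WAIT plateau: one abandoned socket per scheduler rotation,
# each lingering for the kernel's TIME_WAIT period.
rotations_per_sec=1   # replicator interval = 1000 ms
time_wait_secs=60     # Linux TCP_TIMEWAIT_LEN (2*MSL)
echo $(( rotations_per_sec * time_wait_secs ))
```

which matches the observed 50-60 sockets.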


Some interesting things to try at this point:

  • In a separate window, perturb the abc database by adding a single document:
curl -X POST -H "Content-Type: application/json" localhost:15984/abc -d '{"field": "1"}'
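The single POST above can also be wrapped in a loop to perturb the database continuously (a sketch; the one-document-per-second rate and the count of 10 are arbitrary):

```shell
# Insert one small document per second into the shared source database.
for i in $(seq 1 10); do
  curl -s -X POST -H "Content-Type: application/json" \
    localhost:15984/abc -d "{\"field\": \"$i\"}"
  sleep 1
done
```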

What I'm seeing locally for the TIME_WAIT socket count is, with either approach:

Unpatched (CouchDB master)

  • Starts at ~0, steady state.
  • After perturbation, slowly grows to 50-60 (consistent with a 2MSL of 50-60s)
  • Never drops back down to 0, even after waiting 5 minutes.

Patched (Forced one-shot replications)

  • Starts at ~6-12, steady state.
  • After perturbation, stays around 6-12, steady state.
  • Never drops back down to 0, even after waiting 5 minutes.