Document best configuration settings for low-spec HW #4211
Comments
We are also experiencing slowness while running CouchDB inside a Docker instance, especially on a blockchain network where CouchDB is used. FWIW, here's the reference: https://jira.hyperledger.org/browse/FAB-12023. Are there any settings you would recommend? |
@wohali @harsha544 |
@denyeart Thanks, though I see you reverted from 2.2.0 to 2.1.1, which could have a huge impact too. Would you consider a PR against apache/couchdb-docker with this change? |
also @kocolosk see above |
So that change goes directly against the comment
Was that comment just incorrect? It's been hanging around the codebase approximately forever - I tracked it back to apache/couchdb-docker@80b3bb4137. |
I experimented a bit more with this and I suspect we're carrying around some dead weight. The
I don't know what standard we used to come up with these settings but e.g. the I don't have a proper test matrix of Docker engine versions etc. to see if there was a point in time at which these settings were needed. I've also not tried Windows or anything else exotic. |
Thanks for looking into it @kocolosk . We were unsure how many of the grant statements were actually needed so just followed the precedent from the CouchDB docker-entrypoint.sh. We were quite surprised how long the grant statements took during container startup. On good hardware it seems to delay container startup by 5 seconds and on poor hardware it could delay over a minute. This delay is completely gone for us after moving the grants to Dockerfile. @wohali we are not experts in this space so I'm hesitant to make the contribution. Plus it sounds like CouchDB needs to identify exactly which grant statements are in fact needed. Could you spawn a separate issue for your experts to look into it? BTW, we are only moving to 2.1.1 since that is the release that we have system tested the most with our application. We will indeed move up to 2.2.0 after some more testing on it. |
This is the wrong place to have this discussion, as the issue being discussed is slow startup times for Docker. Moving this discussion to apache/couchdb-docker#109 |
In my experience, the biggest problem with CouchDB on Docker, or even on my laptop, is the q and n values in the clustering section of the configuration. @janl pointed me in this direction during an IRC debug session and it has helped tremendously. There is some documentation here: But I am still not quite sure how best to set up the q and n values for smaller standalone instances. Here's my simplistic logic (call it cowboy configuration for CouchDB), which I would love to have checked/improved:
(I had to do some research to understand what CPU cores means relative to the output of |
@mikeymckay If you're only running a single server, you are by your very nature As for As always, before putting anything into production, test. :) |
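As a rough illustration of the q and n values discussed above (not taken from this thread): besides the `[cluster]` section of the configuration, CouchDB 2.x also accepts `q` and `n` as query parameters when a database is created with `PUT /{db}`. The sketch below only builds such a request in Python and does not send it. The host, the database name, and the values `q=1`, `n=1` are assumptions chosen for a small single-node setup, and `build_create_db_request` is a hypothetical helper, not part of any CouchDB client.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_create_db_request(base_url, db_name, q, n):
    """Build (but do not send) a PUT request that creates a database
    with an explicit shard count (q) and replica count (n)."""
    if q < 1 or n < 1:
        raise ValueError("q and n must be positive integers")
    query = urlencode({"q": q, "n": n})
    # Database names may contain characters such as "/" that need escaping.
    url = f"{base_url.rstrip('/')}/{quote(db_name, safe='')}?{query}"
    return Request(url, method="PUT")


# Assumed values for illustration only: local node, one shard, one copy.
req = build_create_db_request("http://127.0.0.1:5984", "example_db", q=1, n=1)
print(req.get_method(), req.full_url)
```

Actually sending the request (for example with `urllib.request.urlopen`) would also need admin credentials, which are omitted here. Whatever values you pick, test them against your own workload first.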
Thanks @wohali, this is interesting and helpful. There are thousands and thousands of use cases for CouchDB and at least as many good configurations per scenario. I have been working with CouchDB for almost 10 years, and I still don't have a good sense of an optimal deployment strategy for my sort of projects (10-100 PouchDB clients filter-replicating data to/from CouchDB a few times per day, with somebody analyzing the resulting data in CouchDB a few times per week). I started with single-server CouchDB, I haven't seen any data about the performance advantages of multiple servers (redundancy is nice, but daily snapshots are usually enough for me), and orchestrating multiple servers is hard (Docker could simplify this, but I and others have seen poor performance, up to total lockup of the machine, on Docker). So I've just been using a single CouchDB server, even though I am now deploying CouchDB 2.3.0. Going back to the title of this issue: what is the ideal configuration for low-spec HW? Or, rather than saying low-spec hardware, what about saying a $45/month (arbitrarily chosen) hardware budget? Is a single instance (with the right q value) close enough to ideal? Will performance be substantially different if you get three $15/month servers and set them up as a cluster? Given the availability of low-cost ARM servers (on AWS or even Raspberry Pi stacks), are 5 servers better than 1 that's five times as fast? Given Erlang's focus on parallelization (and the idea behind map/reduce in general), I have a hunch that more cheap servers will give more performance per dollar than a single beefier server, but I don't know how much or whether it is worth the added complexity. Has anybody tried to benchmark this? |
@mikeymckay again, the problem is one of volume and type of load. CouchDB performance is very load-dependent. You need a very different setup if you're doing 100s of simultaneous filtered replications vs. 1000s of PUTs per second vs. 1000s of GETs per second on views whose definitions update semi-often. Running multiple servers is more about guaranteeing availability when a node fails (and nodes WILL fail), though obviously the aggregate bandwidth of 3 nodes is greater than that of a single node (assuming the load balancer ahead of the cluster is capable of handling that bandwidth). It's also highly recommended to terminate SSL ahead of CouchDB at the load balancer, which could be haproxy running on the same machine, sure. Erlang keeps getting SSL wrong; in fact, we've just had to blacklist Erlang versions 21.2 through 21.2.2 because SSL is completely busted in those versions. (This has happened multiple times in the past.) So I guess we're looking at multiple best configuration settings here for low-spec HW, depending on whether you're running single-node or cluster, and depending on whether you're replication-heavy, view-heavy, write-heavy, or some combination of the above. |
Are there any drafts on this? We're running on a very low-power host (800 MHz single core / 512 RAM). We are idling fine, but occasionally some operations (especially replication after long periods without upstream connectivity) are crippling our system. If a draft isn't available, what are the key configuration options we should be looking at most to limit processes, network connections, and CPU consumption? |
There have been a series of issues raised on apache/couchdb (see #1341 and linked issues for instance) on how to run CouchDB 2.x on low-spec hardware, such as single-core Docker instances, or on a Raspberry Pi 1.
The current default settings for CouchDB don't make it easy to do this.
Documenting some of the approaches to simplify this configuration would be best.