Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

increase postgres max_connections above 100 connections #2740

Merged
merged 7 commits into from
Mar 6, 2024

Conversation

erfantkerfan
Copy link
Contributor

In my case, the default 100 connections in Postgres is hitting its limit and needs to be set to a higher value.
Since other people might need to change this, appending it to the Postgres flags seems only logical.

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. and is gonna need some rights from me in order to utilize my contributions in this here PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.

@BYK
Copy link
Member

BYK commented Jan 28, 2024

I think the better approach would be to take the value from an env variable and default to 100

@erfantkerfan
Copy link
Contributor Author

erfantkerfan commented Jan 30, 2024

I made the changes you suggested.
Can you confirm?
@BYK

@BYK
Copy link
Member

BYK commented Jan 30, 2024

@erfantkerfan I'd add the default value (100 btw not 300 here: https://github.com/getsentry/self-hosted/blob/master/.env) and this should be good to go?

Copy link
Collaborator

@aminvakil aminvakil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is necessary, how many active connections does your postgres currently have?

@erfantkerfan
Copy link
Contributor Author

erfantkerfan commented Jan 30, 2024

@erfantkerfan I'd add the default value (100 btw not 300 here: https://github.com/getsentry/self-hosted/blob/master/.env) and this should be good to go?

Done. Changed to the default 100.

@erfantkerfan
Copy link
Contributor Author

I don't think this is necessary, how many active connections does your postgres currently have?

I'm getting this error:

I don't think this is necessary, how many active connections does your postgres currently have?

I'm getting the exact error that tells me I have used all my Postgres connections, also I'm using latest self-host sentry version and this change fixed my problem for when we had a very high load of sentry requests.

@BYK
Copy link
Member

BYK commented Jan 30, 2024

@erfantkerfan you still did not add it to the .env file? Also I'd defer to @aminvakil for further discussion

@aminvakil
Copy link
Collaborator

Please paste the error, also please paste the command used and your active connections.

I still think this is a XY problem, could you please state what do you mean by high load? How many requests per second?

@erfantkerfan
Copy link
Contributor Author

erfantkerfan commented Jan 31, 2024

Please paste the error, also please paste the command used and your active connections.

I still think this is a XY problem, could you please state what do you mean by high load? How many requests per second?

Let me recap, I'm having about 50 req/s in normal, but in high traffic hours I'm getting more than 120 req/s and I sometimes I was getting 500 status codes from sentry, after reading sentry logs I found out that sentry is having problems connection to Postgres, after reading Postgres logs I saw this error message:
sorry, too many clients already
My solution was increasing the connection limit of Postgres, since this might happen to other people, I pushed my changed to your upstream.
@aminvakil

you still did not add it to the .env file?

No, I think the default 100 will be sufficient for most people, but for others that might encounter this problem finding out this .env would be easy (since it's present in docker-compose.yml), But if you insist on adding it to the sample env file It's okay.
@BYK

@aminvakil
Copy link
Collaborator

aminvakil commented Jan 31, 2024

Please paste the error, also please paste the command used and your active connections.
I still think this is a XY problem, could you please state what do you mean by high load? How many requests per second?

Let me recap, I'm having about 50 req/s in normal, but in high traffic hours I'm getting more than 120 req/s and I sometimes I was getting 500 status codes from sentry, after reading sentry logs I found out that sentry is having problems connection to Postgres, after reading Postgres logs I saw this error message: sorry, too many clients already My solution was increasing the connection limit of Postgres, since this might happen to other people, I pushed my changed to your upstream. @aminvakil

Postgres is not good in handling lots of connections, that's why lots of postgresql connection poolers exist and they are getting used in production.

Also we're handling about 800 rps without having to tweak postgresql connections, that's why I don't think your "high load" suffices the reason to merge this PR.

you still did not add it to the .env file?

No, I think the default 100 will be sufficient for most people, but for others that might encounter this problem finding out this .env would be easy (since it's present in docker-compose.yml), But if you insist on adding it to the sample env file It's okay. @BYK

Ok, aside from merging this or not, this has to be added in .env, otherwise why are we having an environment variable at all in the first place?

@BYK
Copy link
Member

BYK commented Jan 31, 2024

@erfantkerfan - I want it in the .env file mostly for discoverability reasons. Otherwise you are right, technically we do not need to define it since we have the fallback value in place.

@aminvakil

Postgres is not good in handling lots of connections, that's why lots of postgresql connection poolers exist and they are getting used in production.

Yes but if we say this, then I'd argue this repo should also bake these kinds of things in such as pgbouncer. Or at least provide easy ways to plug them in when needed. Right now we have none of that so I think this patch is a move in the right direction regarding that.

Also we're handling about 800 rps without having to tweak postgresql connections, that's why I don't think your "high load" suffices the reason to merge this PR.

This discrepancy is worth exploring but I'm not sure "we can handle this without changing the setting" is a great argument against adding the flex point to change the setting easily when people need it.

@aminvakil
Copy link
Collaborator

aminvakil commented Jan 31, 2024

@aminvakil

Postgres is not good in handling lots of connections, that's why lots of postgresql connection poolers exist and they are getting used in production.

Yes but if we say this, then I'd argue this repo should also bake these kinds of things in such as pgbouncer. Or at least provide easy ways to plug them in when needed. Right now we have none of that so I think this patch is a move in the right direction regarding that.

Correct.

Also we're handling about 800 rps without having to tweak postgresql connections, that's why I don't think your "high load" suffices the reason to merge this PR.

This discrepancy is worth exploring but I'm not sure "we can handle this without changing the setting" is a great argument against adding the flex point to change the setting easily when people need it.

My argument is that we should make sure that the problem is postgresql max connections prior to adding a configuration for it. For example I think that maybe OP has created too many workers, therefore opening too many connections to postgresql.

From my experience when you add these limits configuration, many users just raise them blindly which results in making their instance perform worse.

See hundreds of tutorial which just disable kernel limits of open files for really no reason, right now I see this PR going that way.

@BYK
Copy link
Member

BYK commented Jan 31, 2024

My argument is that we should make sure that the problem is postgresql max connections prior to adding a configuration for it. For example I think that maybe OP has created too many workers, therefore opening too many connections to postgresql.

Ah, that's a fair point. @erfantkerfan what else did you change from the default configuration we provided?

@erfantkerfan
Copy link
Contributor Author

erfantkerfan commented Feb 1, 2024

My argument is that we should make sure that the problem is postgresql max connections prior to adding a configuration for it. For example I think that maybe OP has created too many workers, therefore opening too many connections to postgresql.

Ah, that's a fair point. @erfantkerfan what else did you change from the default configuration we provided?

That's a fair point. But in the case of Postgres I have to disagree, most of the time it fails at handling large amount of connections.
Basically I'm using the default config, here is my diff:

diff --git a/.env b/.env
index 5fcf6ca..9dc4cf9 100644
--- a/.env
+++ b/.env
@@ -1,5 +1,5 @@
 COMPOSE_PROJECT_NAME=sentry-self-hosted
-SENTRY_EVENT_RETENTION_DAYS=90
+SENTRY_EVENT_RETENTION_DAYS=45
 # You can either use a port number or an IP:PORT combo for SENTRY_BIND
 # See https://docs.docker.com/compose/compose-file/#ports for more
 SENTRY_BIND=9000

I want it in the .env file mostly for discoverability reasons. Otherwise you are right, technically we do not need to define it since we have the fallback value in place.

I added it to the .env with the default 100 limit too in the hope that this PR seems okay now.
@BYK

@aminvakil
Copy link
Collaborator

Please paste the error, also please paste the command used and your active connections.

cd YOUR_SELF_HOSTED_INSTANCE
docker compose exec postgres su postgres
psql
SELECT COUNT(datid) FROM pg_stat_activity WHERE state = 'active';

@erfantkerfan
Copy link
Contributor Author

erfantkerfan commented Feb 3, 2024

SELECT COUNT(datid) FROM pg_stat_activity WHERE state = 'active';

In non-peak hours:

 count 
-------
    11
(1 row)

In peak hours:

 count 
-------
    44
(1 row)

In the past couple of days, I have not found connections above 100, (It's a case that only happens about once or twice a month)
Also, I think this query would be more suitable since we are being limited by number of connections not the actual load on the Postgres

SELECT COUNT(datid) FROM pg_stat_activity;

@aminvakil

@aminvakil
Copy link
Collaborator

SELECT COUNT(datid) FROM pg_stat_activity WHERE state = 'active';

In non-peak hours:

 count 
-------
    11
(1 row)

In peak hours:

 count 
-------
    44
(1 row)

Please be precise, are these the output of the command I've pasted or something else?

In the past couple of days, I have not found connections above 100, (It's a case that only happens about once or twice a month)

How do you know this?

Also, I think this query would be more suitable since we are being limited by number of connections not the actual load on the Postgres

Can you elaborate this sentence? Is this PR about number of connections or "actual load on the Postgres"? Also what does max_connections which we're talking about in this PR changes all connections (active + idle + ?) or only active connections?

SELECT COUNT(datid) FROM pg_stat_activity;

@aminvakil

And please do not mention me on every single comment, I'm watching this PR, thanks.

@erfantkerfan
Copy link
Contributor Author

Please be precise, are these the output of the command I've pasted or something else?

Yes sir.

In the past couple of days, I have not found connections above 100, (It's a case that only happens about once or twice a month)

How do you know this?

How do I know I have not reached above 100 connections? Because in this case I get logs in container output which I can grep and find them.

How do I know this happens once or twice a month? Because I get 500 error this often from sentry and after investigating I see this max_connections error both in Sentry container and Postgres logs.

Can you elaborate this sentence? Is this PR about number of connections or "actual load on the Postgres"? Also what does max_connections which we're talking about in this PR changes all connections (active + idle + ?) or only active connections?

From what I understand from the docs and some blogs, this value changes the combined number of connections, which is:

SELECT COUNT(datid) FROM pg_stat_activity;

@aminvakil
Copy link
Collaborator

I still cannot see why this is the only instance which needs this fix, and explained my arguments on not merging this PR in #2740 (comment) .

I don't see any added value from OP in later comments to change my mind about this.

But I'm not a maintainer of this project, @hubertdeng123 please do as you see fit, I've made my arguments.

@hubertdeng123
Copy link
Member

Just reading up here, and I see valid points from both sides. FWIW, I don't think adding more connections is the best way to go about things when hitting the connection limit since it can result in spikes in resources, but I also don't see the harm in adding this as a configurable option and surfacing that in the .env file. If this works as a solution for OP here, it may be useful to others as well.

Before we merge this, I do think it may be helpful to add a comment along the lines warning users of additional CPU/RAM usage if the maximum connections in postgres is raised. Could you do that @erfantkerfan?

@erfantkerfan
Copy link
Contributor Author

@hubertdeng123
Sure, it's done.

.env Outdated Show resolved Hide resolved
.env Outdated Show resolved Hide resolved
erfantkerfan and others added 2 commits March 2, 2024 12:11
Co-authored-by: Amin Vakil <[email protected]>
Co-authored-by: Amin Vakil <[email protected]>
@hubertdeng123 hubertdeng123 merged commit 7691add into getsentry:master Mar 6, 2024
6 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Mar 22, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants