Configurable chunkreader buffer size #10
Conversation
Signed-off-by: Artemiy Ryabinkov <[email protected]>
I decided to ask more about the reasons for the buffer size, and an interesting discussion began: https://www.postgresql.org/message-id/0cdc5485-cb3c-5e16-4a46-e3b2f7a41322%40ya.ru
Very interesting info on the buffer sizes. I definitely want to support this use case. But I'm not sure if directly exposing a chunkreader option is the best approach. I'm thinking that we should be able to inject a constructor function for either the chunkreader or the pgproto3.Frontend (#8). Among other things, this would enable alternate implementations of chunkreader that reuse the receive buffer. That is a potential foot-gun with little performance benefit for normal usage, but it could be valuable in narrow circumstances. So I like the idea of allowing it via injecting an alternative implementation rather than supporting it directly.
To be honest, when I read through the code of the driver, I had a thought: "hmm, what if we reuse all allocated objects during rows.Next and rows.Scan?" I believe that could greatly improve driver performance. You are right about the "potential foot-gun", but if it is hidden from the end user, then we will have no problems at all)
What do you think about replacing the frontend with an interface:

```go
type Frontend interface {
	Receive() (pgproto3.BackendMessage, error)
}

type PgConn struct {
	...
	frontend Frontend
	...
}
```

and letting the config supply a constructor:

```go
type BuildFrontendFunc func() Frontend

type Config struct {
	...
	BuildFrontendFunc BuildFrontendFunc
	...
}
```

Btw, just wanna ask: why is a pointer used for the config property?
When I was originally looking into it, I hacked together a crude version that reused memory without any safety guarantees, just to see what the potential performance benefits would be. I used a benchmark of a simple select. IIRC it was around 1% improved throughput. It was cool to see 0 allocs per test though.

As for wrapping the complexity internally -- I won't say it is impossible -- but at least it would be very challenging to do without introducing other performance regressions. That is because the current model reduces small allocations and copies at the expense of a smaller number of large allocations. Every block of memory that a chunk reader allocates is never reused. This means that it is never necessary to do a copy or additional allocation for the message bodies. To reuse the receive buffer safely would require always copying the received bytes.

So where I really see it as possible is on an opt-in basis: safe by default, with no memory reuse, but with a special method on the chunk reader by which the application code promises it will definitely never try to use anything that has previously been returned from Receive. To be honest, the place where I really see it being useful and safe would be when
I think this is a good idea.
There won't be any measurable performance impact either way. There's only one per connection. But the reason to use a pointer instead of a value is the
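The practical difference between the two is easy to demonstrate with a standalone sketch (the `Config` struct below is a stand-in, not pgconn's real one): a `*Config` shares state between the caller and whatever holds it, while a value copy does not.

```go
package main

import "fmt"

// Config is a stand-in struct, not pgconn's real Config.
type Config struct {
	MinReadBufferSize int
}

// byValue receives its own copy; mutations are invisible to the caller.
func byValue(c Config) { c.MinReadBufferSize = 1 << 20 }

// byPointer shares the caller's struct; mutations are visible.
func byPointer(c *Config) { c.MinReadBufferSize = 1 << 20 }

func main() {
	c := Config{MinReadBufferSize: 8192}
	byValue(c)
	fmt.Println(c.MinReadBufferSize) // 8192: only the copy was mutated
	byPointer(&c)
	fmt.Println(c.MinReadBufferSize) // 1048576: shared state was mutated
}
```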
Signed-off-by: Artemiy Ryabinkov <[email protected]>
Thank you for the good explanation about allocs) Now I see that with memory reuse, "CommandComplete", "DataRow", "Authentication", etc. would require an additional allocation during parsing. As I also see, the higher-level code uses only the CommandTag data; all other byte buffers don't leave pgconn. Maybe only strings built as

So, if CommandTag is all that we need for the higher-level code, then we can prealloc all command strings and put them in

But I think we can leave this discussion and real benchmarks for another issue) Regarding this PR, I started to use
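One way to prealloc the command strings, sketched here as a hypothetical (the map and function names are invented, not pgconn's API): keep a table of frequent tags and look it up keyed by the raw bytes. Go compilers optimize map lookups of the form `m[string(b)]` so the conversion does not allocate on the hit path.

```go
package main

import "fmt"

// knownTags is a hypothetical preallocated table of frequent command
// tags; real code would cover the common CommandComplete values.
var knownTags = map[string]string{
	"SELECT 1":   "SELECT 1",
	"INSERT 0 1": "INSERT 0 1",
	"COMMIT":     "COMMIT",
}

// commandTagFromBytes returns a preallocated string for known tags,
// avoiding a per-message allocation when parsing CommandComplete.
func commandTagFromBytes(b []byte) string {
	if s, ok := knownTags[string(b)]; ok { // lookup does not allocate
		return s // shared, preallocated string
	}
	return string(b) // rare tag: fall back to a fresh allocation
}

func main() {
	recv := []byte("COMMIT") // pretend this slice is the receive buffer
	fmt.Println(commandTagFromBytes(recv)) // COMMIT
}
```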
Signed-off-by: Artemiy Ryabinkov <[email protected]>
…s-ci. Signed-off-by: Artemiy Ryabinkov <[email protected]>
Travis has some problems downloading deps: https://travis-ci.org/jackc/pgconn/jobs/569245290#L455
So, I decided to optimize travis.yml for working with go mod, using GOPROXY and caching artifacts.
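A hedged sketch of what such a travis.yml tweak might look like -- the exact file in this PR may differ, and the Go version and proxy URL here are illustrative:

```yaml
language: go

go:
  - 1.12.x

env:
  # Force module mode and fetch dependencies through the module proxy
  # instead of cloning VCS repos directly.
  - GO111MODULE=on GOPROXY=https://proxy.golang.org

# Cache the module download directory so Travis does not re-fetch
# every dependency on each build.
cache:
  directories:
    - $GOPATH/pkg/mod
```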
Thanks! LGTM.
This PR adds a `MinReadBufferSize` option to `pgconn.Config`. This option is used to configure `chunkreader.Config.MinBufLen`.

I made some benchmarks to measure the impact of buffer size on memory and performance.
```
$ go test -v -run=XXX -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/furdarius/pgxexperiments/bufsize
BenchmarkBufferSize/4KB       5   315763978 ns/op   53112832 B/op   12967 allocs/op
BenchmarkBufferSize/8KB       5   300140961 ns/op   53082521 B/op    6479 allocs/op
BenchmarkBufferSize/16KB      5   298477972 ns/op   52910489 B/op    3229 allocs/op
BenchmarkBufferSize/1MB       5   299602670 ns/op   52848230 B/op      50 allocs/op
PASS
ok      github.com/furdarius/pgxexperiments/bufsize     10.964s
```
The benchmarks show that performance does not change much, but the number of memory allocations decreases as the buffer size grows.
CPU profiles also show similar results: the read syscall takes the most time.
8KB_bufsize_cpuprof.pdf
1MB_bufsize_cpuprof.pdf
Using `strace` we can see that the application tries to read up to `1MB` into the buffer, but the OS returns only `8192` bytes in most cases.

With `tcpdump` we can confirm that the network packets have size `8192`.

So, with a well-tuned networking stack, the limit is 8KB. The reason is the hardcoded size of the Postgres write buffer: https://github.com/postgres/postgres/blob/249d64999615802752940e017ee5166e726bc7cd/src/backend/libpq/pqcomm.c#L134
We would see the same results if we used PgBouncer with an increased `pkt_buf`.

But I think the best option for a Postgres instance configured for OLTP, with PgBouncer as the connection pooler, would be to increase both the `pkt_buf` value and the driver buffer. So, if PgBouncer has a configurable buffer, then the driver must also have a configurable buffer. Fine-tuning is not possible otherwise)

RelatedTo: jackc/pgx#502