Configurable chunkreader buffer size #10

furdarius · 2019-07-26T14:21:17Z

This PR adds MinReadBufferSize option to pgconn.Config. This options used to configure chunkreader.Config.MinBufLen.

I made some benchmarks to measure the impact of buffer size on memory and performance.

$ go test -v -run=XXX -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/furdarius/pgxexperiments/bufsize
BenchmarkBufferSize/4KB         	       5	 315763978 ns/op	53112832 B/op	   12967 allocs/op
BenchmarkBufferSize/8KB         	       5	 300140961 ns/op	53082521 B/op	    6479 allocs/op
BenchmarkBufferSize/16KB        	       5	 298477972 ns/op	52910489 B/op	    3229 allocs/op
BenchmarkBufferSize/1MB         	       5	 299602670 ns/op	52848230 B/op	      50 allocs/op
PASS
ok  	github.com/furdarius/pgxexperiments/bufsize	10.964s

Benchmarks shows that perfomance not changing so much, but the number of memory allocations decreases depending on the size of the buffer.

Cpu profiles also show similar results: read syscall takes the most time.
8KB_bufsize_cpuprof.pdf
1MB_bufsize_cpuprof.pdf

Using strace we can see, that application try to read up to 1MB in the buffer, but OS returns only 8192 bytes in most cases:

[pid 23127] connect(3, {sa_family=AF_INET, sin_port=htons(5432), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
[pid 23126] read(3, "c3a2692a0e02bc7f604d6d7056b057fD"..., 201990) = 8192
[pid 23126] read(3, "2D\0\0\0004\0\2\0\0\0\006665751\0\0\0 753c4e1152"..., 193798) = 8192
[pid 23126] read(3, 0xc0001ceafa, 185606) = -1 EAGAIN (Resource temporarily unavailable)
[pid 23126] read(3, "b79428ccdf1fc915fc9c7abdD\0\0\0004\0\2\0"..., 185606) = 8192
[pid 23126] read(3, "\2\0\0\0\006666060\0\0\0 d2a60043d3adfe6cc"..., 177414) = 8192
[pid 23126] read(3, "476dc1abcf7e7b136D\0\0\0004\0\2\0\0\0\0066662"..., 169222) = 1421

With tcpdump we can confirm, that network packets have size 8192:

21:56:06.373948 IP localhost.postgresql > localhost.56868: Flags [P.], seq 65652109:65660301, ack 23, win 86, options [nop,nop,TS val 2015091686 ecr 2015091686], length 8192
21:56:06.373951 IP localhost.56868 > localhost.postgresql: Flags [.], ack 65660301, win 2039, options [nop,nop,TS val 2015091686 ecr 2015091686], length 0
21:56:06.374006 IP localhost.postgresql > localhost.56868: Flags [P.], seq 65660301:65668493, ack 23, win 86, options [nop,nop,TS val 2015091686 ecr 2015091686], length 8192
21:56:06.374061 IP localhost.postgresql > localhost.56868: Flags [P.], seq 65668493:65676685, ack 23, win 86, options [nop,nop,TS val 2015091686 ecr 2015091686], length 8192
21:56:06.374064 IP localhost.56868 > localhost.postgresql: Flags [.], ack 65676685, win 2039, options [nop,nop,TS val 2015091686 ecr 2015091686], length 0
21:56:06.374120 IP localhost.postgresql > localhost.56868: Flags [P.], seq 65676685:65684877, ack 23, win 86, options [nop,nop,TS val 2015091687 ecr 2015091686], length 8192

So, with well-tuned networking stack, the limit is 8KB. The reason is the hardcoded size of Postgres write buffer: https://github.com/postgres/postgres/blob/249d64999615802752940e017ee5166e726bc7cd/src/backend/libpq/pqcomm.c#L134

Same results we will see, If to use pgBouncer with increased pkt_buf.

But, I think the best option for postgres instance configured for OLTP with pgbouncer as connection pooler would be to increase pkt_buf value and driver buffer. So, if PgBouncer have configurable buffer, then driver must also has configurable buffer. Fine-tuning is not possible otherwise)

RelatedTo: jackc/pgx#502

Sync

Signed-off-by: Artemiy Ryabinkov <[email protected]>

furdarius · 2019-07-29T17:43:28Z

I decided to ask more about the reasons for the buffer size and an interesting discussion began: https://www.postgresql.org/message-id/0cdc5485-cb3c-5e16-4a46-e3b2f7a41322%40ya.ru

jackc · 2019-08-03T16:03:29Z

Very interesting info on the buffer sizes.

I definitely want to support this use case. But I'm not sure if directly exposing a chunkreader option is the best approach.

I'm thinking that we should be able to inject a constructor function for either the chunkreader or the pgproto3.Frontend (#8).

Among other things this would enable alternate implementations of chunkreader to support reusing the the receive buffer. This is a potential foot-gun a minute with little performance benefit for normal usage, but could be valuable in narrow circumstances. So I like the idea of allowing it via injecting an alternative implementation rather than supporting it directly.

furdarius · 2019-08-04T15:07:21Z

reusing the receive buffer.

To be honest, when I read through the code of driver, I had a thought "hmm, what if reuse all allocated objects during rows.Next and rows.Scan". I believe that it could highly improve driver performance.

You are right about "potential foot-gun", but if it will be hidden from end-user, then we will no have problems at all)

I'm thinking that we should be able to inject a constructor function for either the chunkreader or the pgproto3.Frontend

If we replace *pgproto3.Frontend to Frontend interface, that would be a great step to allow reusing allocated buffers when we read data from the connection.

What you think about using Frontend interface inside PgConn struct.

type Frontend interface {
	Receive() (pgproto3.BackendMessage, error)
}

type PgConn struct {
    ...
    frontend Frontend
    ...
}

And use BuildFrontendFunc inside Config struct.

type BuildFrontendFunc func() Frontend

type Config struct {
    ...
    BuildFrontendFunc      BuildFrontendFunc
    ...
}

Btw, just wanna ask, why pointer used for config property (Config *Config) inside PgConn? I didn't check to escape analysis traces and don't sure about performance benefit, but best practices say to use value instead of pointer where possible.

jackc · 2019-08-05T22:51:32Z

reusing the receive buffer.

To be honest, when I read through the code of driver, I had a thought "hmm, what if reuse all allocated objects during rows.Next and rows.Scan". I believe that it could highly improve driver performance.

You are right about "potential foot-gun", but if it will be hidden from end-user, then we will no have problems at all)

When I was originally looking into it I hacked a crude version that reused memory without any safety guarantees just to see what the potential performance benefits would be. I used a benchmark of a simple select. IIRC it was around 1% improved throughput. It was cool to see 0 allocs per test though.

As for wrapping the complexity internally -- I won't say it is impossible -- but at least it would be very challenging to do without introducing other performance regressions. That is because the current model reduces small allocations and copies at the expense of a smaller number of large allocations. Every block of memory that a chunk reader allocates is never reused. This means that it is never necessary to do a copy or additional allocation for []byte data. Scanning into a *[]byte only has to update the slice. pgproto3 also takes advantage of this -- in particular to avoid an allocation with the command tag.

To reuse the receive buffer safely would require always copying []byte data out (presumably incurring a malloc as well) or for the higher code to signal when it is done with the memory and it is safe to reuse. The first approach would almost certainly have an overall negative performance effect and the second is the huge foot-gun if you don't really understand the internals of pgconn and pgproto3.

So where I really see it possible is on an opt-in basis for where it is safe by default and doesn't reuse any memory, but there is a special method that could be called on the chunk reader that promises that the application code will definitely never try to use anything that has previously been returned from pgconn.

To be honest the place that I really see it being useful and safe would be when pgproto3 is being used directly to implement a PG proxy or load balancer. Once a message is relayed it would definitely not be used again, so it would be possible to safely reuse memory.

If we replace *pgproto3.Frontend to Frontend interface, that would be a great step to allow reusing allocated buffers when we read data from the connection.

What you think about using Frontend interface inside PgConn struct.

And use BuildFrontendFunc inside Config struct.

I think this is a good idea.

Btw, just wanna ask, why pointer used for config property (Config *Config) inside PgConn? I didn't check to escape analysis traces and don't sure about performance benefit, but best practices say to use value instead of pointer where possible.

There won't be any measurable performance impact either way. There's only one per connection. But the reason to use a pointer instead of a value is the Config struct contains pointers itself. So using a value would make it appear to create a copy of the struct but it would only do a shallow copy. That could be very confusing.

Signed-off-by: Artemiy Ryabinkov <[email protected]>

furdarius · 2019-08-08T09:40:50Z

Thank you for good explanation about allocs) Now I see, that in case of memory reusing, "CommandComplete", "DataRow", "Authentication", etc. will require additional allocation during parsing. As I also see, higher code use only CommandTag data, all another bytes buffers don't leave pgconn. Maybe only strings built as string(src[rp : len(src)-1]) will leave pgconn packet, but idk about allocations here, cuz in a modern version of Go, compiler has some kind of optimizations for this case. (Benchmarks needed)

So, if CommandTag is all that we need for higher code, then we can prealloc all commands strings and put them in CommandTag property.

But, I think we can leave this discussion and real benchmarks for another issue)

Regarding this PR, I started to use BuildFrontendFunc, now this part is fully decoupled.

Signed-off-by: Artemiy Ryabinkov <[email protected]>

…s-ci. Signed-off-by: Artemiy Ryabinkov <[email protected]>

furdarius · 2019-08-08T10:31:27Z

Travis has some problems to download deps https://travis-ci.org/jackc/pgconn/jobs/569245290#L455:

go: finding github.com/pmezard/go-difflib v1.0.0
go: finding github.com/stretchr/objx v0.1.0
go: golang.org/x/[email protected]: git fetch -f origin refs/heads/*:refs/heads/* refs/tags/*:refs/tags/* in /home/travis/gopath/pkg/mod/cache/vcs/76a8992ccba6d77c6bcf031ff2b6d821cf232e4ad8d1f2362404fbd0a798d846: exit status 128:
	error: RPC failed; curl 56 GnuTLS recv error (-9): A TLS packet with unexpected length was received.
	fatal: the remote end hung up unexpectedly
	fatal: early EOF
	fatal: index-pack failed
go: error loading module requirements
The command "./travis/install.bash" failed and exited with 1 during .

So, I decided to optimize travis.yml for working with go mod, using GOPROXY and caching artifacts.
This changes allows to decrease time of build:
~ 3min before (https://travis-ci.org/jackc/pgconn/builds/568581481)
~ 1.5min after (https://travis-ci.org/jackc/pgconn/builds/569275835)

jackc · 2019-08-08T20:49:39Z

Thanks!. LGTM.

furdarius added 2 commits July 26, 2019 10:58

Merge pull request #1 from jackc/master

b599a26

Sync

Add MinReadBufferSize option to Config

fa7e064

Signed-off-by: Artemiy Ryabinkov <[email protected]>

furdarius mentioned this pull request Jul 29, 2019

Set 8KB as default size of ChunkReader buffer jackc/pgx#565

Merged

jackc added this to the v1 milestone Aug 3, 2019

Add BuildFrontendFunc in Config

0a99b54

Signed-off-by: Artemiy Ryabinkov <[email protected]>

furdarius added 2 commits August 8, 2019 12:52

Add GOPROXY to travis builds to mitigate problems with github and etc

dbb7aa8

Signed-off-by: Artemiy Ryabinkov <[email protected]>

Use go mod download to install deps on travis-ci. Add cache for travi…

c9660e3

…s-ci. Signed-off-by: Artemiy Ryabinkov <[email protected]>

jackc merged commit 0a2ed72 into jackc:master Aug 8, 2019

furdarius deleted the configurable-chunkreader-buf branch August 10, 2019 08:18

furdarius mentioned this pull request Aug 10, 2019

Review PgConn public attributes #7

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Configurable chunkreader buffer size #10

Configurable chunkreader buffer size #10

furdarius commented Jul 26, 2019 •

edited

Loading

furdarius commented Jul 29, 2019

jackc commented Aug 3, 2019

furdarius commented Aug 4, 2019

jackc commented Aug 5, 2019

furdarius commented Aug 8, 2019

furdarius commented Aug 8, 2019 •

edited

Loading

jackc commented Aug 8, 2019

Configurable chunkreader buffer size #10

Configurable chunkreader buffer size #10

Conversation

furdarius commented Jul 26, 2019 • edited Loading

furdarius commented Jul 29, 2019

jackc commented Aug 3, 2019

furdarius commented Aug 4, 2019

jackc commented Aug 5, 2019

furdarius commented Aug 8, 2019

furdarius commented Aug 8, 2019 • edited Loading

jackc commented Aug 8, 2019

furdarius commented Jul 26, 2019 •

edited

Loading

furdarius commented Aug 8, 2019 •

edited

Loading