Better load balancing of Envoys across Pilot instances #11181
The solution needs to consider:
Following a Slack conversation with @Stono, we suggest the following (a rough sketch is shown below).
The above can be run periodically or on an event (a change in the pilot count). Feedback welcome!
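For concreteness, here is a minimal Go sketch of the periodic rebalancing idea. All of the names here (conn, rebalance, totalEnvoys, replicas) are illustrative, not real Istio APIs; a real implementation would read the inputs from metrics and the pilot Deployment.

```go
package main

import "fmt"

// conn stands in for one ADS stream; Close sends a GoAway so the sidecar
// reconnects through the Service and may land on a less-loaded pilot.
type conn struct{ id int }

func (c conn) Close() { fmt.Println("draining envoy", c.id) }

// rebalance sheds connections beyond this replica's fair share.
func rebalance(totalEnvoys, replicas int, local []conn) {
	fair := totalEnvoys/replicas + 1 // allow a little slack above the mean
	for i := fair; i < len(local); i++ {
		local[i].Close()
	}
}

func main() {
	local := make([]conn, 10)
	for i := range local {
		local[i] = conn{id: i}
	}
	// Run this periodically, or on an event (a change in the pilot count).
	rebalance(12, 3, local) // fair share is 5, so envoys 5..9 are drained
}
```

Draining only the excess, rather than all connections, keeps churn bounded while still converging toward an even split.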
@elevran
After a few minutes, all sidecars are still connected to the old pilot instance.
Am I missing anything? Is there a recommended value for this parameter?
Seeing as this is enabled by default in 1.1.0 at 30 minutes, I'd really like to understand:
I'm half inclined to edit the injector config and remove it, because the potential for increased churn worries me. Also, can anyone confirm if
I think it has, but not by much: at most MaxServerConnectionAgeGrace, which defaults to 10s. IMO, this option should be disabled by default.
grpc-lb is now considered deprecated.
@morvencao what was the maximum age configured? Rebalancing won't happen before the maximum age expires, and the default (if unspecified) is infinity.
@Stono according to grpc-go/keepalive.go there is a +/-10% jitter on the configured value to avoid connection storms, so a 30 minute maximum age will spread reconnects over a 6 minute window:
// MaxConnectionAge is a duration for the maximum amount of time a
// connection may exist before it will be closed by sending a GoAway. A
// random jitter of +/-10% will be added to MaxConnectionAge to spread out
// connection storms.
// The current default value is infinity.
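For reference, this is roughly how those knobs are set on a gRPC server in Go. A minimal sketch: the 30m/10s values mirror the defaults discussed above, and port 15010 is assumed as pilot's plaintext xDS port.

```go
package main

import (
	"net"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/keepalive"
)

func main() {
	// gRPC adds +/-10% jitter to MaxConnectionAge, so a 30m age spreads
	// reconnects over a ~6 minute window. MaxConnectionAgeGrace bounds how
	// long an expired connection may linger before it is force-closed.
	srv := grpc.NewServer(grpc.KeepaliveParams(keepalive.ServerParameters{
		MaxConnectionAge:      30 * time.Minute,
		MaxConnectionAgeGrace: 10 * time.Second,
	}))
	lis, err := net.Listen("tcp", ":15010") // assumed port
	if err != nil {
		panic(err)
	}
	_ = srv.Serve(lis)
}
```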
@elevran
Is session affinity set for the istio-pilot service?
@hzxuzhonghu No
This issue has been automatically marked as stale because it has not had activity in the last 90 days. It will be closed in the next 30 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.
I've tested with 3 pilots and tens of pods in total, and
Another related problem: if you are on the border of needing 1 or 2 pilots, you get really bad behavior where the deployment keeps flipping between 1 and 2 replicas, and the pilot takes 30 minutes to fully shed its load.
This is pretty broken now, even with the max connection age.
Setting max_requests_per_connection on the Envoy (client) side seems to fix this in case (2); see the sketch below.
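For illustration, a sketch of what that knob looks like when building the xDS cluster with go-control-plane. The cluster name and the value of 1 are assumptions, and in a real deployment the setting lives in Envoy's bootstrap config rather than in Go code.

```go
package main

import (
	"fmt"

	cluster "github.com/envoyproxy/go-control-plane/envoy/config/cluster/v3"
	"google.golang.org/protobuf/types/known/wrapperspb"
)

func main() {
	// Each gRPC stream counts as one request, so capping requests per
	// connection forces Envoy to open a fresh connection periodically,
	// re-resolving through the Service and possibly landing on a
	// different pilot replica.
	c := &cluster.Cluster{
		Name:                     "xds-grpc", // assumed name
		MaxRequestsPerConnection: wrapperspb.UInt32(1),
	}
	fmt.Println("max requests per connection:", c.MaxRequestsPerConnection.GetValue())
}
```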
One question: what does one request mean? Does one xDS request count as one? Will the connection break whenever a new xDS request comes in?
A request is one gRPC stream; it's an HTTP-level setting, not an xDS one.
Got it.
* Fix load balancing of pilot connections Context: #11181 (comment) * fix repetitive code * Update goldens Co-authored-by: John Howard <[email protected]>
Just to sync up: it seems to work for me:
And hours later:
And no restarts.
istioctl version
@howardjohn I can see the default keepaliveMaxServerConnectionAge value is 30m in the istiod deployment. I have a couple of queries regarding this; can you please check the below? Thank you.
https://istio.io/latest/docs/reference/commands/pilot-discovery/
Describe the feature request
This is a continuation of #7878.
Envoys maintain long-lived connections to Pilot. In HA scenarios, instances become ready at different times, so earlier instances receive a disproportionate number of connections. This imbalance is exacerbated during rolling upgrades.
The request is to create a more balanced split of connections between Envoys and Pilot instances, one with some intelligence to balance load among all Pilot replicas.
Describe alternatives you've considered
#10838, #10870 and #11126 provide a short-term fix by capping the maximum connection age, allowing load to be rebalanced over time. This solution is not ideal:
Additional context
grpc-lb has been suggested as an alternative solution. Client-side LB distributes the balancing logic to all Envoys; in addition, it may not solve this problem, since it relies on name resolution, which would replicate the imbalance as instances come and go (see the discussion here and the sketch below).
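A minimal sketch of what that client-side approach looks like in gRPC-Go, assuming the pilot service resolves via DNS (the address below is illustrative):

```go
package main

import "google.golang.org/grpc"

func main() {
	// The "dns:///" scheme selects the DNS resolver; round_robin spreads
	// new RPCs over all resolved addresses. The caveat from the discussion
	// applies: the address list is only as fresh as name resolution, and
	// long-lived ADS streams stay pinned to the backend they were opened on.
	conn, err := grpc.Dial(
		"dns:///istio-pilot.istio-system.svc.cluster.local:15010", // assumed address
		grpc.WithInsecure(),
		grpc.WithDefaultServiceConfig(`{"loadBalancingConfig": [{"round_robin":{}}]}`),
	)
	if err != nil {
		panic(err)
	}
	defer conn.Close()
}
```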
It may be preferable to encapsulate the implementation server-side, entirely in Pilot.
This comment on the original issue provides some additional context.