Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

envoy: use upstream http filter for metadata #51206

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

kyessenov
Copy link
Contributor

@kyessenov kyessenov commented May 22, 2024

Change-Id: Idcc1f98fa1242d93fdb19c8c9838a7748856ebed

Please provide a description of this PR:

Change-Id: Idcc1f98fa1242d93fdb19c8c9838a7748856ebed
Signed-off-by: Kuat Yessenov <[email protected]>
@kyessenov kyessenov requested review from a team as code owners May 22, 2024 20:34
@istio-testing istio-testing added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 22, 2024
Copy link
Member

@howardjohn howardjohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks like a good direction, just not sure how/if it works as-is

cc @keithmattix

}
metadata := base64.RawStdEncoding.EncodeToString(bytes)
// Set metadata exchange environment variables
_ = os.Setenv("ISTIO_PEER_METADATA_ID", node.ID)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we set this on the envoy process call instead of in the ambient environment? This feels a bit sketchy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'm checking whether this is feasible - will clean up once it works.

@@ -321,6 +322,15 @@ func (a *Agent) initializeEnvoyAgent(_ context.Context) error {
}
a.envoyOpts.ConfigPath = out
a.envoyOpts.ConfigCleanup = true

bytes, err := proto.Marshal(node.Metadata.ToStruct())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we used flatbuffers? does it encode to the same or am I just wrong about that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flatbuffers is an internal representation, that we can remove without user facing changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about cross-version calls?

Copy link
Contributor Author

@kyessenov kyessenov May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We never send flatbuffer on the wire if that's the question.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why still use flat buffers then? Would be nice if we could standardize on proto if perf impact is minimal (not for this PR obviously)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was used because Wasm can't handle protos efficiently. Now that we don't recommend Wasm for telemetry, it could be a cleanup, although some users are probably still using internal flatbuffers and would be impacted - including our own SD extension.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to be breaking the internal protocol between proxies this needs broad discussion around the cost vs benefits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@howardjohn There's no difference on the network between this and before. Flatbuffers were always internal to Istio and never sent on the wire. I don't think I understand the concern here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh so the wire format is the same?

We can do

meta := ...
roundtrip := flatbuffer_unmarshal(proto_marshal(meta))
meta == roundtrip

?

If so, no concern

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, flatbuffer is just a subset of struct metadata with 4-5 fields and converted internally in Envoy from proto to flatbuffers because Wasm. We can construct the subset here as well, and not even use flatbuffers to roundtrip.

@@ -443,9 +443,5 @@ func appendMxFilter(httpOpts *httpListenerOpts, filters []*hcm.HttpFilter) []*hc
if httpOpts.class == istionetworking.ListenerClassSidecarInbound {
return append(filters, xdsfilters.SidecarInboundMetadataFilter)
}

if httpOpts.skipIstioMXHeaders {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we seem to lose this? skipIstioMXHeaders

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will add it later.

"istio_headers": map[string]any{},
},
map[string]any{
"workload_discovery": map[string]any{},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does workload_discovery work now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be moved to network metadata exchange.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine that needs to be done in this PR, yes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this PR is about sending headers, not reading them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC this removed code is what enabled workload discovery for waypoints. If so, how are we going to get that information now

"upstream_propagation": []any{
map[string]any{
"istio_headers": map[string]any{
"skip_external_clusters": true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we lose this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gets a lot easier actually because skipping is done per cluster.

SidecarOutboundMetadataFilterSkipHeaders = &hcm.HttpFilter{
Name: MxFilterName,

InjectIstioHeaders = &hcm.HttpFilter{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any performance impact? I'd imagine not but would be good to check

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think so, but we can check and probably need both ways temporarily.

@howardjohn howardjohn added the do-not-merge/hold Block automatic merging of a PR. label May 22, 2024
Change-Id: If50ce31054b5cba895c95e7767cef3deec6a96c6
Signed-off-by: Kuat Yessenov <[email protected]>
Change-Id: I4875e1bcc373570ddf8ac5a45d292660b6bbb7cc
Signed-off-by: Kuat Yessenov <[email protected]>
Change-Id: I5fd8be85a6650c233edb1ec28cd4e9e32a2d17e9
Signed-off-by: Kuat Yessenov <[email protected]>
@istio-testing istio-testing added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 22, 2024
Change-Id: I69f03d414658079bed72e1db7eee731352bccbe7
Signed-off-by: Kuat Yessenov <[email protected]>
Change-Id: Ie041c77f420927d9071755d4b413946fb98c9b2d
Signed-off-by: Kuat Yessenov <[email protected]>
@sschepens
Copy link
Contributor

Is this changing HTTP MX to use upstream filters?

I was wondering if using upstream filters could allow us to only send headers on the first request within a connection as an optimization to prevent sending MX headers on every request.

@kyessenov
Copy link
Contributor Author

@sschepens Yeah, but I think it would be better to use METADATA frames.

@sschepens
Copy link
Contributor

@sschepens Yeah, but I think it would be better to use METADATA frames.

Cool, using METADATA frames would be nice, but only be supported for h2, I guess that's fine though.

Were you planning on taking this changes as well?

@kyessenov
Copy link
Contributor Author

@sschepens First thing first, let's start using upstream HTTP filters for metadata, because there seems to be some bugs based on the results of the PR here.

Change-Id: Iab97c7e2ce047c4543501c322fa0c58cb3e2ab3a
Signed-off-by: Kuat Yessenov <[email protected]>
Change-Id: Ib28f1400c41f46d98bf93a4c0261f20667ca7224
Signed-off-by: Kuat Yessenov <[email protected]>
@istio-testing
Copy link
Collaborator

@kyessenov: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
release-notes_istio 19cbc35 link true /test release-notes
lint_istio 19cbc35 link true /test lint
unit-tests-arm64_istio 19cbc35 link true /test unit-tests-arm64
unit-tests_istio 19cbc35 link true /test unit-tests
integ-ipv6_istio 19cbc35 link true /test integ-ipv6
integ-basic-arm64_istio 19cbc35 link true /test integ-basic-arm64
integ-distroless_istio 19cbc35 link true /test integ-distroless
integ-ds_istio 19cbc35 link true /test integ-ds
integ-telemetry-istiodremote_istio 19cbc35 link true /test integ-telemetry-istiodremote
integ-telemetry_istio 19cbc35 link true /test integ-telemetry
integ-pilot-istiodremote_istio 19cbc35 link true /test integ-pilot-istiodremote
integ-telemetry-mc_istio 19cbc35 link true /test integ-telemetry-mc
integ-cni_istio 19cbc35 link true /test integ-cni
integ-pilot_istio 19cbc35 link true /test integ-pilot
integ-pilot-istiodremote-mc_istio 19cbc35 link true /test integ-pilot-istiodremote-mc
integ-pilot-multicluster_istio 19cbc35 link true /test integ-pilot-multicluster

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@istio-testing
Copy link
Collaborator

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@istio-testing istio-testing added the needs-rebase Indicates a PR needs to be rebased before being merged label Jun 20, 2024
@istio-policy-bot istio-policy-bot added the lifecycle/stale Indicates a PR or issue hasn't been manipulated by an Istio team member for a while label Jun 22, 2024
@keithmattix
Copy link
Contributor

Not stale

@istio-policy-bot istio-policy-bot removed the lifecycle/stale Indicates a PR or issue hasn't been manipulated by an Istio team member for a while label Jun 22, 2024
@istio-policy-bot istio-policy-bot added the lifecycle/stale Indicates a PR or issue hasn't been manipulated by an Istio team member for a while label Jul 23, 2024
@keithmattix
Copy link
Contributor

@kyessenov do you need some help picking this back up?

@istio-policy-bot istio-policy-bot removed the lifecycle/stale Indicates a PR or issue hasn't been manipulated by an Istio team member for a while label Jul 23, 2024
@kyessenov
Copy link
Contributor Author

@keithmattix Maybe, or we can skip this iteration and try to use "header operations" directly in the upstream HTTP filter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge/hold Block automatic merging of a PR. needs-rebase Indicates a PR needs to be rebased before being merged size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants