Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metadata: Use strings.EqualFold for ValueFromIncomingContext #6743

Merged
merged 2 commits into from
Oct 30, 2023
Merged

metadata: Use strings.EqualFold for ValueFromIncomingContext #6743

merged 2 commits into from
Oct 30, 2023

Conversation

evanj
Copy link
Contributor

@evanj evanj commented Oct 19, 2023

This is approximately 30% faster in a microbenchmark I added. The previous version using strings.ToLower() looks at every key byte. This version will typically evaluate "not equal" on the first byte.

The test and benchmark had no coverage for this loop, so I added it. I also revised the comments in FromIncomingContext and ValueFromIncomingContext. The problem is not that the MD is created using the helper functions. The MD must be attached to the context using NewIncomingContext, which could lower case the keys. The problem is that NewIncomingContext does not make a copy of its argument, so callers can mutate it after.

This change adds a test for FromIncomingContext that tests this case, and modifies the existing TestValueFromIncomingContext to test it.

Benchstat output from 10 runs of the benchmark on my laptop:

goos: darwin
goarch: arm64
pkg: google.golang.org/grpc/metadata
                                          │ before.txt  │              after.txt              │
                                          │   sec/op    │   sec/op     vs base                │
ValueFromIncomingContext/key-found-10       26.56n ± 2%   26.74n ± 2%        ~ (p=0.078 n=10)
ValueFromIncomingContext/key-not-found-10   50.48n ± 1%   36.23n ± 2%  -28.23% (p=0.000 n=10)
geomean                                     36.62n        31.13n       -15.00%

RELEASE NOTES: none

This is approximately 30% faster in a microbenchmark I added. The
previous version using strings.ToLower() looks at every key byte.
This version will typically evaluate "not equal" on the first byte.

The test and benchmark had no coverage for this loop, so I added it.
I also revised the comments in FromIncomingContext and
ValueFromIncomingContext. The problem is not that the MD is created
using the helper functions. The MD must be attached to the context
using NewIncomingContext, which could lower case the keys. The
problem is that NewIncomingContext does not make a copy of its
argument, so callers can mutate it after.

This change adds a test for FromIncomingContext that tests this case,
and modifies the existing TestValueFromIncomingContext to test it.

Benchstat output from 10 runs of the benchmark on my laptop:

goos: darwin
goarch: arm64
pkg: google.golang.org/grpc/metadata
                                          │ before.txt  │              after.txt              │
                                          │   sec/op    │   sec/op     vs base                │
ValueFromIncomingContext/key-found-10       26.56n ± 2%   26.74n ± 2%        ~ (p=0.078 n=10)
ValueFromIncomingContext/key-not-found-10   50.48n ± 1%   36.23n ± 2%  -28.23% (p=0.000 n=10)
geomean                                     36.62n        31.13n       -15.00%
@evanj
Copy link
Contributor Author

evanj commented Oct 19, 2023

I may have misunderstood the comments that I removed, so please feel free to reject/correct this. I noticed this because I was looking at profiles for a program that does a lot of tiny gRPC requests, and noticed with our various libraries, we call metadata.FromIncomingContext 3 times for each request, for a total of about 2.5% of all CPU consumed by this service. It would be nice if this could just return an immutable object instead.

I am very happy to see ValueFromIncomingContext instead! It should be a nice easy improvement across all our services. Although it is a bit unfortunate that the values are in a map that has to be linearly searched on a "not found" request.

I think it is worth considering changing NewIncomingContext and NewOutgoingContext to make a copy of their argument, because we could avoid some of these annoying lower case tricks. This could break programs that mutate the MD values after calling them though. Another alternative would be to define a new API, and provide automatic conversion to/from the old APIs. If you are interested in either of these options, I may be able to try to contribute them.

@codecov
Copy link

codecov bot commented Oct 19, 2023

Codecov Report

Merging #6743 (1270202) into master (e88e849) will increase coverage by 0.13%.
The diff coverage is 100.00%.

Additional details and impacted files

arvindbr8
arvindbr8 previously approved these changes Oct 19, 2023
Copy link
Member

@arvindbr8 arvindbr8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. LGTM as I see the benefit of using EqualFold over lowercasing every key.

However, I would && my approval with @dfawley's opinion on this patch-set.

@arvindbr8 arvindbr8 added the Type: Performance Performance improvements (CPU, network, memory, etc) label Oct 19, 2023
@arvindbr8 arvindbr8 added this to the 1.60 Release milestone Oct 19, 2023
@arvindbr8
Copy link
Member

In the future, it would be better if we have an issue to discuss proposed changes. That way we can have more elaborate discussions in the issue and keep PRs for code-related comments only.

Copy link
Member

@dfawley dfawley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, it would be better if we have an issue to discuss proposed changes. That way we can have more elaborate discussions in the issue and keep PRs for code-related comments only.

I am fine with PR-as-proposal as long as the expectations are clear that large proposal PRs might be rejected and a bunch of work could be wasted. But sometimes showing what the code will look like can be helpful for discussion.

I think it is worth considering changing NewIncomingContext and NewOutgoingContext to make a copy of their argument

The downside of this is that it punishes those that use the API correctly because of others that don't.

because we could avoid some of these annoying lower case tricks.

Not really. The need for the tricks is because the metadata type is not protected; it's a bare map that users can access. If we controlled it via accessors then we could guarantee it was always correct.

This could break programs that mutate the MD values after calling them though.

This should not happen. I don't actually see where this is documented, but modifying MD after adding it to the context is not intended.

Another alternative would be to define a new API

Yeah I think a new API would be nice, but it's not a priority for us right now, and it will ultimately complicate things for users, since this version of the API will also need to exist forever.

// created using our helper functions.
if strings.ToLower(k) == key {
// Case-insensitive compare: MD can be modified after NewIncomingContext().
if strings.EqualFold(k, key) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we should remove key must be lower-case from the godoc for the function now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point thanks for pointing it out. I've edited the comment to that effect.

One advantage to not requiring key to be lower case is it makes this function work more like other functions (most relevant here, MD.Get). A disadvantage is that it now has to always handle upper case keys, even if we eventually change the internal representation in the Context. Maybe this is an over-optimization that isn't worth worrying about.

// We need to manually convert all keys to lower case, because MD is a
// map, and there's no guarantee that the MD attached to the context is
// created using our helper functions.
// Convert keys to lower case: MD can be modified after NewIncomingContext().
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel strongly here, but the intent of the original comment was to say that .. if only our methods are used to create/modify metadata, then everything is always lowercase. But users can just do metadata.NewOutgoingContext(ctx, metadata.MD{"mY-KeY": []string{"v"}) and we can't prevent that.

MD should not be modified after NewIncomingContext. (Actually, users shouldn't need to call NewIncomingContext in the first place.) It also shouldn't be modified after calling NewOutgoingContext. I thought we documented this long ago, but I don't see it now... If it's mutated in place then very bad things could happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a sentence to NewIncomingContext and NewOutgoingContext. I also thought I had seen that documented at some point.

I reverted this comment change. The reason this confused me is that the only way the MD can get in the context is via NewIncomingContext and NewOutgoingContext, so those functions could ensure the MD is in the correct form. The problem with that is since MD is a mutable map, it could be modified later.

I'm going to experiment with this a bit and I'll come back with an issue if I think there is something that is "easy" to change here that would help avoid copies.

Comment on lines 244 to 246
// Check that we lowercase if callers modify md after NewIncomingContext
md["X-INCORRECT-UPPERCASE"] = []string{"foo"}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Callers should NOT mutate md after calling New[Incoming/Outgoing]Context. Doing so could cause races or data inconsistencies. Let's not do this in our tests since it is not supported.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, I've moved this to before the call to NewIncomingContext. It still has the same effect of requiring this loop over the map keys.

@evanj
Copy link
Contributor Author

evanj commented Oct 21, 2023

Thank you for the detailed discussion! I suspect that the ~2.5% CPU overhead I'm seeing is going to be greatly reduced when I change our internal helpers to use ValueFromIncomingContext. Unfortunately, Datadog's Go tracing library needs to iterate over all keys, so it can't use this. I'll try to find an hour next week to take a more careful look at this API. I'll file a separate issue if I think there is a "reasonable" way to evolve it in some way that might be able to avoid cloning the map on each call. Maybe a new call that returns an immutable view wrapper around the MD?

[1] https://github.com/DataDog/dd-trace-go/blob/00055c7bdeed864cd0008ba1216c3b508dce0874/contrib/google.golang.org/grpc/grpc.go#L69

@dfawley
Copy link
Member

dfawley commented Oct 24, 2023

Maybe a new call that returns an immutable view wrapper around the MD?

Something like that might be a good idea and not require a total rework of the metadata package (or a v2). Thanks!

@dfawley
Copy link
Member

dfawley commented Oct 27, 2023

PTAL @arvindbr8

Copy link
Member

@arvindbr8 arvindbr8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@arvindbr8 arvindbr8 merged commit d7ea67b into grpc:master Oct 30, 2023
14 checks passed
arvindbr8 pushed a commit to arvindbr8/grpc-go that referenced this pull request Nov 7, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 28, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Type: Performance Performance improvements (CPU, network, memory, etc)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants