Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add metrics to auth #7840

Merged
merged 7 commits into from
Jun 9, 2024
Merged

Add metrics to auth #7840

merged 7 commits into from
Jun 9, 2024

Conversation

arielshaqed
Copy link
Contributor

  • Auto-generate a service wrapper for monitoring
  • Use that wrapper to monitor auth

Closes #7789.

How to review

  • Review auth/monitored.go and cmd/lakefs changes.
  • Generate and browse through wrapper.gen.go. It's not important to read all 50 funcs, only 1 func with an error return and 1 func without, to understand what the wrapper does.
  • Now review tools/wrapgen/main.go, which actually generates that file.

How we generate the code

Most of wrapper.gen.go is a text/template in backquotes. It tries to generate a nice wrapper for every method that it finds in the interface. The rest of the file si some driver code that extracts information from the Go parser.

Why not write it directly?

There are currently 43 methods in wrapper.gen.go. I started by coding these directly, but the code rapidly became a mess: there are 2 kinds of functions, those that perform an action and return an error and helpers that only compute. Every function takes different args, and needs a snake-cased operation code. Copy-paste failed me, and my regexp-fu was not strong enough. And what happens when I need to change the parameters of the "Observe" func?

Now think what happens when someone adds a method to the interface: they and the reviewer must ensure that the wrapper is correctly updated.

Luckily Go is yucky enough that it exposes its parser, and we can use that to write a reasonable generator.

Questions

  1. Do we want to commit wrapper.gen.go? By now I don't care either way, and will do whatever the reviewers say.
  2. Do we want to put the wrapper generator in tools/ or elsewhere? In future it will likely move out to another module, of course, but I'm asking what to do now.
  3. Would the text template be more readable in a separate file?

In future can and will take this to a separate package.
TODO: checks-validator
- Generate monitoring wrapper
- Wrap generated auth service inside that wrapper
@arielshaqed arielshaqed added performance area/auth IAM, authorization, authentication, audit, AAA, and integrations with all those include-changelog PR description should be included in next release changelog go Pull requests that update Go code feature labels Jun 6, 2024
It's a quick step, and is now needed to build a Docker image during the
build for Esti.  (Unfortunately we ignore the Makefile here, so changes to
the Makefile need to be copied here, too.)
@arielshaqed arielshaqed force-pushed the feature/7789-auth-metrics branch 3 times, most recently from 81ad21d to c56f7bb Compare June 9, 2024 09:46
@Isan-Rivkin
Copy link
Contributor

@arielshaqed

  1. Do we want to commit wrapper.gen.go? By now I don't care either way, and will do whatever the reviewers say.

I think that we should - unless the build is inconsistent between runs and will always show changes even when there are none. For example if the builds add a timestamp every time or something.

  1. Do we want to put the wrapper generator in tools/ or elsewhere? In future it will likely move out to another module, of course, but I'm asking what to do now.

i like where it's now.

  1. Would the text template be more readable in a separate file?

The current way is great

Copy link
Contributor

@Isan-Rivkin Isan-Rivkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arielshaqed this overall looks good, but I have not finished yet. Just added here minor question I have will continue reviewing a bit later today 🙏

pkg/auth/service.go Show resolved Hide resolved
tools/wrapgen/main.go Show resolved Hide resolved
Copy link
Contributor Author

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

I still need to remove wrinkles from running Esti. I will promote it to non-draft once I figure that out.

Also: I did not commit pkg/auth/wrapper.gen.go because we do not commit pkg/auth/client.gen.go. I will defer committing it until we have a better decision on what we want to do here.

pkg/auth/service.go Show resolved Hide resolved
tools/wrapgen/main.go Show resolved Hide resolved
Copy link
Contributor

@Isan-Rivkin Isan-Rivkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks wonderful.
Single non-blocking request but since still in Draft mode not approved since I expect changes (?)

pkg/auth/service.go Show resolved Hide resolved
Copy link

github-actions bot commented Jun 9, 2024

E2E Test Results - DynamoDB Local - Local Block Adapter

13 passed

- Depend on auth/ package to generate code
- Always compress generated code

  Previously it _said_ it was compressed, but actually wasn't!
- Add some better debugging for what generated code was fetched
- Do the same in other Spark and other system test deployments
@arielshaqed
Copy link
Contributor Author

Thanks; PTAL!

I fixed some issues with running Esti, some in this code and some in weird caching that Esti performs on generated codes. Other than that, this is the same code you saw before.

Why no tests?

I would like to delay adding tests because they are really painful to write. A Go test cannot generate its needed files before running, because go generate needs to be a separate command. So I am not really sure how best to run this kind of half-unit / half-system test, and would like some more time.

In the meantime I believe that running this code and seeing metrics is the best that I can easily do. It is quite comprehensive, and I can show results now. On my machine, after a few web UI accesses, it looks like this:

❯ http get localhost:8000/metrics | grep auth_request_duration_seconds
# HELP auth_request_duration_seconds request durations for auth
# TYPE auth_request_duration_seconds histogram
auth_request_duration_seconds_bucket{operation="authorize",success="success",le="0.001"} 9
auth_request_duration_seconds_bucket{operation="authorize",success="success",le="0.002"} 9
auth_request_duration_seconds_bucket{operation="authorize",success="success",le="0.005"} 9
auth_request_duration_seconds_bucket{operation="authorize",success="success",le="0.01"} 9
auth_request_duration_seconds_bucket{operation="authorize",success="success",le="0.02"} 9
auth_request_duration_seconds_bucket{operation="authorize",success="success",le="0.05"} 9
auth_request_duration_seconds_bucket{operation="authorize",success="success",le="0.1"} 9
auth_request_duration_seconds_bucket{operation="authorize",success="success",le="0.2"} 9
auth_request_duration_seconds_bucket{operation="authorize",success="success",le="0.5"} 9
auth_request_duration_seconds_bucket{operation="authorize",success="success",le="1"} 9
auth_request_duration_seconds_bucket{operation="authorize",success="success",le="2"} 9
auth_request_duration_seconds_bucket{operation="authorize",success="success",le="5"} 9
auth_request_duration_seconds_bucket{operation="authorize",success="success",le="10"} 9
auth_request_duration_seconds_bucket{operation="authorize",success="success",le="30"} 9
auth_request_duration_seconds_bucket{operation="authorize",success="success",le="60"} 9
auth_request_duration_seconds_bucket{operation="authorize",success="success",le="+Inf"} 9
auth_request_duration_seconds_sum{operation="authorize",success="success"} 0.001700304
auth_request_duration_seconds_count{operation="authorize",success="success"} 9
auth_request_duration_seconds_bucket{operation="get_credentials",success="success",le="0.001"} 0
auth_request_duration_seconds_bucket{operation="get_credentials",success="success",le="0.002"} 0
auth_request_duration_seconds_bucket{operation="get_credentials",success="success",le="0.005"} 1
auth_request_duration_seconds_bucket{operation="get_credentials",success="success",le="0.01"} 1
auth_request_duration_seconds_bucket{operation="get_credentials",success="success",le="0.02"} 1
auth_request_duration_seconds_bucket{operation="get_credentials",success="success",le="0.05"} 1
auth_request_duration_seconds_bucket{operation="get_credentials",success="success",le="0.1"} 1
auth_request_duration_seconds_bucket{operation="get_credentials",success="success",le="0.2"} 1
auth_request_duration_seconds_bucket{operation="get_credentials",success="success",le="0.5"} 1
auth_request_duration_seconds_bucket{operation="get_credentials",success="success",le="1"} 1
auth_request_duration_seconds_bucket{operation="get_credentials",success="success",le="2"} 1
auth_request_duration_seconds_bucket{operation="get_credentials",success="success",le="5"} 1
auth_request_duration_seconds_bucket{operation="get_credentials",success="success",le="10"} 1
auth_request_duration_seconds_bucket{operation="get_credentials",success="success",le="30"} 1
auth_request_duration_seconds_bucket{operation="get_credentials",success="success",le="60"} 1
auth_request_duration_seconds_bucket{operation="get_credentials",success="success",le="+Inf"} 1
auth_request_duration_seconds_sum{operation="get_credentials",success="success"} 0.003426277
auth_request_duration_seconds_count{operation="get_credentials",success="success"} 1
auth_request_duration_seconds_bucket{operation="get_user",success="success",le="0.001"} 11
auth_request_duration_seconds_bucket{operation="get_user",success="success",le="0.002"} 11
auth_request_duration_seconds_bucket{operation="get_user",success="success",le="0.005"} 11
auth_request_duration_seconds_bucket{operation="get_user",success="success",le="0.01"} 11
auth_request_duration_seconds_bucket{operation="get_user",success="success",le="0.02"} 11
auth_request_duration_seconds_bucket{operation="get_user",success="success",le="0.05"} 11
auth_request_duration_seconds_bucket{operation="get_user",success="success",le="0.1"} 11
auth_request_duration_seconds_bucket{operation="get_user",success="success",le="0.2"} 11
auth_request_duration_seconds_bucket{operation="get_user",success="success",le="0.5"} 11
auth_request_duration_seconds_bucket{operation="get_user",success="success",le="1"} 11
auth_request_duration_seconds_bucket{operation="get_user",success="success",le="2"} 11
auth_request_duration_seconds_bucket{operation="get_user",success="success",le="5"} 11
auth_request_duration_seconds_bucket{operation="get_user",success="success",le="10"} 11
auth_request_duration_seconds_bucket{operation="get_user",success="success",le="30"} 11
auth_request_duration_seconds_bucket{operation="get_user",success="success",le="60"} 11
auth_request_duration_seconds_bucket{operation="get_user",success="success",le="+Inf"} 11
auth_request_duration_seconds_sum{operation="get_user",success="success"} 9.617400000000002e-05
auth_request_duration_seconds_count{operation="get_user",success="success"} 11

@arielshaqed arielshaqed marked this pull request as ready for review June 9, 2024 12:39
@arielshaqed
Copy link
Contributor Author

Thanks!

@arielshaqed arielshaqed merged commit 4775777 into master Jun 9, 2024
35 checks passed
@arielshaqed arielshaqed deleted the feature/7789-auth-metrics branch June 9, 2024 12:40
arielshaqed added a commit that referenced this pull request Jun 9, 2024
Compatibility tests are broken after #7840 - fixed generated.tar.gz to be
compressed, but forgot to tell compatibility-tests about it.
arielshaqed added a commit that referenced this pull request Jun 9, 2024
Compatibility tests are broken after #7840 - fixed generated.tar.gz to be
compressed, but forgot to tell compatibility-tests about it.
arielshaqed added a commit that referenced this pull request Jun 9, 2024
Compatibility tests are broken after #7840 - fixed generated.tar.gz to be
compressed, but forgot to tell compatibility-tests about it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/auth IAM, authorization, authentication, audit, AAA, and integrations with all those feature go Pull requests that update Go code include-changelog PR description should be included in next release changelog performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

authAPI metrics in lakeFS
2 participants