Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MGDSTRM-3724 Add the status code to the keycloak_request_duration_bucket metric #108

Merged
merged 4 commits into from
Sep 6, 2021

Conversation

CathalOConnorRH
Copy link
Contributor

@CathalOConnorRH CathalOConnorRH commented Jul 26, 2021

Motivation

https://issues.redhat.com/browse/MGDSTRM-3724
https://issues.redhat.com/browse/MGDSTRM-4318

What

Add status code and uri to the keycloak_request_duration_bucket metric

Why

This will allow filtering on specific statuses
Currently no status are returned in the metric
keycloak_request_duration_bucket{method="POST",resource="realms,realms/master/protocol/openid-connect",le="50.0",} 0.0 keycloak_request_duration_bucket{method="POST",resource="realms,realms/master/protocol/openid-connect",le="100.0",} 0.0 keycloak_request_duration_bucket{method="POST",resource="realms,realms/master/protocol/openid-connect",le="250.0",} 0.0 keycloak_request_duration_bucket{method="POST",resource="realms,realms/master/protocol/openid-connect",le="500.0",} 0.0 keycloak_request_duration_bucket{method="POST",resource="realms,realms/master/protocol/openid-connect",le="1000.0",} 1.0 keycloak_request_duration_bucket{method="POST",resource="realms,realms/master/protocol/openid-connect",le="2000.0",} 1.0 keycloak_request_duration_bucket{method="POST",resource="realms,realms/master/protocol/openid-connect",le="10000.0",} 1.0 keycloak_request_duration_bucket{method="POST",resource="realms,realms/master/protocol/openid-connect",le="30000.0",} 1.0 keycloak_request_duration_bucket{method="POST",resource="realms,realms/master/protocol/openid-connect",le="+Inf",} 1.0 keycloak_request_duration_count{method="POST",resource="realms,realms/master/protocol/openid-connect",} 1.0 keycloak_request_duration_sum{method="POST",resource="realms,realms/master/protocol/openid-connect",} 625.0

How

added status code and uri to the to the bucket output.

Verification Steps

Option 1

  1. Build the code
  2. Copy the artifact into $KC/standalone/deployments
  3. Make sure the metrics logger is turned on
  4. Play with Keycloak a bit
  5. Look at the following url: "$KC/auth/realms/master/metrics"

Option 2

update keycloak cr to use the jar built and released from my fork
https://github.com/CathalOConnorRH/keycloak-metrics-spi/releases/tag/v2.4.1-beta

Metrics should look similar to
keycloak_request_duration_bucket{code="200",method="POST",resource="realms,realms/master/protocol/openid-connect",uri="realms/master/protocol/openid-connect/token",le="50.0",} 0.0 keycloak_request_duration_bucket{code="200",method="POST",resource="realms,realms/master/protocol/openid-connect",uri="realms/master/protocol/openid-connect/token",le="100.0",} 0.0 keycloak_request_duration_bucket{code="200",method="POST",resource="realms,realms/master/protocol/openid-connect",uri="realms/master/protocol/openid-connect/token",le="250.0",} 1.0 keycloak_request_duration_bucket{code="200",method="POST",resource="realms,realms/master/protocol/openid-connect",uri="realms/master/protocol/openid-connect/token",le="500.0",} 1.0 keycloak_request_duration_bucket{code="200",method="POST",resource="realms,realms/master/protocol/openid-connect",uri="realms/master/protocol/openid-connect/token",le="1000.0",} 1.0 keycloak_request_duration_bucket{code="200",method="POST",resource="realms,realms/master/protocol/openid-connect",uri="realms/master/protocol/openid-connect/token",le="2000.0",} 1.0 keycloak_request_duration_bucket{code="200",method="POST",resource="realms,realms/master/protocol/openid-connect",uri="realms/master/protocol/openid-connect/token",le="10000.0",} 1.0 keycloak_request_duration_bucket{code="200",method="POST",resource="realms,realms/master/protocol/openid-connect",uri="realms/master/protocol/openid-connect/token",le="30000.0",} 1.0 keycloak_request_duration_bucket{code="200",method="POST",resource="realms,realms/master/protocol/openid-connect",uri="realms/master/protocol/openid-connect/token",le="+Inf",} 1.0 keycloak_request_duration_count{code="200",method="POST",resource="realms,realms/master/protocol/openid-connect",uri="realms/master/protocol/openid-connect/token",} 1.0

Checklist:

  • Code has been tested locally by PR requester
  • Changes have been successfully verified by another team member

Progress

  • Finished task
  • TODO

Additional Notes

PS.: Add images and/or .gifs to illustrate what was changed if this pull request modifies the appearance/output of something presented to the users.

@CathalOConnorRH CathalOConnorRH force-pushed the MGDSTRM-3724 branch 3 times, most recently from ced6587 to c982927 Compare August 6, 2021 11:24
@@ -265,28 +265,28 @@ public void shouldCorrectlyRecordGenericAdminEvents() throws IOException {

@Test
public void shouldCorrectlyRecordResponseDurations() throws IOException {
PrometheusExporter.instance().recordRequestDuration(5, "GET", "admin,admin/serverinfo");
PrometheusExporter.instance().recordRequestDuration(200, 5, "GET", "admin,admin/serverinfo", "auth/realm");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A doubt : should the URI be only "auth/realm" or it can be other paths as well? (since all the tests use this , wanted to understand)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the auth/realm URI is just an example as part of the test, Any URI path can be used

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok got it. thanks!

Copy link

@maleck13 maleck13 Aug 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to wait for @pb82. As I understand it just using the RAW URI caused issues in the past especially when there were unique ids in the path as it created too many timeseries. Ideally these URIs would be translated into something like realms/{realm}/protocol/{protocol}/token however that could be a lot of effort so perhaps better to choose some important APIs and just covers those initially.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The openid-connect ones would be important ones along with service account mgmt

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CathalOConnorRH the raw URL should not be added as a label, this is what caused #64 in the past

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CathalOConnorRH but it looks like you're already using the resource extractor instead of the raw URL. That's better, but keep in mind that every combination of labels leads to a new exported histogram (and that means one additional mertric per bucket plus sum and total). So if we keep adding labels to histograms the amount of metrics will grow (and has grown to huge numbers in the past).

@Rajagopalan-Ranganathan
Copy link

/lgtm

@akoserwal
Copy link

@pb82 Would like to take a look as well?

@akoserwal
Copy link

LGTM

@pb82
Copy link
Contributor

pb82 commented Aug 9, 2021

@CathalOConnorRH if it is absolutely required to add those labels to the histogram, can we make that opt-in behind a flag?

@CathalOConnorRH
Copy link
Contributor Author

@pb82 I've updated the ResourceExtractor to include a URI_METRICS_ENABLED boolean and a check for /token URI.
an example of the metrics output is attached

Let me know what you think

KeycloakMetrics.txt

@CathalOConnorRH
Copy link
Contributor Author

Hey @pb82 Have you had any thoughts on the above changes ?

@pb82
Copy link
Contributor

pb82 commented Aug 17, 2021

thanks @CathalOConnorRH I'll give it a final try and then merge

@Rajagopalan-Ranganathan

@pb82 did you find time to try it out? Just curious about the status. Thanks!

* @return The resource uri.
*/
static String getURI(UriInfo uriInfo) {
if (URI_METRICS_ENABLED) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it make sense here to replace certain parts of the URI with a place holder to reduce uniqueness? For example replace /realm/master with /realm/{realm} for example. Looking for ways to reduce uniqueness while still providing the needed insight

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's possible, We could do another flag to enable more detailed realm output if needed ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CathalOConnorRH if we add that, we can also add the change were we remove the labels completely if URI_METRICS_ENABLED is not set (at the moment they are put on all metrics with an empty value).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, I'll work on both of those changes today

@@ -23,6 +23,8 @@
private static final String METRICS_REQUEST_TIMESTAMP = "metrics.requestTimestamp";
private static final MetricsFilter INSTANCE = new MetricsFilter();

private static final boolean URI_METRICS_ENABLED = Boolean.getBoolean("URI_METRICS_ENABLED");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a JAva guy, so trying to understand would it default to false, in case the variable is not defined?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according to the docs:

Returns true if and only if the system property named by the argument exists and is equal to the string "true". (Beginning with version 1.0.2 of the JavaTM platform, the test of this string is case insensitive.) A system property is accessible through getProperty, a method defined by the System class.

If there is no property with the specified name, or if the specified name is empty or null, then false is returned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, if the value is set and set to true it will return true, otherwise it returns false
https://www.tutorialspoint.com/java/lang/boolean_getboolean.htm

@CathalOConnorRH
Copy link
Contributor Author

@pb82 I've implemented the detailed output and removed the uri when it's not enabled.
Some sample output of each of the three scenarios are attached.
Let me know what you think
URIEnabledURIDetailed.txt
URIDisabled.txt
URIEnabled.txt

@Rajagopalan-Ranganathan
Copy link

@pb82 We need this change in MAS SSO soon, to build some metrics - dashboards and also to investigate and get precise information regarding our load (which end points are used most). Can you take a look and close this at the earliest?

@pb82 pb82 merged commit 4dd6b7b into aerogear:master Sep 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants