-
Notifications
You must be signed in to change notification settings - Fork 24.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add metric to count search responses #110070
Conversation
Pinging @elastic/es-search (Team:Search) |
meterRegistry.registerLongCounter( | ||
RESPONSE_COUNT_TOTAL_COUNTER_NAME, | ||
"The cumulative total of search responses with an attribute to describe" | ||
+ "success, partial failure, or failure, expressed as a counter and an attribute", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think you need a space after "describe" or before "success"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Asking to learn: what does "expressed as a counter and an attribute" mean? How is it both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Attributes can be attached to counters in this case which afaik will add additional fields for each document with the count of each attribute as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the description a bit based on this feedback.
public void incrementResponseCount(ResponseCountTotalStatus responseCountTotalStatus) { | ||
responseCountTotalCounter.incrementBy(1L, Map.of(RESPONSE_COUNT_TOTAL_STATUS_ATTRIBUTE_NAME, responseCountTotalStatus)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what's causing the following build failure:
Caused by: java.lang.IllegalArgumentException: attributes do not support value type of [org.elasticsearch.rest.action.search.SearchResponseMetrics.ResponseCountTotalStatus]
» at [email protected]/org.elasticsearch.telemetry.apm.internal.metrics.OtelHelper.lambda$fromMap$0(OtelHelper.java:50)
You could change this to:
responseCountTotalCounter.incrementBy(1L, Map.of(RESPONSE_COUNT_TOTAL_STATUS_ATTRIBUTE_NAME, responseCountTotalStatus.name()));
}
Or,
public enum ResponseCountTotalStatus {
SUCCESS("success"),
PARTIAL_FAILURE("partial_failure"),
FAILURE("failure");
private final String value;
ResponseCountTotalStatus(String value) {
this.value = value;
}
public String value() {
return value
}
}
...
responseCountTotalCounter.incrementBy(1L, Map.of(RESPONSE_COUNT_TOTAL_STATUS_ATTRIBUTE_NAME, responseCountTotalStatus.value()));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After chatting with @original-brownbear my concerns around the assert triggered by this error have been alleviated. My understanding is the assert is related to ensuring we don't except any exceptions to be triggered as part of onResponse
except for unusual cases. I will fix this and then expect the tests will pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to your suggestion of value
but used displayName
instead to not overload the term value
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Non-blocking: @quux00 Does https://github.com/jdconrad/elasticsearch/blob/c252b40ea63aa764c4a88e12d37f1e51af9af768/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/termsenum/action/TransportTermsEnumAction.java#L158 need to be instrumented as well?
} | ||
if ((f.status().getStatus() >= 500 || causeHas500Status) | ||
&& ExceptionsHelper.isNodeOrShardUnavailableTypeException(f.getCause()) == false) { | ||
logger.warn("TransportSearchAction shard failure (partial results response)", f); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was done in the context of serverless, where we don't log or alert on "user errors" (4xx) (other than 429s, but that doesn't apply here). The goal of the logging epic was to get a sense of how often 5xx "system" type errors are occurring when users get partial results.
Yeah, probably. We could bring that to an es search foundations team meeting for discussion and follow on ticket. |
This adds a metric to count the total number of search responses generated by the search action and the scroll search action. This uses the key
es.search_response.response_count.total
. Each response also has an attribute to specify whether the response was a success, a partial failure, or a failure.