-
Notifications
You must be signed in to change notification settings - Fork 24.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ESQL] Make query wrapped by SingleValueQuery cacheable #110116
Conversation
Pinging @elastic/es-analytical-engine (Team:Analytics) |
Hi @iverase, I've created a changelog YAML for you. |
|
||
@Override | ||
public Scorer get(long leadCost) { | ||
TwoPhaseIterator iterator = new TwoPhaseIterator(DocIdSetIterator.all(maxDoc)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can do something better here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hand rolling each of the subclasses of these because I was worried the indirection from the Predicate would be expensive so deep in the hot path.
For doing something better, do you mean the DISI.all
call? I imagine it's ok because this is always going to be used with another query that's going to lead iteration naturally. I think the worst case scenario is when only a few docs have multiple values and the other query matches almost everything. I feel like we can live with it.
@elasticmachine update branch |
|
||
@Override | ||
public Scorer get(long leadCost) { | ||
TwoPhaseIterator iterator = new TwoPhaseIterator(DocIdSetIterator.all(maxDoc)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hand rolling each of the subclasses of these because I was worried the indirection from the Predicate would be expensive so deep in the hot path.
For doing something better, do you mean the DISI.all
call? I imagine it's ok because this is always going to be used with another query that's going to lead iteration naturally. I think the worst case scenario is when only a few docs have multiple values and the other query matches almost everything. I feel like we can live with it.
assertThat(builder.stats().ordinalsMultiNoApprox(), equalTo(0)); | ||
assertThat(builder.stats().ordinalsMultiApprox(), equalTo(0)); | ||
assertThat(builder.stats().bytesApprox(), equalTo(0)); | ||
assertThat(builder.stats().bytesNoApprox(), equalTo(0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the stats would be useful in debugging slow queries. It's certainly nice to know things like "this rewrote itself to match all in 7/8 segments".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I don't think it's a blocker.
@elasticmachine update branch |
This commit simplifies SingleValueQuery by introducing a new lucene query called SingleValueMatchQuery. This query iterates over doc values and return only the fields that are single-valued.
This commit simplifies SingleValueQuery by introducing a new lucene query called SingleValueMatchQuery. This query iterates over doc values and return only the fields that are single-valued. Then SingleValueQuery can generate a lucene query which is a boolean query with two filters, one with the original query and the other with a SingleValueMatchQuery.