Support synthetic source together with ignore_malformed in histogram fields #109882
Pinging @elastic/es-storage-engine (Team:StorageEngine)
Hi @lkts, I've created a changelog YAML for you.
 * Typical use case is to gather field values from doc_values and append malformed values
 * stored in a different field in case of ignore_malformed being enabled.
 */
public class CompositeSyntheticFieldLoader implements SourceLoader.SyntheticFieldLoader {
I noticed after implementing this that it is very close to what ObjectMapper.SyntheticSourceFieldLoader does. Maybe we can unify some code later.
This is also an alternative to the current implementation of, e.g., SortedNumericDocValuesSyntheticFieldLoader, where handling of malformed values is implemented explicitly. That logic is repeated in multiple loaders that handle different doc values types. I didn't refactor that in this PR but wanted to gather some thoughts.
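To make the layering idea concrete, here is a minimal sketch of a loader that combines values from several layers (for example doc values first, then stored malformed values). It is not the code in this PR: `CompositeLoaderSketch`, `Layer`, and `compose` are hypothetical names, plain Java collections stand in for Lucene readers and `XContentBuilder`, and the rule "one value is written bare, several values are written as an array" is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class CompositeLoaderSketch {
    /** Hypothetical stand-in for one source of values (doc values, stored malformed values, ...). */
    public interface Layer {
        List<Object> values();
    }

    /**
     * Combines all layers in order.
     * Returns null when no layer has a value, the bare value when there is
     * exactly one, and a list when there are several (assumed behavior).
     */
    public static Object compose(List<Layer> layers) {
        List<Object> all = new ArrayList<>();
        for (Layer layer : layers) {
            all.addAll(layer.values());
        }
        if (all.isEmpty()) {
            return null; // nothing to write for this field
        }
        return all.size() == 1 ? all.get(0) : all;
    }
}
```

With this shape, each field mapper would only need to supply its own doc values layer, and the malformed-values layer could be shared instead of being reimplemented in every loader.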
+1. I think in a follow-up we can explore how to have a common base class for this class and ObjectMapper.SyntheticSourceFieldLoader.
@elasticmachine update branch
LGTM
private List<Object> values;

public MalformedValuesLayer(String fieldName) {
    this.fieldName = fieldName + "._ignore_malformed";
"._ignore_malformed" should be a const somewhere.
if (v instanceof BytesRef r) {
    XContentDataHelper.decodeAndWrite(b, r);
} else {
    b.value(v);
What's the use case for this one? I thought malformed values are always encoded.
If the value is, e.g., text, we skip encoding in some fields. This is for compatibility with existing code.
if (binaryValue == null) {
    return;
}
b.startObject();
Why was this changed from b.startObject(simpleName())?
This is because the composite loader writes the field name now. There may be malformed values, in which case the output is no longer a single object but an array that contains an object.
      id: 2
  - match:
      _source:
        latency: [{"values": [2.0], "counts": [2]}, {"values": [1.0], "counts": [1], "hello": "world"}, 123, 456, "fox"]
We lose the fact that [123, 456] arrived as a pair. Not a biggie, but I wonder if there's an easy way to preserve the array.
This is intentional; this is how it works everywhere.
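The observable effect discussed here can be sketched as a flattening step: a malformed value that was itself an array comes back as individual items in the synthetic source. The snippet below only illustrates that effect on plain Java lists. `FlattenSketch` and `flatten` are hypothetical names, and this is not how the mapper stores or reads malformed values.

```java
import java.util.ArrayList;
import java.util.List;

public class FlattenSketch {
    /** Recursively replaces nested lists with their elements, preserving order. */
    public static List<Object> flatten(List<?> input) {
        List<Object> out = new ArrayList<>();
        for (Object item : input) {
            if (item instanceof List) {
                out.addAll(flatten((List<?>) item)); // nested array: keep items, drop grouping
            } else {
                out.add(item);
            }
        }
        return out;
    }
}
```

Applied to an input shaped like the test above, a nested `[123, 456]` followed by `"fox"` would come back as `123, 456, "fox"`, which matches the expected `_source` in the YAML test.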
Nice, just a few minor ones.
Contributes to #106483.