Synthetic source should load fields sequentially #94003

Open
iverase opened this issue Feb 22, 2023 · 4 comments

Comments

@iverase
Contributor

iverase commented Feb 22, 2023

Synthetic source generally reads a document's values from doc values. It synthesises documents one at a time, reading a single doc value from each field for every document.

Doc values, on the other hand, are optimised for reading values sequentially. The access pattern of synthetic source is therefore problematic: it pollutes the local cache and can cause slow reads.

We should investigate the possibility of reading synthetic source fields sequentially to avoid cache issues.
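To make the two access patterns concrete, here is a minimal sketch using a toy in-memory column store. Everything in it (`ToyColumns`, `docAtATime`, `fieldAtATime`, the `columnSwitches` counter) is hypothetical and is not part of Lucene or Elasticsearch. The counter is only a rough proxy for how often a read jumps to a different field's data, not a measurement of real cache behaviour.

```java
import java.util.Arrays;

// Toy model only: these classes are hypothetical, not Lucene or Elasticsearch APIs.
class AccessPatternSketch {

    /** In-memory stand-in for per-field doc values, laid out as values[field][doc]. */
    static final class ToyColumns {
        final long[][] values;
        private int lastField = -1;
        int columnSwitches = 0; // reads that land on a different field than the previous read

        ToyColumns(long[][] values) {
            this.values = values;
        }

        long read(int field, int doc) {
            if (field != lastField) {
                columnSwitches++;
                lastField = field;
            }
            return values[field][doc];
        }
    }

    /** Doc-at-a-time: for each document, read one value from every field. */
    static long[][] docAtATime(ToyColumns cols, int[] docIds) {
        int fields = cols.values.length;
        long[][] rows = new long[docIds.length][fields];
        for (int i = 0; i < docIds.length; i++) {
            for (int f = 0; f < fields; f++) {
                rows[i][f] = cols.read(f, docIds[i]);
            }
        }
        return rows;
    }

    /** Field-at-a-time: for each field, read the values of all requested documents in order. */
    static long[][] fieldAtATime(ToyColumns cols, int[] docIds) {
        int fields = cols.values.length;
        long[][] rows = new long[docIds.length][fields];
        for (int f = 0; f < fields; f++) {
            for (int i = 0; i < docIds.length; i++) {
                rows[i][f] = cols.read(f, docIds[i]);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        long[][] data = { {1, 2, 3, 4}, {10, 20, 30, 40}, {100, 200, 300, 400} };
        int[] docIds = {0, 1, 2, 3};
        ToyColumns a = new ToyColumns(data);
        ToyColumns b = new ToyColumns(data);
        long[][] rowsA = docAtATime(a, docIds);
        long[][] rowsB = fieldAtATime(b, docIds);
        System.out.println("same rows: " + Arrays.deepEquals(rowsA, rowsB));
        System.out.println("doc-at-a-time column switches: " + a.columnSwitches);
        System.out.println("field-at-a-time column switches: " + b.columnSwitches);
    }
}
```

Both methods produce the same rows. With 3 fields and 4 documents, the doc-at-a-time loop switches fields 12 times, while the field-at-a-time loop switches 3 times, at the cost of buffering the intermediate rows.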

@iverase added the >enhancement and :StorageEngine/TSDB labels Feb 22, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@elasticsearchmachine added the Team:Analytics label Feb 22, 2023
@wchaparro removed the Team:Analytics label Jun 21, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@nik9000
Member

nik9000 commented Jun 24, 2024

Synthetic source has some ability to do that. The interface is:

DocValuesLoader docValuesLoader(LeafReader leafReader, int[] docIdsInLeaf) throws IOException;

Some of the doc values implementations read the values out of doc values into an array up front and then replay them when they are building the synthetic source. IIRC single-valued numbers work that way but multivalued ones don't.
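The "read into an array up front, then replay" approach can be sketched roughly as follows. This is not the Elasticsearch implementation. `PreloadedLoader`, `advanceToDoc` and `currentValue` are hypothetical names, a nullable `Long[]` stands in for a single-valued numeric doc values column, and `docIdsInLeaf` is assumed to be sorted in ascending order.

```java
import java.util.Arrays;

// Hypothetical sketch: none of these types are Lucene or Elasticsearch classes.
class PreloadReplaySketch {

    /** Loads the values for all requested docs in one forward pass, then replays them per doc. */
    static final class PreloadedLoader {
        private final int[] docIds;      // assumed sorted ascending
        private final boolean[] present; // whether the doc has a value for this field
        private final long[] values;     // preloaded values, aligned with docIds
        private int idx = -1;

        /** column[doc] is the field's value for doc, or null if the doc has no value. */
        PreloadedLoader(Long[] column, int[] docIdsInLeaf) {
            this.docIds = docIdsInLeaf.clone();
            this.present = new boolean[docIds.length];
            this.values = new long[docIds.length];
            // Single sequential pass over the column, in doc id order.
            for (int i = 0; i < docIds.length; i++) {
                Long v = column[docIds[i]];
                if (v != null) {
                    present[i] = true;
                    values[i] = v;
                }
            }
        }

        /** Replay step: positions on docId without touching the column again. */
        boolean advanceToDoc(int docId) {
            idx = Arrays.binarySearch(docIds, docId);
            return idx >= 0 && present[idx];
        }

        long currentValue() {
            if (idx < 0 || !present[idx]) {
                throw new IllegalStateException("no value for the current doc");
            }
            return values[idx];
        }
    }

    public static void main(String[] args) {
        Long[] column = {0L, 10L, 20L, null, 40L, 50L, 60L, 70L};
        PreloadedLoader loader = new PreloadedLoader(column, new int[] {1, 3, 7});
        for (int doc : new int[] {1, 3, 7}) {
            if (loader.advanceToDoc(doc)) {
                System.out.println("doc " + doc + " -> " + loader.currentValue());
            } else {
                System.out.println("doc " + doc + " has no value");
            }
        }
    }
}
```

All reads of the column happen in the constructor, so building each document only touches the preloaded arrays. A multivalued field would need a variable-length buffer per document, which this sketch does not attempt.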
