Synthetic source should not load fields that are not required #94001

Open
iverase opened this issue Feb 22, 2023 · 3 comments

Comments

@iverase
Contributor

iverase commented Feb 22, 2023

The current implementation synthesises the full document regardless of whether each value is required in the final response. This is unnecessary work that can be very expensive on slow disks.

Consider the following example:

public void testSyntheticSourceAllFields() throws Exception {

        XContentBuilder indexSpec = XContentBuilder.builder(XContentType.JSON.xContent()).startObject();
        {
            indexSpec.startObject("_source").field("mode", "synthetic").endObject();
            indexSpec.startObject("properties")
                .startObject("kwd1").field("type", "keyword").endObject()
                .startObject("kwd2").field("type", "keyword").endObject()
                .startObject("kwd3").field("type", "keyword").endObject()
                .startObject("kwd4").field("type", "keyword").endObject()
            .endObject();
        }
        assertAcked(admin().indices().prepareCreate("synthetic").setMapping(indexSpec.endObject()).get());
        indexRandom(true, client().prepareIndex("synthetic").setSource("kwd1", "val1", "kwd2", "val2", "kwd3", "val3", "kwd4", "val4"));
        SearchResponse response = client().prepareSearch("synthetic").addFetchField("kwd3").setFetchSource(false).get();
    }

The query does not ask for the source and requests the value of only one field. Even so, we synthesise the source with all the fields, accessing four doc values. We should be able to read just the requested value in this case.
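To make the expected saving concrete, here is a minimal sketch in plain Java. It is not Elasticsearch code: `CountingDocValues` and `SyntheticSourceSketch` are hypothetical names, and the "doc values" are just an in-memory map with a read counter. It only illustrates the difference between reading every mapped field and reading just the fields the response needs.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model: none of these names exist in Elasticsearch.
class CountingDocValues {
    private final Map<String, String> values = new LinkedHashMap<>();
    int reads = 0; // number of simulated doc-values lookups

    void put(String field, String value) {
        values.put(field, value);
    }

    String read(String field) {
        reads++; // each call stands in for one doc-values access
        return values.get(field);
    }

    Set<String> fields() {
        return values.keySet();
    }
}

class SyntheticSourceSketch {
    // Behaviour described in this issue: every mapped field is read.
    static Map<String, String> synthesizeAll(CountingDocValues dv) {
        Map<String, String> source = new LinkedHashMap<>();
        for (String field : dv.fields()) {
            source.put(field, dv.read(field));
        }
        return source;
    }

    // Behaviour asked for: read only the fields the response needs.
    static Map<String, String> synthesizeRequired(CountingDocValues dv, Set<String> required) {
        Map<String, String> source = new LinkedHashMap<>();
        for (String field : dv.fields()) {
            if (required.contains(field)) {
                source.put(field, dv.read(field));
            }
        }
        return source;
    }

    public static void main(String[] args) {
        CountingDocValues dv = new CountingDocValues();
        dv.put("kwd1", "val1");
        dv.put("kwd2", "val2");
        dv.put("kwd3", "val3");
        dv.put("kwd4", "val4");

        synthesizeAll(dv);
        System.out.println("full: reads=" + dv.reads); // prints "full: reads=4"

        dv.reads = 0;
        Map<String, String> partial = synthesizeRequired(dv, Set.of("kwd3"));
        System.out.println("selective: reads=" + dv.reads + " source=" + partial);
        // prints "selective: reads=1 source={kwd3=val3}"
    }
}
```

In this toy model the selective path does one lookup instead of four for the `kwd3` request above; how the real loaders would be told which fields are required is left open here.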

@iverase iverase added :StorageEngine/TSDB You know, for Metrics >enhancement labels Feb 22, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Feb 22, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@martijnvg
Member

Note that the source is also completely synthesized when source filtering is active. For example, in this case:

public void testSyntheticSourcePartialSource() throws Exception {
        XContentBuilder indexSpec = XContentBuilder.builder(XContentType.JSON.xContent()).startObject();
        //indexSpec.startObject("mappings");
        {
            indexSpec.startObject("_source").field("mode", "synthetic").endObject();
            indexSpec.startObject("properties")
                .startObject("kwd1").field("type", "keyword").endObject()
                .startObject("kwd2").field("type", "keyword").endObject()
                .startObject("kwd3").field("type", "keyword").endObject()
                .startObject("kwd4").field("type", "keyword").endObject()
                .endObject();
        }
        assertAcked(admin().indices().prepareCreate("synthetic").setMapping(indexSpec.endObject()).get());
        indexRandom(true, client().prepareIndex("synthetic").setSource("kwd1", "kwd1", "kwd2", "kwd2", "kwd3", "kwd3", "kwd4", "kwd4"));

        SearchResponse response = client().prepareSearch("synthetic").setFetchSource("kwd3", null).get();
        logger.info("response=" + response);
    }

Ideally, only the fields that match the source filter should be loaded.
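As a rough illustration of that idea, the sketch below works out which mapped fields would need to be loaded for a given set of include and exclude patterns. It is plain Java with hypothetical names (`SourceFilterSketch`, `fieldsToLoad`), not the Elasticsearch filtering code, and it assumes a simplified pattern syntax where only `*` is a wildcard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: not part of Elasticsearch.
class SourceFilterSketch {
    // Turn a simple wildcard pattern such as "kwd*" into a regex.
    // Assumption: '*' is the only special character in a pattern.
    static Pattern toRegex(String glob) {
        String regex = Pattern.quote(glob).replace("*", "\\E.*\\Q");
        return Pattern.compile(regex);
    }

    static boolean matchesAny(String field, String[] patterns) {
        if (patterns == null) {
            return false;
        }
        for (String pattern : patterns) {
            if (toRegex(pattern).matcher(field).matches()) {
                return true;
            }
        }
        return false;
    }

    // A field is loaded if it matches an include (or there are no includes)
    // and it does not match any exclude.
    static List<String> fieldsToLoad(List<String> mappedFields, String[] includes, String[] excludes) {
        boolean noIncludes = includes == null || includes.length == 0;
        List<String> toLoad = new ArrayList<>();
        for (String field : mappedFields) {
            boolean included = noIncludes || matchesAny(field, includes);
            if (included && !matchesAny(field, excludes)) {
                toLoad.add(field);
            }
        }
        return toLoad;
    }

    public static void main(String[] args) {
        List<String> mapped = List.of("kwd1", "kwd2", "kwd3", "kwd4");
        // Mirrors setFetchSource("kwd3", null) from the test above.
        System.out.println(fieldsToLoad(mapped, new String[] { "kwd3" }, null)); // prints "[kwd3]"
    }
}
```

With the filter from the test above, only `kwd3` ends up in the list, so only one doc value would have to be read instead of four.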

@wchaparro wchaparro removed the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jun 21, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)
