Access aggregate_metric_double fields from runtime fields and scripts #96480

Open
salvatore-campagna opened this issue Jun 1, 2023 · 6 comments
Labels
>enhancement :StorageEngine/Downsampling Downsampling (replacement for rollups) - Turn fine-grained time-based data into coarser-grained data Team:StorageEngine

Comments

@salvatore-campagna
Contributor

salvatore-campagna commented Jun 1, 2023

Description

Customers need to process indices and/or data streams that contain both raw and downsampled data using ingest pipelines (configured, for instance, with index.default_pipeline and index.final_pipeline).

When an index is downsampled, the downsampling target indices have slightly different mappings and settings, which might prevent pipeline scripts or runtime fields from working correctly. Ideally, customers would have a mechanism that is as easy to use as possible and lets them process raw and downsampled data seamlessly with runtime fields and/or scripts, without having to care whether an index holds raw or downsampled data and without having to maintain multiple scripts (one for raw data and one for downsampled data).

Right now the main obstacle is accessing fields of type aggregate_metric_double, which the downsampling operation creates when it downsamples some metric fields. To start with, we need support for accessing such fields from pipeline scripts and runtime fields.

Related to #96478.

@salvatore-campagna salvatore-campagna added >enhancement :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data labels Jun 1, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jun 1, 2023
@salvatore-campagna
Contributor Author

Related to #88534

@salvatore-campagna
Contributor Author

salvatore-campagna commented Jun 1, 2023

Right now these fields can be accessed through the Fields API, as demonstrated by the following YAML test:

setup:
  - skip:
      version: " - 8.0.99"
      reason: introduced in 8.1.0

  - do:
      indices.create:
        index: test
        body:
          settings:
            index:
              mode: time_series
              routing_path: [ metricset ]
              time_series:
                start_time: 2021-04-28T00:00:00Z
                end_time: 2021-04-29T00:00:00Z
          mappings:
            properties:
              "@timestamp":
                type: date
              metricset:
                type: keyword
                time_series_dimension: true
              aggregate:
                type: aggregate_metric_double
                metrics: [ min, max, sum, value_count ]
                default_metric: sum

  - do:
      bulk:
        refresh: true
        index: test
        body:
          - '{"index": {}}'
          - '{"@timestamp": "2021-04-28T18:50:00.000Z", "metricset": "pod", "counter": 1, "gauge": 100, "aggregate": { "min": 1, "max": 10, "sum": 23, "value_count": 5 }}'

---
"painless on aggregate_metric_double field":
  - do:
      search:
        index: test
        body:
          _source: false
          fields: [ "aggregate_sum" ]
          runtime_mappings:
            aggregate_sum:
              type: double
              script:
                source: "emit(doc['aggregate'].value)"
                lang: painless
          query:
            match_all: {}

  - match: { hits.total.value: 1 }
  - match: { hits.hits.0.fields.aggregate_sum.0: 23.0 }

Access to one of the aggregate values is provided by taking advantage of default_metric in the mapping.
Note that all metrics are of type double.
Note also that if the field has a single metric, that metric is used without having to specify a default one.
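
To make the resolution rule in the notes above easier to reason about, here is a rough Python sketch of which sub-metric a script would see when it reads doc['aggregate'].value. This is only a model of the behavior described here, not Elasticsearch code: the function name resolve_script_value and its parameters are hypothetical, and raising an error when several metrics are mapped without a default is an assumption inferred from the last note.

```python
from typing import Mapping, Optional, Sequence


def resolve_script_value(
    metrics: Sequence[str],
    stored: Mapping[str, float],
    default_metric: Optional[str] = None,
) -> float:
    """Model of the sub-metric exposed as the field's single script value.

    Hypothetical helper: it mirrors the notes in this comment, not the
    actual Elasticsearch implementation.
    """
    if default_metric is None:
        # A single mapped metric is used without an explicit default.
        if len(metrics) != 1:
            # Assumption: with several metrics a default must be specified.
            raise ValueError("default_metric is needed when several metrics are mapped")
        default_metric = metrics[0]
    if default_metric not in metrics:
        raise ValueError(f"{default_metric!r} is not one of the mapped metrics")
    # All metrics are of type double.
    return float(stored[default_metric])


# Same mapping and document as in the YAML test above.
mapped = ["min", "max", "sum", "value_count"]
doc = {"min": 1, "max": 10, "sum": 23, "value_count": 5}
print(resolve_script_value(mapped, doc, default_metric="sum"))  # 23.0

# Single-metric field: no default_metric required.
print(resolve_script_value(["max"], {"max": 10}))  # 10.0
```

In this model the other sub-metrics (min, max, value_count in the test above) are never reachable from the script, which is the gap this issue is about.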

@salvatore-campagna
Contributor Author

My understanding is that, at the moment, we cannot expose all metrics under the aggregate metric field, because accessing the value returns an array and we have no way to index into that array with a keyword such as min or max.

@wchaparro
Member

@giladgal FYI

@wchaparro wchaparro added :StorageEngine/Downsampling Downsampling (replacement for rollups) - Turn fine-grained time-based data into coarser-grained data and removed :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data labels Jun 23, 2023
@wchaparro wchaparro removed the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jun 21, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)
