Provide ability for additional "trip" meter alongside the existing "odometer" for cumulative stats metrics #91059

Open
geekpete opened this issue Oct 21, 2022 · 2 comments
Labels
:Data Management/Stats Statistics tracking and retrieval APIs >enhancement Team:Data Management Meta label for data/management team

Comments

@geekpete
Member

geekpete commented Oct 21, 2022

Description

Working out whether some counted failures or events are recent currently takes either multiple polls to compute a difference, or a JVM restart to zero out all the counters.
It's hard to tell initially whether a high count is because one node has a longer uptime or because that node is doing something different from the others.
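
To illustrate the multi-poll workaround, here is a rough sketch of diffing two polls. The `stats_delta` helper is hypothetical (not part of Elasticsearch or any client), the two dicts stand in for one processor's `stats` object from two successive polls, and the second poll's values are made up for illustration.

```python
# Hypothetical helper: not part of Elasticsearch or any client library.
COUNTER_FIELDS = ("count", "time_in_millis", "failed")


def stats_delta(earlier, later):
    """Return how much each cumulative counter grew between two polls."""
    delta = {}
    for field in COUNTER_FIELDS:
        diff = later[field] - earlier[field]
        # A negative difference means the counters were zeroed in between
        # (for example by a restart), so the later value is the whole delta.
        delta[field] = later[field] if diff < 0 else diff
    return delta


# First poll: sample values for one processor's stats.
first_poll = {"count": 27790210822, "time_in_millis": 195188912,
              "current": 0, "failed": 31031}
# Second poll: made-up values, for illustration only.
second_poll = {"count": 27790210922, "time_in_millis": 195188999,
               "current": 0, "failed": 31033}

print(stats_delta(first_poll, second_poll))
```

Every consumer of the stats has to keep the earlier poll around and handle restarts itself, which is the overhead this request is trying to avoid.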

If we could have one separate (or customizable) additional counter associated with each existing one, which can either be deleted and recreated or simply reset to zero, with an associated age/timestamp recording when the last reset was triggered, this would give us a "trip" meter alongside the "odometer" for particular counters.

This might be useful for:

  • pipeline metrics, especially failure rates
  • circuit breaker trip metrics
  • threadpool metrics

For example, ingest pipeline stats currently look like:

"some_processor" : {
    "type" : "some_type",
    "stats" : {
        "count" : 27790210822,
        "time" : "2.2d",
        "time_in_millis" : 195188912,
        "current" : 0,
        "failed" : 31031
    }
}

With just one additional counter (or any number of custom cumulative counters) nested under the parent counter, they could look like:

"some_processor" : {
    "type" : "some_type",
    "stats" : {
        "count" : 27790210822,
        "time" : "2.2d",
        "time_in_millis" : 195188912,
        "current" : 0,
        "failed" : 31031,
        "trip_stats" : {
            "count" : 2779021,
            "time" : "3h",
            "time_in_millis" : 10800000,
            "last_reset" : "2022-10-21T01:05:47.361Z",
            "failed" : 27
        }
    }
}
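
A minimal sketch of how such a paired counter might behave, assuming every event updates both the cumulative values and the trip values, and only the trip values are ever reset. The `TripCounter` class and its method names are hypothetical and not taken from the Elasticsearch code base; the human-readable `time` field and `current` are left out for brevity.

```python
from datetime import datetime, timezone


class TripCounter:
    """Hypothetical counter pairing an 'odometer' with a resettable 'trip' meter."""

    FIELDS = ("count", "time_in_millis", "failed")

    def __init__(self, now=None):
        self.total = dict.fromkeys(self.FIELDS, 0)  # odometer: never reset
        self.trip = dict.fromkeys(self.FIELDS, 0)   # trip meter: reset on demand
        self.last_reset = now or datetime.now(timezone.utc)

    def record(self, millis, failed=False):
        """Count one event in both the odometer and the trip meter."""
        for stats in (self.total, self.trip):
            stats["count"] += 1
            stats["time_in_millis"] += millis
            stats["failed"] += int(failed)

    def reset_trip(self, now=None):
        """Zero the trip meter and remember when that happened."""
        self.trip = dict.fromkeys(self.FIELDS, 0)
        self.last_reset = now or datetime.now(timezone.utc)

    def to_stats(self):
        """Render in a shape similar to the proposed trip_stats output."""
        stats = dict(self.total)
        stats["trip_stats"] = dict(self.trip, last_reset=self.last_reset.isoformat())
        return stats


counter = TripCounter()
counter.record(5)
counter.record(7, failed=True)
counter.reset_trip()           # the odometer keeps its values, the trip meter restarts
counter.record(3)
print(counter.to_stats())
```

After the reset, the cumulative values still cover all three events while `trip_stats` covers only the one recorded since `last_reset`.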

Or, for example, with named custom counters that use a meta field to record whatever additional detail is wanted:

"some_processor" : {
    "type" : "some_type",
    "stats" : {
        "count" : 27790210822,
        "time" : "2.2d",
        "time_in_millis" : 195188912,
        "current" : 0,
        "failed" : 31031,
        "custom_counter_2022-10-21T01:05:47.361Z": {
            "count": 51234,
            "failed": 5
        },
        "custom_counter_test2": {
            "count": 234,
            "failed": 1,
            "meta": {
                "last_reset": "2022-10-21T01:10:51.361Z",
                "added by": "pete"
            }
        }
    }
}

There would then need to be some way to reset the counters, or to overwrite them in order to reset them.
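
One possible shape for that, sketched under the assumption that creating a counter under an existing name simply overwrites it, so "put" doubles as "reset". The `CustomCounters` class and its `put`/`delete`/`record` methods are hypothetical names for illustration, not an existing or proposed Elasticsearch API.

```python
from datetime import datetime, timezone


class CustomCounters:
    """Hypothetical registry of named sub-counters under one parent counter."""

    def __init__(self):
        self.parent = {"count": 0, "failed": 0}
        self.custom = {}  # name -> {"count", "failed", "meta"}

    def put(self, name, added_by=None):
        """Create a named counter, or overwrite an existing one to reset it."""
        meta = {"last_reset": datetime.now(timezone.utc).isoformat()}
        if added_by is not None:
            meta["added by"] = added_by
        self.custom[name] = {"count": 0, "failed": 0, "meta": meta}

    def delete(self, name):
        """Remove a named counter; the parent counter is unaffected."""
        self.custom.pop(name, None)

    def record(self, failed=False):
        """Every event increments the parent and all custom counters alike."""
        for stats in (self.parent, *self.custom.values()):
            stats["count"] += 1
            stats["failed"] += int(failed)

    def to_stats(self):
        """Parent fields first, then one nested object per custom counter."""
        return {**self.parent, **self.custom}


counters = CustomCounters()
counters.record()
counters.put("custom_counter_test2", added_by="pete")
counters.record(failed=True)
counters.put("custom_counter_test2")  # overwriting resets the counter
counters.record()
print(counters.to_stats())
```

The custom counters only ever see the same increments as the parent, so nothing is derived or aggregated; a reset just changes the point from which they start counting.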

This means the new metrics still do no aggregations or combinations, so they are not calculated metrics, which we generally want to avoid at the foundational metrics level. They remain plain counters, counting the same things as the parent counter they are created under.

@geekpete geekpete added >enhancement needs:triage Requires assignment of a team area label labels Oct 21, 2022
@tylerperk tylerperk added the :Data Management/Stats Statistics tracking and retrieval APIs label Oct 24, 2022
@elasticsearchmachine elasticsearchmachine added Team:Data Management Meta label for data/management team and removed needs:triage Requires assignment of a team area label labels Oct 24, 2022
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@geekpete
Member Author

Possibly related: #108987
