refactor: Use polling model for workflow phase metric #4557
Signed-off-by: Simon Behar <[email protected]>
const enoughTimeForInformerSync = 1 * time.Second

const semaphoreConfigIndexName = "bySemaphoreConfigMap"
Some pork-barrel style changes in this PR
workflow/controller/controller.go
Outdated
go wfc.metrics.RunServer(ctx)
go wait.Until(wfc.syncWorkflowPhaseMetrics, 5*time.Second, ctx.Done())
I think 5 seconds is a good balance here
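For readers unfamiliar with the pattern in the diff, here is a minimal sketch of a polling loop built only on the Go standard library. It is an illustration, not the controller's code: `pollUntil` and `demoCalls` are hypothetical names, a `time.Ticker` stands in for `wait.Until`, and the exact timing behavior of `wait.Until` may differ from this ticker-based version.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// pollUntil runs f once right away and then once per period until ctx is
// cancelled. Hypothetical stdlib stand-in for the wait.Until call in the diff.
func pollUntil(ctx context.Context, period time.Duration, f func()) {
	ticker := time.NewTicker(period)
	defer ticker.Stop()
	for {
		// Do not run f at all if the context is already cancelled.
		select {
		case <-ctx.Done():
			return
		default:
		}
		f()
		// Wait for the next tick or for cancellation, whichever comes first.
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
	}
}

// demoCalls polls every 10ms for roughly 100ms and reports how many times
// the callback ran. The intervals are shortened for demonstration only.
func demoCalls() int {
	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancel()
	calls := 0
	pollUntil(ctx, 10*time.Millisecond, func() { calls++ })
	return calls
}

func main() {
	fmt.Println("callback invocations:", demoCalls())
}
```

In the controller the callback would be the metrics sync function and the period would be the interval under discussion in this thread.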
So real-time for Prometheus means up to 15s old. Plus whatever delay the app has.
Every 15s would mean Prometheus would be up to 30s out of date. @jessesuen I'd like to do as little polling as possible.
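The arithmetic behind the comment above can be sketched as follows. This assumes a 15s Prometheus scrape interval (taken from the comment, not from any configuration shown here) and treats worst-case staleness as simply the polling interval plus the scrape interval; `worstCaseStaleness` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// worstCaseStaleness returns an upper bound on the age of a scraped value:
// the value may have been computed almost one full poll interval before the
// scrape, and the scrape itself may be almost one scrape interval old.
func worstCaseStaleness(pollInterval, scrapeInterval time.Duration) time.Duration {
	return pollInterval + scrapeInterval
}

// stalenessSeconds is a convenience wrapper that works in whole seconds.
func stalenessSeconds(pollSeconds, scrapeSeconds int) float64 {
	poll := time.Duration(pollSeconds) * time.Second
	scrape := time.Duration(scrapeSeconds) * time.Second
	return worstCaseStaleness(poll, scrape).Seconds()
}

func main() {
	const scrapeSeconds = 15 // assumption: scrape interval quoted in the thread
	for _, pollSeconds := range []int{5, 15, 60} {
		fmt.Printf("poll=%ds worst-case staleness=%.0fs\n",
			pollSeconds, stalenessSeconds(pollSeconds, scrapeSeconds))
	}
}
```

Under these assumptions the three intervals discussed in this thread (5s, 15s, 1m) give bounds of 20s, 30s, and 75s respectively.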
Signed-off-by: Simon Behar <[email protected]>
With @jessesuen unavailable to discuss, can we set the polling interval to 1m? I'd rather have lower system load than up-to-date values. I will approve that.
I'm okay waiting for @jessesuen to discuss. In my opinion 1m is too high, especially since the load is minimal because we read from the informer.
Good point. This is a blocking issue for v2.12, so I suggest we set it to 15s as a compromise and then discuss with @jesse next week.
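The "load is minimal" argument rests on each poll being an in-memory pass over already cached objects rather than an API call. A rough sketch of such a pass is below; the `workflow` type, the `countByPhase` function, and the phase names are hypothetical stand-ins, and a plain slice replaces the informer's cache.

```go
package main

import "fmt"

// workflow is a hypothetical, trimmed-down stand-in for a cached Workflow.
type workflow struct {
	Name  string
	Phase string
}

// countByPhase tallies workflows per phase from an in-memory snapshot.
// Every known phase is seeded with zero so that a phase with no workflows
// reports 0 instead of keeping a value from an earlier poll.
func countByPhase(cached []workflow, phases []string) map[string]int {
	counts := make(map[string]int, len(phases))
	for _, p := range phases {
		counts[p] = 0
	}
	for _, wf := range cached {
		counts[wf.Phase]++
	}
	return counts
}

func main() {
	// Assumption: this slice plays the role of the informer's local store.
	cached := []workflow{
		{Name: "a", Phase: "Running"},
		{Name: "b", Phase: "Running"},
		{Name: "c", Phase: "Succeeded"},
	}
	phases := []string{"Running", "Succeeded", "Failed"}
	for phase, n := range countByPhase(cached, phases) {
		fmt.Printf("%s=%d\n", phase, n)
	}
}
```

The cost of one pass grows with the number of cached workflows, so the polling interval mainly trades CPU time in the controller against metric freshness.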
Signed-off-by: Simon Behar <[email protected]>
Signed-off-by: [email protected] <[email protected]>
feat(ui): Add Template/Cron workflow filter to workflow page. Closes argoproj#4532 (argoproj#4543)
Signed-off-by: Tianchu Zhao <[email protected]>
feat(executor): Auto create s3 bucket if not present.
Signed-off-by: Alex Capras <[email protected]>
Apply codegen
Signed-off-by: Alex Capras <[email protected]>
Add argo-e2e label to test wf
Signed-off-by: Alex Capras <[email protected]>
chore: Updated stress test YAML (argoproj#4569)
Signed-off-by: Alex Collins <[email protected]>
docs: Updated kubectl apply command in manifests README (argoproj#4577)
Signed-off-by: Stefan Gloutnikov <[email protected]>
feat(controller): Make MAX_OPERATION_TIME configurable. Close argoproj#4239 (argoproj#4562)
Signed-off-by: Alex Collins <[email protected]>
docs: Fix a typo in example (argoproj#4590)
Signed-off-by: Takayoshi Nishida <[email protected]>
feat(controller): Retry transient offload errors. Resolves argoproj#4464 (argoproj#4482)
Signed-off-by: Alex Collins <[email protected]>
fix(server): use the correct name when downloading artifacts (argoproj#4579)
Signed-off-by: Daniel Herman <[email protected]>
fix(server): serve artifacts directly from disk to support large artifacts (argoproj#4589)
Signed-off-by: Daniel Herman <[email protected]>
fix(executor): Handle sidecar killing in a process-namespace-shared pod (argoproj#4575)
Signed-off-by: Daisuke Taniwaki <[email protected]>
docs: Add JSON schema for IDE validation (argoproj#4581)
Signed-off-by: Paul Brabban <[email protected]>
refactor: Use polling model for workflow phase metric (argoproj#4557)
Signed-off-by: Simon Behar <[email protected]>
Addressing reviewers comments
Signed-off-by: Alex Capras <[email protected]>
Addressing reviewers comments
docs: Minor typo fix (argoproj#4610)
Signed-off-by: Paavo Pokkinen <[email protected]>
fix(controller): Prevent tasks with names starting with digit to use either 'depends' or 'dependencies' (argoproj#4598)
Signed-off-by: terrytangyuan <[email protected]>
fix(docs): Bring minio chart instructions up to date (argoproj#4586)
Signed-off-by: Ranga Krishnan <[email protected]>
fix(executor): Fixed waitMainContainerStart returning prematurely. Closes argoproj#4599 (argoproj#4601)
Signed-off-by: fsiegmund <[email protected]>
feat(controller): Enhanced artifact repository ref. See argoproj#3184 (argoproj#4458)
Signed-off-by: Alex Collins <[email protected]>
fix: Null check pagination variable (argoproj#4617)
Signed-off-by: Simon Behar <[email protected]>
fix: Perform fields filtering server side (argoproj#4595)
Signed-off-by: Simon Behar <[email protected]>
fix(server): Correct webhook event payload marshalling. Fixes argoproj#4572 (argoproj#4594)
Signed-off-by: Alex Collins <[email protected]>
feat(ui): Add columns--narrower-height to AttributeRow (argoproj#4371)
fix: Fix TestCleanFieldsExclude (argoproj#4625)
Signed-off-by: Simon Behar <[email protected]>
fix(argo-server): fix global variable validation error with reversed dag.tasks (argoproj#4369)
Signed-off-by: chenyu.zheng <[email protected]>
fix: derive jsonschema and fix up issues, validate examples dir… (argoproj#4611)
Signed-off-by: Paul Brabban <[email protected]>
fix(ui): Reference secrets in EnvVars. Fixes argoproj#3973 (argoproj#4419)
Signed-off-by: Alejandro Tejera <[email protected]>
fix(ui): Fix Snyk issues (argoproj#4631)
Signed-off-by: Alex Collins <[email protected]>
feat(executor): More informative log when executors do not support output param from base image layer (argoproj#4620)
Signed-off-by: terrytangyuan <[email protected]>
Codegen patch. Signed off by [email protected]
Codegen patch. Signed off by [email protected]
Delete test.patch
Signed-off-by: Simon Behar <[email protected]>
Fixes: #4551
Signed-off-by: Simon Behar [email protected]