What happened:
Requests that are timed out by the service profile timeout parameter are not retried.
What you expected to happen:
I would like timeouts governed by the service profile to be treated as failures and to be retried.
How to reproduce it (as minimally and precisely as possible):
Create a simple Linkerd service profile and set the timeout on any route to a small value, say 500ms. Now call this endpoint from a meshed pod. Whenever the call takes more than 500ms to complete, the pod sees a 504 Gateway Timeout. No retries happen.
Anything else we need to know?:
Imagine you have a service endpoint /database/products. When you do GET /database/products you receive a list of products from a database. Sometimes this call takes 50ms to complete, sometimes it takes 3 seconds. I would like to be able to tell Linkerd: if this request takes longer than X, cancel it AND retry it (according to the retry budget). This would be analogous to the case where the endpoint returns a 500, in which case Linkerd (if configured) retries according to the budget.
Why is this important? Say I configured the route above to time out after 500ms, and I have a meshed pod which does GET /database/products. Say the first request takes more than 500ms, so it is canceled. If Linkerd then retried it, and the retry took, let's say, 100ms, the outcome would be: the first request timed out after 500ms, the second completed in 100ms, and from the calling pod's point of view the request took ~600ms to complete. If Linkerd didn't interfere at all, it would take 3 seconds to complete. The ~600ms outcome is what should happen.
As it stands, the fact that Linkerd cancels the request after X and just returns a 504 Gateway Timeout is pretty much useless.
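
To make the arithmetic in the example above concrete, here is a small sketch of the latency the calling pod would observe. It is only a model of the behavior I am asking for, not Linkerd's implementation: the function name `caller_view` and the `max_retries` parameter are hypothetical, and a plain retry count stands in for the real retry budget.

```python
def caller_view(attempt_latencies_ms, timeout_ms, max_retries):
    """Return (total_latency_ms, status) as seen by the calling pod.

    attempt_latencies_ms: how long each successive attempt would take upstream.
    timeout_ms: per-attempt timeout (the route timeout in the service profile).
    max_retries: hypothetical stand-in for the retry budget (0 = no retries).
    """
    total_ms = 0
    for attempt, latency_ms in enumerate(attempt_latencies_ms):
        if attempt > max_retries:
            break  # out of retries, give up
        if latency_ms <= timeout_ms:
            # This attempt finished in time: the caller gets the response.
            return total_ms + latency_ms, 200
        # This attempt was canceled at the timeout; the caller waited that long.
        total_ms += timeout_ms
    return total_ms, 504


# Behavior today: the timeout fires and nothing is retried.
print(caller_view([3000, 100], timeout_ms=500, max_retries=0))

# Requested behavior: the timed-out attempt is retried once.
print(caller_view([3000, 100], timeout_ms=500, max_retries=1))
```

With one retry allowed, the caller waits 500ms for the canceled attempt plus 100ms for the retry, instead of either a 504 after 500ms or a full 3 seconds with no timeout at all.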
Environment:
linkerd version: 2.6
Platform, version, and config files (Kubernetes, DC/OS, etc): Kubernetes 1.14
Cloud provider or hardware configuration: Docker Desktop with Kubernetes
Issue Type:
Bug report
Feature request