-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Updated the TTLcontroller to fix issue related to TTL default settings. Fixes #1923 #2044
Conversation
I will fix the broken tests. |
Codecov Report
@@ Coverage Diff @@
## master #2044 +/- ##
=======================================
Coverage 13.13% 13.13%
=======================================
Files 72 72
Lines 25084 25084
=======================================
Hits 3294 3294
Misses 21375 21375
Partials 415 415 Continue to review full report at Codecov.
|
@alexec thanks for the help! |
@alexec what is codecov? Do you have any feedback on the way it is implemented? Is there something that you think could be improved, change or improved? |
codeCov indicates how much unit test coverage your code has. This is only applicable to change which can have unit tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm approving this, but I'd like to hold it to v2.5 so we won't merge just yet.
|
||
func TestTTLDefault(t *testing.T) { | ||
controller := newTTLControllerDefaultTTL() | ||
var ten int32 = 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should have a discussion on what the best way to provide for custom values are. Many users have requested this feature for different specs, so this pattern of adding a field on the config map could end up not scaling well. It would be easy to see a different default*
field for every aspect of the workflow spec, and in my opinion that would create clutter.
Perhaps we should add override fields under a struct called workflowOverrides
or the like?
In practice this would look something like:
...
workflowOverrides:
ttlStrategy: ...
...
|
I think it sounds good to do a solution that is applicable for a wider scope than the TTLcontroller :) . I don't think it would be good to overwrite the workflow with the default values based upon when argo finds a new resource since it will then not be possible to backtrack what the original input workflow was then. Or do you mean to keep both the original input workflow and an updated copy with the values populated/overwritten based upon the defaults?
Cool, I would be happy to join if possible :). I would also be happy to contribute and work on the wider implementation of default values. |
I can start to look at a solution for handling default values in general for argo workflows, should I create a new issue for that @alexec where we could have the discussion about it? |
How about adding a field in https://github.com/argoproj/argo/blob/3293c83f6170ad4dc022067bb37f12d07d2834c1/workflow/config/config.go#L11 of type https://github.com/argoproj/argo/blob/3293c83f6170ad4dc022067bb37f12d07d2834c1/pkg/apis/workflow/v1alpha1/workflow_types.go#L140 ? Then instead of adding a bunch of default fields in Also, as I mentioned here and here, compared with the method of adding default logic in separate places, the code will be cleaner to handle all these default logics in a function called something like |
@NikeNano @zhujl1991 I will bring this issue and all your suggestions up at an internal meeting soon to find the right direction for this |
This sounds good!
I also see scenarios where user specified values should be over written by defaults. As a admin I for example don't want to have workflows living for ever, but set a maximum time to live and if users have specs exceed this cap it to the maximum. |
Would be happy to work on this together @zhujl1991 if you are interested :) |
Any updates on this @simster7? |
@jessesuen @sarabala1979 @simster7 @dtaniwaki every time I see something like this I think we should have a config item that is a workflow spec and we should use that as the base and mail-merge each template with it when they are created. That would allow cluster wide or namespace wide features to be added to every workflow, such as exit handlers, TTL, PDBs, etc etc - thoughts? Alex |
Sure. I think what's at issue here is how this should be implemented. Are we thinking of some sort of JSON or strategic patch logic? Or simply add fields present in the config map to workflows when they are created, and override the config items with workflow items if they are present? |
The solution I'm proposing would fix this issue, but also a class of issue where you want to have defaults in your workflow. Instead of having to code each one, we could simply have a " blueprint workflow spec" (yes - I know this is a bad name for it) that all created workflow are based on. Example: apiVersion: v1
kind: ConfigMap
metadata:
name: workflow-controller-configmap
data:
workflowBlueprint: |
spec:
ttlStrategy:
secondsAfterSuccess: 30 Submitted workflow: metadata:
generateName: hello-world-
spec:
entrypoint: whalesay
templates:
- name: whalesay
container:
image: docker/whalesay:latest argo submit hello-world.yaml Created workflow: metadata:
generateName: hello-world-
spec:
ttlStrategy:
secondsAfterSuccess: 30
entrypoint: whalesay
templates:
- name: whalesay
container:
image: docker/whalesay:latest |
Sounds good! argo submit hello-world.yaml But you still mean that the handling of the default should be in the controller and not the CLI right? How would senarios with contradicting values be handled? Or Scenario where the uses suggest metadata:
generateName: hello-world-
spec:
ttlStrategy:
secondsAfterSuccess: 30
entrypoint: whalesay
templates:
- name: whalesay
container:
image: docker/whalesay:latest and and the "Blueprint" is : apiVersion: v1
kind: ConfigMap
metadata:
name: workflow-controller-configmap
data:
workflowBlueprint: |
spec:
ttlStrategy:
secondsAfterSuccess: 25 Which should be selected? |
I think workflow must take precedence over the blueprint. I.e. the blue print provides defaults. |
I think it would be good to be able to override. We use kubeflow and as a admin it would be great to be able to set default that over rides the user for example to limit the TTL time for workflows to avoid having to clean up. |
@alexec If I understand you correctly, your proposal is the same as the approach I proposed here, right? This feature is hi-pri for us. If everyone can agree on this approach, I can implement it. |
Yes. That is what I think. I'd like to get feedback from @jessesuen @sarabala1979 @simster7 @dtaniwaki, E.g. how could misuse bite people in the bum? |
That would be my ideal solution as well. Feel free to start work on this @zhujl1991 @NikeNano |
@simster7 I will start to work on this and then update this/or make a new PR related to setting max min ttl times. |
With this PR I intend to fix PR #1923
After having a discussion with @simster7 (who also pointed out that further internal discussions are needed) I have updated the TTLcontroller to also handle default settings which can be supplied to the controller config map.
Please see the discussion here for background.
This is an initial approach, I think that it might be suitable to merge or combine:
https://github.com/argoproj/argo/blob/faa9dbb59753a068c64a1aa5923e3e359c0866d8/workflow/ttlcontroller/ttlcontroller.go#L200
and
https://github.com/argoproj/argo/blob/faa9dbb59753a068c64a1aa5923e3e359c0866d8/workflow/ttlcontroller/ttlcontroller.go#L218
Since it is a lot of duplicated code. Especially with the updates in this PR.
Last but not least I love feedback and @simster7 pointer out that a completely different solutions might be the best, however I leave this here as a start to the discussion.
Checklist:
"fix(controller): Updates such and such. Fixes #1234"
.Fixes #1923