Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate Kubelet RunOnce mode #125649

Closed

Conversation

HirazawaUi
Copy link
Contributor

What type of PR is this?

/kind feature

What this PR does / why we need it:

Deprecate kubelet support for RunOnce mode.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Added a alpha feature, behind the `LegacyNodeRunOnceMode` feature gate.
The feature gate is enabled by default. When the feature gate is disabled, we do not allow users to run in RunOnce mode.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/4580-deprecate-kubelet-runonce/README.md

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 23, 2024
@HirazawaUi
Copy link
Contributor Author

/hold
Until kubernetes/enhancements#4736 merged.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 23, 2024
@HirazawaUi
Copy link
Contributor Author

/retest

2 similar comments
@HirazawaUi
Copy link
Contributor Author

/retest

@HirazawaUi
Copy link
Contributor Author

/retest

if !utilfeature.DefaultFeatureGate.Enabled(features.LegacyNodeRunOnceMode) {
panic("Cannot run runonce mode when LegacyNodeRunOnceMode feature gate is disabled")
}
klog.InfoS("WARNING: RunOnce mode has been deprecated, and will be removed in a future release")
Copy link
Contributor

@toVersus toVersus Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the KEP-4580, why don't you log this as a warning level message?

Suggested change
klog.InfoS("WARNING: RunOnce mode has been deprecated, and will be removed in a future release")
klog.Warning("WARNING: RunOnce mode has been deprecated, and will be removed in a future release")

I was just wondering, if kubelet's log level is set to error or higher, the deprecation warning is missed. Should we consider always writing deprecation message without relying on klog? Never mind, since klog only has the -v option, it will probably always be logged even with -v=0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the KEP-4580, why don't you log this as a warning level message?

This is what I originally wanted to do, but lint doesn't like the warning function, so I chose to use InfoS instead.

If the PR is approved, I will modify the KEP

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could just make it an error.

Suggested change
klog.InfoS("WARNING: RunOnce mode has been deprecated, and will be removed in a future release")
klog.ErrorS("RunOnce mode is deprecated, and will be removed in a future release")

@HirazawaUi
Copy link
Contributor Author

/retest

@a-mccarthy
Copy link

Hi @HirazawaUi, I'm Abbie the comms lead for 1.31. Can you share the status of this PR? We are close to publishing the mid-cycle blog and this deprecation is listed. I'd like to make sure we should still include this in the blog.

Also, can you make sure this PR is linked in kubernetes/enhancements#4580? thanks!

cc: @neoaggelos

@HirazawaUi
Copy link
Contributor Author

@a-mccarthy This PR is waiting for review. I will remind the reviewer of sig-node to review it later in slack.

@HirazawaUi
Copy link
Contributor Author

Also, can you make sure this PR is linked in kubernetes/enhancements#4580? thanks!

added.

@HirazawaUi
Copy link
Contributor Author

/cc @tallclair
May you take a look at it?

if !utilfeature.DefaultFeatureGate.Enabled(features.LegacyNodeRunOnceMode) {
panic("Cannot run runonce mode when LegacyNodeRunOnceMode feature gate is disabled")
}
klog.InfoS("WARNING: RunOnce mode has been deprecated, and will be removed in a future release")
Copy link
Contributor

@mrunalp mrunalp Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thockin I see this akin to #126084. So, we should drop the language around removal, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HirazawaUi Also, we should adjust the language to encourage migrating to alternatives covered in the KEP for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we should adjust the language to encourage migrating to alternatives covered in the KEP for this.

Added. Do you think we should add the URL for podman kube? (I believe it's not necessary because the URL for podman kube might change).

@bart0sh
Copy link
Contributor

bart0sh commented Jul 20, 2024

/triage accepted
/priority important-soon

@k8s-ci-robot k8s-ci-robot added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Jul 20, 2024
@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jul 20, 2024
@HirazawaUi HirazawaUi force-pushed the deprecate-kubelet-runonce branch 2 times, most recently from 317d21c to 7411183 Compare July 22, 2024 14:10
@HirazawaUi
Copy link
Contributor Author

/test pull-kubernetes-unit

@thockin
Copy link
Member

thockin commented Jul 22, 2024

Unlike the gitRepo volume case, this is not a part of the API, so it counts as "admin-facing" (as per https://kubernetes.io/docs/reference/using-api/deprecation-policy). Thus it is possible to deprecate, but the KEP does not discuss the actual timeline. I'd appreciate clarity on that (see https://kubernetes.io/docs/reference/using-api/deprecation-policy/#deprecating-a-feature-or-behavior in particular)

@@ -1260,6 +1260,10 @@ func RunKubelet(ctx context.Context, kubeServer *options.KubeletServer, kubeDeps

// process pods and exit.
if runOnce {
if !utilfeature.DefaultFeatureGate.Enabled(features.LegacyNodeRunOnceMode) {
panic("Cannot run runonce mode when LegacyNodeRunOnceMode feature gate is disabled")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

panic will dump a backtrace - is that really what we need? It will appear to users as a bug or crash.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not uncommon for panic to occur in kubelet, a similar use cases like:

if utilfeature.DefaultFeatureGate.Enabled(features.InPlacePodVerticalScaling) {
stateImpl, err := state.NewStateCheckpoint(m.stateFileDirectory, podStatusManagerStateFile)
if err != nil {
// This is a crictical, non-recoverable failure.
klog.ErrorS(err, "Could not initialize pod allocation checkpoint manager, please drain node and remove policy state file")
panic(err)
}
m.state = stateImpl
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I leave this for.kubele reviewers, but IMO panic is not a very obvious failure mechanism

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I agree that panic is a bit much. Can you exist with an error and log an error instead? Personally I think panics should be reserved for coding errors, whereas this is a configuration error.

@@ -1066,6 +1073,8 @@ var defaultKubernetesFeatureGates = map[featuregate.Feature]featuregate.FeatureS

LegacyServiceAccountTokenCleanUp: {Default: true, PreRelease: featuregate.GA, LockToDefault: true}, // GA in 1.30; remove in 1.32

LegacyNodeRunOnceMode: {Default: true, PreRelease: featuregate.Alpha},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/feature-gates.md#deprecation

It describes how to deprecate with gates (though in this case you want to start with it enabled, to satisfy deprecation policy)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave a TODO here to change default to false in or after 1.34

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/feature-gates.md#deprecation

It describes how to deprecate with gates (though in this case you want to start with it enabled, to satisfy deprecation policy)

Thanks for the reminder.

I would leave a TODO here to change default to false in or after 1.34

Since the KEP describes that it will be set to false by default in v1.32, I have added comment according to the KEP.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The deprecation policy is a little vague about this. It says "Deprecated behaviors must function for no less than 1 year after their announced deprecation." Do we think that "function" means "by default" or "with effort" ?

My read of it was "by default"

Copy link
Contributor Author

@HirazawaUi HirazawaUi Jul 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All right, I have updated the comment.

cmd/kubelet/app/server.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 23, 2024
@k8s-ci-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if OK with node folks

/approve
/hold

Clear hold when fully reviewed

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: HirazawaUi, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 23, 2024
@haircommander
Copy link
Contributor

can you update the merge conflict too please @HirazawaUi ?

@HirazawaUi
Copy link
Contributor Author

/close
ref: kubernetes/enhancements#4736 (comment)

@k8s-ci-robot
Copy link
Contributor

@HirazawaUi: Closed this PR.

In response to this:

/close
ref: kubernetes/enhancements#4736 (comment)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/feature Categorizes issue or PR as related to a new feature. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Development

Successfully merging this pull request may close these issues.

9 participants