Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add rollout API spec #897

Merged
merged 4 commits into from
Sep 13, 2024
Merged

Conversation

ryanzhang-oss
Copy link
Contributor

@ryanzhang-oss ryanzhang-oss commented Aug 5, 2024

Description of your changes

An overall idea of rollout API

Fixes #

I have:

  • Run make reviewable to ensure this PR is ready for review.

How has this code been tested

Special notes for your reviewer

Copy link
Contributor

@michaelawyu michaelawyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments, PTAL 🙏

apis/placement/v1alpha1/stagerollout_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagerollout_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagerollout_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagerollout_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagerollout_types.go Outdated Show resolved Hide resolved
apis/placement/v1beta1/clusterresourceplacement_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagerollout_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagerollout_types.go Outdated Show resolved Hide resolved
// StageRolloutConditionType identifies a specific condition of the StageRollout.
type StageRolloutConditionType string

const (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Ryan! Just for clarification, from the comments it seems that StageRolloutWaiting is just a special case of StageRollingOut being true?

apis/placement/v1alpha1/stagerollout_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagerollout_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagerollout_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagerollout_types.go Outdated Show resolved Hide resolved
@ryanzhang-oss ryanzhang-oss marked this pull request as ready for review August 14, 2024 18:09
@ryanzhang-oss ryanzhang-oss changed the title feat: rollout API feat: add rollout API spec Aug 15, 2024
// the common post rollout tasks. This setting will override the common post rollout tasks if there are
// duplicate type of tasks with different settings.
// +optional
PostRolloutTasks PostRolloutTask `json:"postRolloutTasks,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we will run the PostRolloutTasks first then the common PostRolloutTasks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed the common one for now to simplify

}

// StageRolloutConfig describes a single rollout stage group configuration.
type StageRolloutConfig struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do we rollout the selected clusters within a stage?

sequentially or concurrently?

apis/placement/v1alpha1/stagerollout_types.go Outdated Show resolved Hide resolved
Comment on lines 83 to 93
type PostRolloutTask struct {
// The name of the RolloutApprovalRequest we need to create to get the approval for the next stage.
// An RolloutApprovalRequest object will be created if this is not empty.
// +optional
ApprovalRequestName string `json:"approvalTask,omitempty"`

// The time to wait after all the clusters in the current stage complete the update
// before we move to the next stage.
// +optional
WaitTime *metav1.Duration `json:"waitTime,omitempty"`
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
type PostRolloutTask struct {
// The name of the RolloutApprovalRequest we need to create to get the approval for the next stage.
// An RolloutApprovalRequest object will be created if this is not empty.
// +optional
ApprovalRequestName string `json:"approvalTask,omitempty"`
// The time to wait after all the clusters in the current stage complete the update
// before we move to the next stage.
// +optional
WaitTime *metav1.Duration `json:"waitTime,omitempty"`
}
type PostRolloutTask struct {
type TaskType // approval, wait
// The time to wait after all the clusters in the current stage complete the update
// before we move to the next stage.
// +optional
WaitTime *metav1.Duration `json:"waitTime,omitempty"`
}

PostRolloutTasks []PostRolloutTask

Then user can define their own sequence. Otherwise, it's hard to tell which one runs first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My current idea is that they run at the same time. For example, we don't wait before or after the approval, the wait time is the minimum time needed between stages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I thought about the type pattern. The problem with that is each type requires a distinct set of parameters which pretty much makes the taskType field redundant.

// There can only be one active rollout for each ClusterResourcePlacement,
// we will fail the rollout if there is already an active rollout.
// +required
ParentCRPName string `json:"parentCRPName"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm thinking of the other way. CRP refers to the StageRollout name.
StageRollout will be a common rollout configurations for multiple CRP.

All the rollout status will be part of the CRP status, like today.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if that would work as different CRPs can pick different clusters which means the stages are different between them.

I am actually thinking the other way around, one CRP can have multiple rollouts just like our AKS rollout. We have daily releases to canary only and one or two weekly releases going on simultaneously.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this field? This is auto created for a CRP. Its owner reference should contain the CRP. Just like pod. Do we need a deployment name field in a pod?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not auto created for a CRP. This CR is created by user to trigger an update run.

apis/placement/v1alpha1/stagerollout_types.go Outdated Show resolved Hide resolved
apis/placement/v1beta1/clusterresourceplacement_types.go Outdated Show resolved Hide resolved
apis/placement/v1beta1/clusterresourceplacement_types.go Outdated Show resolved Hide resolved

// StagedRolloutRun defines a stage by stage rollout run that is applied to
// the clusters that the corresponding ClusterResourcePlacement selects.
type StagedRolloutRun struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RollingUpdate -> RollingUpdateConfig in the current API.

Use StagedUpdate -> StagedUpdateRun, StagedUpdateStrategy in the new API.

)

// +genclient
// +genclient:namespaced
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the resource namespaced if it is used by CRP which is not namespaced? How to decide which namespace a run should be in?

// +kubebuilder:resource:scope="Namespaced",categories={fleet,fleet-placement}
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object

// StagedRolloutRun defines a stage by stage rollout run that is applied to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who creates this resource?

I assume it is auto created whennever we need to roll out a change in CRP. If that's the case, CRP must contain StagedUpdateConfig (similar with RollingUpdateConfig), which includes at least one field, a reference to StagedUpdateStrategy (as strategy must be provide by customers, not auto generated by us).

// There can only be one active rollout for each ClusterResourcePlacement,
// we will fail the rollout if there is already an active rollout.
// +required
ParentCRPName string `json:"parentCRPName"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this field? This is auto created for a CRP. Its owner reference should contain the CRP. Just like pod. Do we need a deployment name field in a pod?

// The name of the rollout strategy that specifies the stages and the sequence in which
// the application will be updated on the member clusters.
// +required
StagedRolloutStrategy string `json:"stagedRolloutStrategy"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be the first field?

// - `ParentStageRollout` which points to its owner stage rollout.
// - `TargetStage` which is the name of the stage that this approval request is for.
// - `IsLatestRolloutApproval` which indicates whether this approval request is the latest one related to this rollout
type RolloutApprovalRequest struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which namespace should the rollout approve request be in? Will it be in the same one as rollout update run? How do we manage access control of approve requests vs runs vs strategies?

apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Show resolved Hide resolved
// The clusters in each stage are ordered following its order to be rolled out.
// The updateRun will stop if the cluster stages change during the update.
// +required
ComputedStages [][]string `json:"currentStages"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use []StageStatus which contains each cluster's status. See this one: https://learn.microsoft.com/en-us/rest/api/fleet/update-runs/get?view=rest-fleet-2023-10-15&tabs=HTTP

// Therefore, we need to keep track of the clusters that we have applied the resources to but no long is selected.
// We remove the application from these clusters after all the stages are completed and remove them from this list.
// +required
ToBeRevertedClusters []string `json:"toBeRevertedClusters"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DeletionStageStatus StageStatus
contains a list of clusters whose work will be deleted ordered by the given sort label key.

// The clusters in each stage are ordered following its order to be rolled out.
// The updateRun will stop if the cluster stages change during the update.
// +required
ComputedStages [][]string `json:"currentStages"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apis/placement/v1alpha1/stagedupdate_types.go Show resolved Hide resolved
// There can only be one active rollout for each ClusterResourcePlacement,
// we will fail the rollout if there is already an active rollout.
// +required
ParentCRPName string `json:"parentCRPName"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not auto created for a CRP. This CR is created by user to trigger an update run.

// The clusters in each stage are ordered following its order to be rolled out.
// The updateRun will stop if the cluster stages change during the update.
// +required
ComputedStages [][]string `json:"currentStages"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add cluster level detail

apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Show resolved Hide resolved
apis/placement/v1beta1/clusterresourceplacement_types.go Outdated Show resolved Hide resolved

// The override snapshot index that this update run is based on.
// +required
OverrideSnapShotIndex string `json:"overrideSnapShotIndex"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the override, we don't support snapshot index because there will be always one snapshot per cro or ro. Here user needs to configure a list of ROs snapshot or CROs snapshot.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do users find out which ROs and CROs apply to this given placement? Listing them all seems a very cumbersome task.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to go to the status field instead.

apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Show resolved Hide resolved

// The override snapshot index that this update run is based on.
// +required
OverrideSnapShotIndex string `json:"overrideSnapShotIndex"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do users find out which ROs and CROs apply to this given placement? Listing them all seems a very cumbersome task.

apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
// Conditions is an array of current observed conditions for the specific type of post update task.
// Known conditions are "ApprovalRequestCreated", "WaitTimeElapsed", and "ApprovalRequestApproved".
// +kubebuilder:validation:Optional
Conditions []metav1.Condition `json:"conditions"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you reuse UpdateStatus instead of Conditions? Maybe rename to StepStatus? Everything: a stage, a cluster, a after stage task is a step in the staged update run.

You also need to record start time and complete time of the step. This is especially useful for TimedWait so the user knows when the wait started and when it ended.

Copy link
Contributor Author

@ryanzhang-oss ryanzhang-oss Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The convention in k8s is to use conditions to record information for each state (running, wait, completed). I think it's better to at least keep the overall state in condition instead of a single status. We can convert all the state back into conditions too but it's little bit too much.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this? We use conditions at the updaterun level, but use updatestatus for stages and below.

Better to have a start time and end time for these AfterStageTasks too. So it is better to use UpdateStatus instead conditions for AfterStageTasks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition states (ApprovalRequestCreated, ApprovalRequestApproved, WaitTimeElapsed) in the afterStageTasks are different from the update status so it's hard to reuse it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NotStarted -> (created approval request) -> waiting -> (approve request) -> completed
NotStarted -> (start wait time) -> waiting -> (wait time has passed) -> completed

UpdateRun is a tree of steps: a step can be a run, a stage, a cluster, a after-stage task.

It is better to have the same state model for each step so customers don't need to learn new state/condition model for each step.

That's why we should have the same state for the run and each after-stage task as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fleet RP also have the wait time step and use the same state data model.

apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Show resolved Hide resolved
// for this stage. There cannot be overlapping clusters between stages when the stagedUpdateRun is created.
// If the label selector is absent, all the selected clusters are selected.
// +kubebuilder:validation:Optional
LabelSelector *metav1.LabelSelector `json:"labelSelector"`
Copy link
Contributor

@jwtty jwtty Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are going to do an intersection between the clusters selected by the scheduler and the clusters selected here right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
// - `ParentStagedUpdateRun` which points to its owner stage update.
// - `TargetStage` which is the name of the stage that this approval request is for.
// - `IsLatestUpdateRunApproval` which indicates whether this approval request is the latest one related to this update run.
type StageApprovalRequest struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little bit confused: how will this request get approved?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

user needs to patch the "Approved" condition of the request as true

@@ -346,6 +368,12 @@ const (
// - "False" means the staged update run is not waiting between stages.
// - "Unknown" means it is unknown.
StagedUpdateRunConditionWaitingBetweenStage StagedUpdateRunConditionType = "WaitingBetweenStage"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need the "Running" and "WaitingBetweenStage" conditions? they could be "reason" when the completed is "false".

"Running", "WaitingBetweenStage" and "Completed" all these conditions are delivering some duplicate information.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With "running" and "waittingBetweenStage" conditions, we will get the information like last time the stage has completed. If we collapse them into one condition then we lost a bit of the history.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need both conditions and state? State serves the same purpose as conditions. State contains starttime and endtime.

RunStatus {
runState State
stages []StageStatus
deletionStage StageStatus
}

StageStatus {
stageName string
stageState State
clusters []ClusterStatus
afterStageTasks []TaskStatus
}

ClusterStatus {
clusterName string
clusterState State
}

TastkStatus {
taskType
taskState State
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's k8s convention in the status to use condition to show the state transition which is pretty much equivalent to the runState. I think using state is more efficient and acceptable as an internal field but we better to keep condition as the first layer fields status. If we really want to unify them then they all should be conditions like in CPR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree with Liqian about the detailed stage information. They are already part of the StageStatus.

For the conditions in the "StagedUpdateRunStatus", can we treat it as overall condition/status? so that user can query this condition to know about whether the updateRun has completed or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the conditions in the "StagedUpdateRunStatus" are the overall condition/status of the updateRun.

apis/placement/v1alpha1/stagedupdate_types.go Show resolved Hide resolved
@@ -346,6 +368,12 @@ const (
// - "False" means the staged update run is not waiting between stages.
// - "Unknown" means it is unknown.
StagedUpdateRunConditionWaitingBetweenStage StagedUpdateRunConditionType = "WaitingBetweenStage"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need both conditions and state? State serves the same purpose as conditions. State contains starttime and endtime.

RunStatus {
runState State
stages []StageStatus
deletionStage StageStatus
}

StageStatus {
stageName string
stageState State
clusters []ClusterStatus
afterStageTasks []TaskStatus
}

ClusterStatus {
clusterName string
clusterState State
}

TastkStatus {
taskType
taskState State
}

type StagedUpdateRunSpec struct {
// A reference to the placement that this update run is applied to.
// There can be multiple active update runs for each placement but
// it's up to the devOps to make sure they don't conflict with each other.
// +kubebuilder:validation:Required
PlacementRef PlacementReference `json:"placementRef"`

// The resource snapshot index that this update run is based on.
// The snapshot index of the selected resources to be updated across clusters.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you give a concrete example on what you mean by the snaptshot index?

Something like:

e.g., for resource snapshots "xxx-123-1" and "xxx-123-2", "123" is the index.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that you used policysnaptshot name instead of index below. Is that because there are N resource snapshot for a single version of selected resources?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, for each resource snapshot index, there are N resource snapshots. The index is "123" in the example you gave. There is only 1 policy snapshot object for a snapshot of the CRP placement policy.

apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
// Conditions is an array of current observed conditions for the specific type of post update task.
// Known conditions are "ApprovalRequestCreated", "WaitTimeElapsed", and "ApprovalRequestApproved".
// +kubebuilder:validation:Optional
Conditions []metav1.Condition `json:"conditions"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this? We use conditions at the updaterun level, but use updatestatus for stages and below.

Better to have a start time and end time for these AfterStageTasks too. So it is better to use UpdateStatus instead conditions for AfterStageTasks.

apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
@@ -346,6 +368,12 @@ const (
// - "False" means the staged update run is not waiting between stages.
// - "Unknown" means it is unknown.
StagedUpdateRunConditionWaitingBetweenStage StagedUpdateRunConditionType = "WaitingBetweenStage"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree with Liqian about the detailed stage information. They are already part of the StageStatus.

For the conditions in the "StagedUpdateRunStatus", can we treat it as overall condition/status? so that user can query this condition to know about whether the updateRun has completed or not.

apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
// Conditions is an array of current observed conditions for the specific type of post update task.
// Known conditions are "ApprovalRequestCreated", "WaitTimeElapsed", and "ApprovalRequestApproved".
// +kubebuilder:validation:Optional
Conditions []metav1.Condition `json:"conditions"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NotStarted -> (created approval request) -> waiting -> (approve request) -> completed
NotStarted -> (start wait time) -> waiting -> (wait time has passed) -> completed

UpdateRun is a tree of steps: a step can be a run, a stage, a cluster, a after-stage task.

It is better to have the same state model for each step so customers don't need to learn new state/condition model for each step.

That's why we should have the same state for the run and each after-stage task as well.

// Conditions is an array of current observed conditions for the specific type of post update task.
// Known conditions are "ApprovalRequestCreated", "WaitTimeElapsed", and "ApprovalRequestApproved".
// +kubebuilder:validation:Optional
Conditions []metav1.Condition `json:"conditions"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fleet RP also have the wait time step and use the same state data model.

// - `ParentStagedUpdateRun` which points to its owner stage update.
// - `TargetStage` which is the name of the stage that this approval request is for.
// - `IsLatestUpdateRunApproval` which indicates whether this approval request is the latest one related to this update run.
type StageApprovalRequest struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend to use ApprovalRequest instead of StageApprovalRequest.

Add TargetStep in the spec. A step can be a stage, a group or a cluster.

StepReference {
  RunName
  StageName
  (future) GroupName
  (future) ClusterName
}

circy9
circy9 previously approved these changes Sep 13, 2024
apis/placement/v1alpha1/stagedupdate_types.go Outdated Show resolved Hide resolved
@ryanzhang-oss ryanzhang-oss merged commit 19ed98c into Azure:main Sep 13, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants