
Fix operator upgrades #587

Merged: 4 commits, Jul 18, 2019
Conversation

alenkacz
Contributor

What type of PR is this?
/kind bug

What this PR does / why we need it:
The controller was adding version labels with every patch of an object. This is fine for all objects except StatefulSets, where the label also ends up inside the volumeClaimTemplates:

"volumeClaimTemplates": [
      {
        "metadata": {
          ...
          "labels": {
           ....
            "version": "0.1.0"
          }

That was causing the same error Helm ran into in helm/charts#7803.
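
For context: a StatefulSet's spec is largely immutable, so re-applying a rendered StatefulSet whose claim-template labels changed is rejected by the API server with an error along these lines (object name hypothetical; wording varies slightly by Kubernetes version):

    The StatefulSet "kafka" is invalid: spec: Forbidden: updates to
    statefulset spec for fields other than 'replicas', 'template', and
    'updateStrategy' are forbidden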

Which issue(s) this PR fixes:

Fixes #557

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Fixes upgrades for operators using StatefulSets; previously it was not possible to upgrade those to another version of the template.

@zmalik
Member

zmalik commented Jul 17, 2019

I have tested Kafka with the steps defined in #557 and was able to successfully upgrade from 2.2.1 to 2.3.0, that is, operatorVersion 0.1.1 to the next one.

@kensipe
Member

kensipe left a comment

I have questions... I get that this works, but isn't version info necessary? What happens when there are multiple versions in the cluster? Wouldn't we want the version on all things (except StatefulSets)? Can we use annotations for the version?

@kensipe
Member

kensipe commented Jul 17, 2019

Thinking more about this... I think we need a different solution. The version label seems meaningful. There are objects which cannot be updated... those objects [StatefulSets, Jobs] should never be updated: a Job was run for a specific version, and a StatefulSet will be good only for one version. If an upgrade happens, a backup is invoked, the StatefulSet is destroyed and replaced (with a new version), and a restore happens.

@alenkacz
Contributor Author

alenkacz commented Jul 17, 2019

@kensipe why would you reimplement rolling updates on StatefulSets if you can reuse what is inside Kubernetes? K8s allows that, and as long as you don't touch persistent volumes, you can update anything you want... Sure, if you need to change something around persistence, you need to do what you're proposing.

If you touch a PV, Kubernetes won't allow the update... so it's "safe". Yeah, we should probably think about how to document/ease situations for operator developers when they need to touch PVs and create that extra upgrade plan, but I think that is a different issue than this one.
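
A rough sketch of that boundary (names and values hypothetical, not from this PR): the API server lets you change a StatefulSet's replicas, updateStrategy, and pod template in place, while the selector and anything under volumeClaimTemplates are frozen after creation:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: kafka
    spec:
      serviceName: kafka-hs
      replicas: 3                  # mutable in place
      updateStrategy:
        type: RollingUpdate        # mutable in place
      selector:
        matchLabels:
          app: kafka               # immutable after creation
      template:                    # mutable: changes here roll the pods
        metadata:
          labels:
            app: kafka
        spec:
          containers:
            - name: kafka
              image: kafka:2.3.0   # e.g. bumping the image is allowed
      volumeClaimTemplates:        # immutable after creation, labels included
        - metadata:
            name: data
            labels:
              version: "0.1.0"     # the injected label that broke upgrades
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 10Gi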

@kensipe
Member

kensipe commented Jul 18, 2019

@alenkacz I'm not sure where you are coming from... I never suggested not using k8s functionality... of course we should. I am suggesting that there are protected k8s objects which KUDO should be aware of and treat appropriately. I don't think I like the model of "attempt it, and if it works, good; if it fails, it must not be allowed, so we are good." I definitely don't like the model of not supporting multiple versions of an operator and would like to have solutions around that in the long term.

@kensipe
Member

kensipe commented Jul 18, 2019

Put another way... I would prefer NOT to remove the version from labels, and for most objects this makes sense. For objects where this isn't allowed, we should have other mechanisms for management. I'm not convinced this PR makes sense.

@kensipe
Member

kensipe commented Jul 18, 2019

To be clear... I am FOR not having a version label on volumeClaims... but this code change isn't just for volumeClaims; it applies to "CommonLabels", and I am NOT FOR removing version labels on the other k8s objects.

@alenkacz
Contributor Author

alenkacz commented Jul 18, 2019

@kensipe I would like us to switch the way we reason here: instead of telling me what you want to change and how, I need to see what you want to achieve with it, because that is missing from the conversation. Tell me where you need the version inside labels and why. Where would not having that label actually cause a real problem, e.g. a selector picking up the wrong resources?

I definitely don't like the model of not supporting multiple versions of an operator and would like to have solutions around that in the long term.

This is nothing new, and nothing that would change with this PR. You CANNOT have multiple instances with the same name inside the same namespace. That is just how k8s works. You cannot have multiple apps of the same name inside Marathon either. And when you use a different name to install a different version of an operator, you don't need the version label, because filtering on name is enough. This PR does not affect our support for multiple versions in ANY WAY.

There are objects which cannot be updated... those objects [StatefulSets, Jobs] should never be updated...

I don't agree with that. There are things you can safely update, and when that's the case we should do it the normal k8s way, without the overhead of a backup and restore.


I am going to spend some time today going through the k8s docs to confirm how selectors work and where we can and cannot omit the labels, to gather the facts for what I proposed in the first paragraph.
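
One reference point on selectors (a hypothetical sketch, not from this thread): a selector only matches the labels it explicitly names, so a Service keyed on the instance name keeps routing correctly whether or not a version label is present on the pods:

    apiVersion: v1
    kind: Service
    metadata:
      name: kafka
    spec:
      selector:
        app: kafka        # selects on the name label; no version key needed
      ports:
        - port: 9092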

@alenkacz
Contributor Author

alenkacz commented Jul 18, 2019

BTW, there is no easy way with kustomize to skip adding labels to some parts of the YAML, although it's a much-requested feature :( kubernetes-sigs/kustomize#817

We could write a plugin but that seems like a pretty big effort...
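
For reference, this is the mechanism in question: kustomize's commonLabels transformer stamps the given labels onto every object it renders, including selectors and (for StatefulSets) the volumeClaimTemplates metadata, with no built-in way to exclude individual fields. A minimal sketch (values hypothetical):

    # kustomization.yaml
    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    commonLabels:
      version: "0.1.0"   # applied to metadata, selectors, pod templates,
                         # and StatefulSet volumeClaimTemplates alike
    resources:
      - statefulset.yaml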

@alenkacz
Contributor Author

After doing a lot of reading, my summary is:

  • I think we don't need that label for KUDO to work properly.
  • Having that label as a selector MIGHT be useful for certain scenarios. BTW, Helm does not add that label automatically either; they just recommend you do it in the template. Operator developers in KUDO can do the same and add it manually to the template in the meantime if absolutely necessary.
  • If we absolutely want to keep the feature of an automatically applied version label, we should probably open an issue and discuss the best way to do that. Using kustomize for it seems not very straightforward, and I actually don't like the way we use kustomize right now at all; it adds a lot of complexity to the code for little value. Or we could keep kustomize and, right after using it, manually remove the label from the PersistentVolumeClaim (see the sketch below). That last option actually seems the most feasible to me; if others approve I can start working on it in the next PR, but I think for now we're fine without that label.
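
A hedged sketch of that last option, expressed as a JSON Patch (RFC 6902) applied to the rendered StatefulSet before it is sent to the API server (index and path hypothetical):

    [
      { "op": "remove",
        "path": "/spec/volumeClaimTemplates/0/metadata/labels/version" }
    ]

Note that a 6902 "remove" fails if the path is absent, so in practice the code would guard for the label first (or use a strategic-merge-style delete by setting the label to null).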

@alenkacz
Contributor Author

I'll be working on #601 in the meantime
