Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unnecessary owner references #1358

Open
lentzi90 opened this issue Dec 12, 2023 · 13 comments
Open

Unnecessary owner references #1358

lentzi90 opened this issue Dec 12, 2023 · 13 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. triage/accepted Indicates an issue is ready to be actively worked on.

Comments

@lentzi90
Copy link
Member

lentzi90 commented Dec 12, 2023

What steps did you take and what happened:

  1. Create a Cluster using Metal3DataTemplate and IPPool
  2. Check the owner references on the IPAddress, IPClaim, Metal3Data, Metal3DataClaim and BareMetalHost
  3. The Metal3Data owns the IPAddress and IPClaim (and the IPClaim owns the IPAddress). The Metal3Machine owns the BareMetalHost, Metal3DataClaim and Metal3Data (and the Metal3DataClaim owns the Metal3Data).

This owner structure seems quite strange and unnecessary. Why should the Metal3Machine own the Metal3Data for example when it already owns the Metal3DataClaim? And why should it own the BareMetalHost that is most likely created by the user?

These relations makes it harder to reason about behaviors and understanding the structure.
The owner references are usually used to delete owned objects when the owner is deleted. Owning both the parent and child, like is the case with the IPClaim and IPAddress, is not necessary and can lead to surprising behaviors.

What did you expect to happen:

The owner chain should not have situations where the grand parent owns both parent and child.
We should not put owner references on objects lightly. In general they belong on objects created from or managed by the owner, that should be deleted when the owner is deleted.

Anything else you would like to add:

There may be reasons for the structure that we have, but I have not been able to find them.
The main goal with this issue is to think critically about the structure we have, and think about if/how we can improve it.

Here is a (WIP) graph of the relations:

---
title: Cluster API and Metal3 objects
---
erDiagram
    Cluster ||--|| KubeadmControlPlane : references
    Cluster ||--|| Metal3Cluster : references
    Cluster ||--o{ MachineDeployment : owns
    MachineDeployment ||--|| KubeadmConfigTemplate : references

    KubeadmControlPlane ||--|| Metal3MachineTemplate : references
    Metal3MachineTemplate ||--o| Metal3DataTemplate : references
    Metal3MachineTemplate ||--o{ Metal3Machine : spawns
    Metal3DataTemplate }o--o{ IPPool : references
    IPPool }o--|| Cluster : references
    KubeadmControlPlane ||--|{ Machine : spawns
    MachineDeployment ||--|{ Machine : spawns
    KubeadmConfigTemplate ||--o{ KubeadmConfig : spawns

    Machine ||--|| Metal3Machine : owns
    Machine ||--|| KubeadmConfig : owns
    Metal3Machine |o--o| Metal3DataClaim : owns
    Metal3Machine |o--o| Metal3Data : owns
    Metal3Machine ||--|| BareMetalHost : consumes
    Metal3DataTemplate ||--o{ Metal3Data : spawns
    Metal3DataClaim ||--|| Metal3Data : consumes
    Metal3Data |o--o{ IPClaim : owns
    Metal3Data ||--o{ IPAddress : owns

    IPPool ||--o{ IPAddress : spawns
    IPClaim ||--|| IPAddress : consumes

    %% Secrets
    Metal3Data ||--|| Secret_metadata : produces
    Metal3Data ||--|| Secret_networkdata : produces
    KubeadmConfig ||--|| Secret_userdata : produces
    Secret_userdata ||--|| BareMetalHost : configures
    Secret_metadata ||--|| BareMetalHost : configures
    Secret_networkdata ||--|| BareMetalHost : configures
Loading

Environment:

  • Cluster-api version: v1.5.3
  • Cluster-api-provider-metal3 version: v1.5.2
  • Environment (metal3-dev-env or other): other
  • Kubernetes version: (use kubectl version): v1.28.3

/kind bug

@metal3-io-bot metal3-io-bot added kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue lacks a `triage/foo` label and requires one. labels Dec 12, 2023
@lentzi90
Copy link
Member Author

/cc @Rozzii

@Rozzii
Copy link
Member

Rozzii commented Dec 17, 2023

I agree with you @lentzi90 as we have discussed this off GitHub, I will bring this up on triaging session on Wednesday, I would like a wide discussion over this involving as many of the contributors as possible.

@Rozzii
Copy link
Member

Rozzii commented Dec 17, 2023

/cc @zaneb @hroyrh @hardys @kashifest @furkatgofurov7 @dtantsur not sure who else would be interested in this.

@zaneb
Copy link
Member

zaneb commented Dec 18, 2023

Not an expert on CAPM3, but in general there's 2 motivations:

  1. Ensure stuff gets deleted when the thing owning is deleted
  2. Trigger reconciliation of the owner object when the owned object is modified

(Ideally these line up, but they don't always.)

The Metal3Data owns the IPAddress and IPClaim (and the IPClaim owns the IPAddress).

Depending on what the IPAddress represents, this sounds potentially suspect.

And why should [the Metal3Machine] own the BareMetalHost that is most likely created by the user?

Agree that it should not, and this is why we have the "Consumer" field in the BMH. (The Metal3Machine "owns" the deployment on the BMH, but not the BMH itself. Really that should be a separate CRD, but today it is not.)

Although the graph already shows the relationship as "consumes" not "owns"?

The owner chain should not have situations where the grand parent owns both parent and child.

If the grandparent is the one that needs to be reconciled when the child changes then this is not necessarily unreasonable. But it does leave the grandparent with some cleanup to do if e.g. the parent gets deleted and does not have a finalizer that also removes the child. A way to resolve this would be to roll up into the parent's status any status information from the child that the grandparent needs to watch.

@lentzi90
Copy link
Member Author

Although the graph already shows the relationship as "consumes" not "owns"?

I should have mentioned... don't trust this graph too much. There is a lot missing. Many objects have references to the cluster

@dtantsur
Copy link
Member

And why should it own the BareMetalHost that is most likely created by the user?

This one definitely feels wrong to me.

@Rozzii
Copy link
Member

Rozzii commented Dec 20, 2023

/triage accepted

@metal3-io-bot metal3-io-bot added triage/accepted Indicates an issue is ready to be actively worked on. and removed needs-triage Indicates an issue lacks a `triage/foo` label and requires one. labels Dec 20, 2023
@Rozzii
Copy link
Member

Rozzii commented Dec 20, 2023

THis could be a good topic for the 2024 MTM

@metal3-io-bot
Copy link
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues will close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@metal3-io-bot metal3-io-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 19, 2024
@tuminoid
Copy link
Member

/remove-lifecycle stale

1 similar comment
@tuminoid
Copy link
Member

/remove-lifecycle stale

@metal3-io-bot metal3-io-bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 19, 2024
@Rozzii
Copy link
Member

Rozzii commented May 6, 2024

This is going back and forth between stale and active and there is not active work on it at the moment, I am putting the frozen label on this issue, feel free to continue the work whenever you would like to.
/lifecycle frozen

@metal3-io-bot metal3-io-bot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label May 6, 2024
@Sunnatillo
Copy link
Member

/assign

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. triage/accepted Indicates an issue is ready to be actively worked on.
Projects
Status: CAPM3 on hold / blocked
Development

No branches or pull requests

7 participants