fix: adds affinity and other scheduling to the operator #29977
Conversation
Unreported flaky test detected, please review

If the flaky tests below are affected by the changes, please review and update the changes accordingly. Otherwise, a maintainer should report the flaky tests prior to merging the PR.

org.keycloak.testsuite.broker.KcOidcBrokerTest#testPostBrokerLoginFlowWithOTP_bruteForceEnabled
Keycloak CI - Java Distribution IT (windows-latest - temurin - 19)
Thank you for this PR. What I'm reading here is that you're about to expose the Kubernetes constructs as-is. In doing this, we'll offer a non-opinionated way to schedule Pods, which is IMHO OK, as we don't seem to have an opinionated and agreed-upon solution. Eventually, more coarse-grained options might emerge as we learn how this is used in different environments. I'd say even then, exposing this as a supported feature makes sense, as we won't be able to foresee every possible scenario. Even if the deployment uses an external Infinispan, there are still the embedded Infinispans for now, and therefore I'd advise the Operator not to take any caching configuration into account - at least for now. @pruivo is working on a PR, planned for KC26, which would allow running Keycloak without its embedded distributed caches, but this is still some time in the future. So for now, I'd say

cc: @ryanemerson
@ahus1 I'm fine with removing the defaults around spread constraints and moving that to a topic for the scaling guide.
No, please add a default best-effort spread. I'm just against making the default settings dependent on the cache configuration.
The default spread is just an affinity for the same zone. The PR has been updated with that; it should just need some docs. @vmuzikar WDYT?
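For context, a best-effort "prefer the same zone, spread across nodes" default along these lines would typically render into the server Pod spec roughly as follows. This is a sketch, not the Operator's exact output; the weights and the `app.kubernetes.io/component: server` label selector are illustrative:

```yaml
# Hypothetical rendering of the default placement rules:
# - soft pod affinity for the same zone (avoid stretch clusters)
# - soft pod anti-affinity for the same node (improve availability)
affinity:
  podAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 10
      podAffinityTerm:
        topologyKey: topology.kubernetes.io/zone
        labelSelector:
          matchLabels:
            app.kubernetes.io/component: server
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 50
      podAffinityTerm:
        topologyKey: kubernetes.io/hostname
        labelSelector:
          matchLabels:
            app.kubernetes.io/component: server
```

Because both rules are `preferred` rather than `required`, the scheduler can still place Pods when the preference cannot be satisfied.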
Unreported flaky test detected, please review

If the flaky tests below are affected by the changes, please review and update the changes accordingly. Otherwise, a maintainer should report the flaky tests prior to merging the PR.

org.keycloak.testsuite.oauth.TokenIntrospectionTest#testUnsupportedToken
@shawkins Thank you for the changes!
...tor/src/main/java/org/keycloak/operator/controllers/KeycloakDeploymentDependentResource.java (review threads resolved)
@vmuzikar added the docs, should be fully ready for review now.
@shawkins Great work! I've just put a little brain dump below. There's also a merge conflict to resolve.
nit: I also like it when we have test coverage for the CR serialization, as those tests run in a few ms and bring additional coverage - but in this case the PodTemplate test should be sufficient.
@@ -204,6 +204,25 @@ It is achieved by providing certain JVM options.

For more details, see <@links.server id="containers" />.

=== Scheduling

You may control several aspects of the server Pod scheduling via the Keycloak CR. The scheduling stanza exposes the standard Kubernetes affinity, tolerations, topology spread constraints, and the priority class name to fine-tune the scheduling and placement of your server Pods. By default, if you do not specify a custom affinity, your Pods will have an affinity for the same zone (to prevent stretch clusters) and an anti-affinity for the same node (to improve availability).
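To make the stanza described above concrete, a Keycloak CR using these options might look like the following. The nested field shapes follow the standard Kubernetes types; the exact `spec.scheduling` path and all values here are a sketch based on this discussion, not authoritative API documentation:

```yaml
# Illustrative Keycloak CR; the scheduling stanza shape is assumed
# from this PR's description, and all values are examples only.
apiVersion: k8s.keycloak.org/v2alpha1
kind: Keycloak
metadata:
  name: example-kc
spec:
  instances: 3
  scheduling:
    priorityClassName: keycloak-high-priority   # hypothetical PriorityClass
    affinity:
      podAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 10
          podAffinityTerm:
            topologyKey: topology.kubernetes.io/zone
            labelSelector:
              matchLabels:
                app.kubernetes.io/component: server
    tolerations:
    - key: dedicated          # hypothetical taint key
      operator: Equal
      value: keycloak
      effect: NoSchedule
    topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: ScheduleAnyway
      labelSelector:
        matchLabels:
          app.kubernetes.io/component: server
```

Note that specifying any custom affinity would replace the default same-zone/anti-node behavior described above.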
Wouldn't it be good for users to also briefly explain why we want to prevent stretched clusters? Since we support the affinity configuration, we should mention the drawbacks of spanning different AZs with the embedded Infinispan, correct? That will avoid some future confusion when users apply different affinity settings.
@shawkins WDYT?
Added a little more on this. I'd defer to @ryanemerson if we want a more detailed explanation.
I think what we have for now is fine. I think we need a dedicated guide explaining different architectures, stretch vs non-stretch etc, as it's a complex topic.
@ryanemerson @ahus1 @vmuzikar in the docs should we call out specifically recommending an affinity for the same zone as the database?
The good thing about pod affinity is that it is now (1) supported by the Operator and no longer part of the unsupported pod template, and (2) we have a reasonable default. Deploying Keycloak in the same region as the database seems like a good idea, though I haven't tested how much it actually impacts Keycloak. And if the database is running outside of Kubernetes, you would have to hard-code the zone in the affinity rules, rather than using "try to run all pods in the same zone, but I don't care which". So I'd say it is probably good enough for now. Once we work with this a bit more, I could see us providing some predefined scheduling templates to hide the complexity and to promote well-tested and well-understood scheduling methods. That would then be a follow-up PR.
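To illustrate the "hard-code the zone" case mentioned above (a database outside the cluster in a known zone), a user-supplied affinity could pin the server Pods via node affinity. This is a sketch; the zone name is purely hypothetical:

```yaml
# Hypothetical user-supplied affinity pinning Pods to the zone
# where an external database lives; zone value is illustrative.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - us-east-1a   # example zone, replace with the database's zone
```

A `required` rule like this trades availability for locality: if the zone has no schedulable capacity, the Pods stay pending.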
+1 to Alexander's comment. I think that more advanced/opinionated setting examples should exist as part of the Keycloak HA guide or similar, where we explain in detail the trade-offs of different architectures and how they apply to specific use-cases. |
@shawkins Just one last minor thing about the example formatting, otherwise LGTM
Should be good now.
Added one minor comment. LGTM otherwise.
closes: keycloak#29258

Signed-off-by: Steve Hawkins <[email protected]>
@shawkins LGTM, thank you.
@mabartos I believe your comments have been resolved. I'm overriding your review request so we can merge. Once you're back, let me know if you have any further concerns; we can address them as a follow-up.
closes: #29258
Draft of a scheduling spec with the options that are most likely to be requested - affinity, topologyspread, priorityclass, and tolerations.
The handling is similar to other constructs - if the user has populated the field in the unsupported spec, that takes precedence.
A new label, app.kubernetes.io/component, is added to distinguish the server pods.
The default node anti-affinity is added if no other affinity has been specified. This mechanism may not have great usability. It could be better to have a withDefaults field on the scheduling spec.
Similarly, I'm proposing to go a step further and handle the default affinity or spread depending upon whether an external cache is in use. Issues with doing this:
@ahus1 @vmuzikar @abelmatos WDYT?