Cluster health check will fail in most cases because the socket connection to the target cluster is not stable #2310
Labels
kind/bug
Categorizes issue or PR as related to a bug.
Comments
pikehuang added commits to pikehuang/tke that referenced this issue (Oct 24–25, 2023):
- bug address: tkestack#2310 simple description: cluster health check will failed in most case bacause the socket connection to the target cluster is not stable tkestack#2310
- …check go get rid of socket connection failure tkestack#2310 (several iterations)
- in health check to go get rid of socket connection failure tkestack#2310

leoryu pushed a commit that referenced this issue (Oct 25, 2023):
- in health check to go get rid of socket connection failure #2310
What happened:
The platform controller calls checkHealth to update each cluster's status. In our production environment we have 100 clusters in total, yet 70 of them are reported with a Failed status, even though those clusters run well when we SSH in and check them directly. Details are shown below:
What you expected to happen:
We expect the reported cluster status to match the cluster's real status, which is Running for most of its lifetime.
How to reproduce it (as minimally and precisely as possible):
Put the cluster under heavy network pressure, or move the cluster from a cloud environment to an IDC environment.
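One way to emulate the "heavy network pressure" reproduction step on a Linux host is traffic shaping with tc/netem. The interface name `eth0` and the delay/loss figures below are assumptions for illustration; adjust them for your setup (root privileges required).

```shell
# Inject 200ms latency with 30% packet loss on the interface facing the
# target cluster (replace eth0 with your actual interface).
tc qdisc add dev eth0 root netem delay 200ms loss 30%

# Confirm the rule is active, then watch the cluster status flip to Failed.
tc qdisc show dev eth0

# Remove the emulation when finished.
tc qdisc del dev eth0 root
```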
Anything else we need to know?:
The health check result is incorrect in most cases; with a monitoring system in place the issue is easy to spot.
Environment:
kubectl version: any