
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?" #9

Closed
FraGThieF opened this issue Mar 31, 2022 · 21 comments
Labels: missing information (Further information is requested)

Comments

@FraGThieF

Hello!
First of all thank you for this guide!
I have learned so many new things, but now I am stuck and do not know how to proceed.

I am trying to get it up and running on Raspberry Pis: 3 masters and 4 workers.

Error message after launching the playbook:

fatal: [k3s-m3]: FAILED! => {"attempts": 20, "changed": false, "cmd": ["k3s", "kubectl", "get", "nodes", "-l", "node-role.kubernetes.io/master=true", "-o=jsonpath={.items[*].metadata.name}"], "delta": "0:00:01.384371", "end": "2022-03-31 22:13:11.580652", "msg": "non-zero return code", "rc": 1, "start": "2022-03-31 22:13:10.196281", "stderr": "The connection to the server localhost:8080 was refused - did you specify the right host or port?", "stderr_lines": ["The connection to the server localhost:8080 was refused - did you specify the right host or port?"], "stdout": "", "stdout_lines": []}

fatal: [k3s-m2]: FAILED! => {"attempts": 20, "changed": false, "cmd": ["k3s", "kubectl", "get", "nodes", "-l", "node-role.kubernetes.io/master=true", "-o=jsonpath={.items[*].metadata.name}"], "delta": "0:00:01.599374", "end": "2022-03-31 22:13:21.948957", "msg": "non-zero return code", "rc": 1, "start": "2022-03-31 22:13:20.349583", "stderr": "The connection to the server localhost:8080 was refused - did you specify the right host or port?", "stderr_lines": ["The connection to the server localhost:8080 was refused - did you specify the right host or port?"], "stdout": "", "stdout_lines": []}

fatal: [k3s-m1]: FAILED! => {"attempts": 20, "changed": false, "cmd": ["k3s", "kubectl", "get", "nodes", "-l", "node-role.kubernetes.io/master=true", "-o=jsonpath={.items[*].metadata.name}"], "delta": "0:00:01.694997", "end": "2022-03-31 22:14:06.756075", "msg": "non-zero return code", "rc": 1, "start": "2022-03-31 22:14:05.061078", "stderr": "The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?", "stderr_lines": ["The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?"], "stdout": "", "stdout_lines": []}

What can I do? Where do I have to search? What could the error be?

Thanks for any help that you can give me.

@timothystewart6
Contributor

Did you get the latest version? Also, are you sure you are using the right interface name in your variables? Can you show them here?
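
A quick way to double-check the interface name on each node (assuming iproute2 is installed, as it is by default on Raspberry Pi OS and Ubuntu) is:

ip -br addr show

and then make sure flannel_iface matches whatever name shows up there (eth0, ens18, etc.).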

@FraGThieF
Author

Thanks for your tutorials and help.

Yes, I got the latest.

The interface is eth0.


k3s_version: v1.23.4+k3s1
ansible_user: FraG
#ansible_pass: *******
systemd_dir: /etc/systemd/system

# interface which will be used for flannel
flannel_iface: "eth0"

# apiserver_endpoint is virtual ip-address which will be configured on each master
apiserver_endpoint: "192.168.0.190"

# k3s_token is required masters can talk together securely
# this token should be alpha numeric only
k3s_token: "++++++++++"

extra_server_args: "--no-deploy servicelb --no-deploy traefik --write-kubeconfig-mode 644 default-not-ready-toleration-seconds=30 --kube-apiserver-arg default-unreachable-toleration-seconds=30 --kube-controller-arg node-monitor-period=20s --kube-controller-arg node-monitor-grace-period=20s --kubelet-arg node-status-update-frequency=5s"
extra_agent_args: "--kubelet-arg node-status-update-frequency=5s"

# change these to your liking, the only required one is--no-deploy servicelb
#extra_server_args: "--no-deploy servicelb --no-deploy traefik"
#--write-kubeconfig-mode 644"
#extra_agent_args: ""
#"--kubelet-arg node-status-update-frequency=5s"

# image tag for kube-vip
kube_vip_tag_version: "v0.4.2"

# image tag for metal lb
metal_lb_speaker_tag_version: "v0.12.1"
metal_lb_controller_tag_version: "v0.12.1"

# metallb ip range for load balancer
metal_lb_ip_range: "192.168.0.180-192.168.0.189"

@timothystewart6
Contributor

I don't see anything odd. I would try removing all server args except the required ones, resetting, and trying it again.
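
For reference, a minimal config looks something like this (the only strictly required arg is --no-deploy servicelb):

extra_server_args: "--no-deploy servicelb --no-deploy traefik --write-kubeconfig-mode 644"
extra_agent_args: ""

Reset the nodes, apply that, and re-run the playbook.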

@timothystewart6 timothystewart6 added the "missing information" label Apr 2, 2022
@mrimp

mrimp commented Apr 3, 2022

Expand your hard disks... on all nodes.

You should probably make a note of this, Tim. You do say it in the video!
Thanks for your work!
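
For anyone unsure how to check: plain coreutils on each node will do, e.g.

df -h /

and note that the kubelet starts reporting disk pressure well before the filesystem is completely full.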

@yankeeinlondon

I too ran into this problem. I have double-checked that the hosts.ini file does indeed match up with the IP addresses that are running. I am running 2 physical servers (both x86) with 3 control-plane nodes and 2 workers.

[screenshots]

@yankeeinlondon

Before doing this I actually switched my Ubuntu template to the one from the prior video that Tim did and made sure that both the username/password and the SSH keys are consistent across all VMs.

@svartis
Contributor

svartis commented Apr 3, 2022

I am encountering the same issue as well when I try to run this on Raspberry Pis.
With Raspberry Pi OS Lite 64-bit I can't get it to work at all.
With Ubuntu I can get 1 master up and running with nodes, but when I try with 2 or 3 masters I get stuck on the same issue.

[screenshot]

@timothystewart6
Contributor

timothystewart6 commented Apr 3, 2022

Do all machines have the same time zones and the same SSH keys, and are they able to communicate with each other? Are you using passwordless sudo? If not, you might have to pass in additional flags like --user <user> --ask-become-pass. I would also remove any additional args from your nodes; these can be problematic on slow machines.
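
For example, something along these lines (assuming the usual site.yml playbook and a hosts.ini under inventory/my-cluster/; adjust the paths to your setup):

ansible-playbook site.yml -i inventory/my-cluster/hosts.ini --user <your-ssh-user> --ask-become-pass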

@wlwwt

wlwwt commented Apr 3, 2022

I am encountering the same issue on a setup provisioned with Vagrant. Here is the stack trace in verbose mode:

FAILED - RETRYING: [172.20.20.11]: Verify that all nodes actually joined (check k3s-init.service if this fails) (20 retries left).Result was: { "attempts": 1, "changed": false, "cmd": [ "k3s", "kubectl", "get", "nodes", "-l", "node-role.kubernetes.io/master=true", "-o=jsonpath={.items[*].metadata.name}" ], "delta": "0:00:02.058195", "end": "2022-04-03 19:31:08.887394", "invocation": { "module_args": { "_raw_params": "k3s kubectl get nodes -l \"node-role.kubernetes.io/master=true\" -o=jsonpath=\"{.items[*].metadata.name}\"", "_uses_shell": false, "argv": null, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "stdin_add_newline": true, "strip_empty_ends": true, "warn": false } }, "msg": "non-zero return code", "rc": 1, "retries": 21, "start": "2022-04-03 19:31:06.829199", "stderr": "time=\"2022-04-03T19:31:06Z\" level=info msg=\"Acquiring lock file /var/lib/rancher/k3s/data/.lock\"\nThe connection to the server localhost:8080 was refused - did you specify the right host or port?", "stderr_lines": [ "time=\"2022-04-03T19:31:06Z\" level=info msg=\"Acquiring lock file /var/lib/rancher/k3s/data/.lock\"", "The connection to the server localhost:8080 was refused - did you specify the right host or port?" ], "stdout": "", "stdout_lines": [] }
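
The retry message points at k3s-init.service, so on an affected master the quickest way to see why it keeps failing is something like:

sudo systemctl status k3s-init.service
sudo journalctl -u k3s-init.service --no-pager | tail -n 50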

@FraGThieF
Author

Expand your hard disks... on all nodes.

You should probably make a note of this, Tim. You do say it in the video! Thanks for your work!

How big should the hard disk be? At the moment it is 86% free.

Do all machines have the same time zones and the same SSH keys, and are they able to communicate with each other? Are you using passwordless sudo? If not, you might have to pass in additional flags like --user <user> --ask-become-pass. I would also remove any additional args from your nodes; these can be problematic on slow machines.

Yes, yes, yes, yes.

I have tried it with only the first two necessary args and ran into the same issue again.

@timothystewart6
Contributor

Can you please paste your all.yaml along with the OS you are running these on? It's hard to tell what you are using vs. what you copied and pasted from the docs.

@svartis
Contributor

svartis commented Apr 3, 2022

OK, so I was able to solve my issue.
To be able to post my full all.yaml here, I changed my k3s token from "K10908c1e28096f6665800342ac1bd8962df701dca5519d73b878a89d1921b432b4" to "mynotsosecrettoken", and this fixed my issue.
So now I can run 2 masters and 2 workers on Ubuntu on Raspberry Pi.

I have also done a reset and verified that the old token was causing the issue.
By doing another reset and changing to the new token, I was able to successfully run 2 masters and 2 workers again.

After some digging in the logs I found this error line, so I was a bit unlucky with the token I had set:

Apr 04 03:04:03 master01 k3s[5136]: time="2022-04-04T03:04:03+02:00" level=fatal msg="failed to normalize token; must be in format K10<CA-HASH>::<USERNAME>:<PASSWORD> or <PASSWORD>"

[screenshot]

k3s_version: v1.23.4+k3s1
# this is the user that has ssh access to these machines
ansible_user: pi
systemd_dir: /etc/systemd/system

# interface which will be used for flannel
flannel_iface: "eth0"

# apiserver_endpoint is virtual ip-address which will be configured on each master
apiserver_endpoint: "192.168.90.30"

# k3s_token is required  masters can talk together securely
# this token should be alpha numeric only
k3s_token: "mynotsosecrettoken"

# change these to your liking, the only required one is--no-deploy servicelb
extra_server_args: "--no-deploy servicelb --no-deploy traefik --write-kubeconfig-mode 644"
extra_agent_args: ""

# image tag for kube-vip
kube_vip_tag_version: "v0.4.3"

# image tag for metal lb
metal_lb_speaker_tag_version: "v0.12.1"
metal_lb_controller_tag_version: "v0.12.1"

# metallb ip range for load balancer
metal_lb_ip_range: "192.168.90.80-192.168.90.81"
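
If anyone wants a quick way to generate a strictly alphanumeric token (assuming openssl is available on the machine where you edit all.yml), something like this does the job:

openssl rand -hex 32

and the 64-character hex output can be pasted straight into k3s_token.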

@yankeeinlondon

yankeeinlondon commented Apr 3, 2022

I had some odd disk storage issues on one of my physical devices but coming out of that I'm still experiencing the problem. I do have some initial success with the first master server:

[screenshot]

and then I get this message which I don't quite know how to read:

[screenshot]

All three masters are now engaging and there is a "change", but the fact that it's writing text out to stderr seems like a potentially bad sign. This is then followed by the slow beat of an unhappy service:

[screenshots]

Diagnostics at this point are:

  • Load balancer VIP times out on ping (192.168.100.200)
  • Attempts to ping individual masters work without issue (a quick VIP check is sketched just after this list)
    [screenshot]
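
A simple way to see whether kube-vip has actually bound the VIP anywhere (using the eth0 interface and the 192.168.100.200 endpoint from my config below) is to run this on each master:

ip addr show eth0 | grep 192.168.100.200

If it shows up on none of the masters, the VIP was never claimed.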

Final output is:

[screenshot]

My all.yml file is:

---
k3s_version: v1.23.5-rc5+k3s1
# this is the user that has ssh access to these machines
ansible_user: ken
become: true
systemd_dir: /etc/systemd/system

# interface which will be used for flannel
flannel_iface: "eth0"

# apiserver_endpoint is virtual ip-address which will be configured on each master
apiserver_endpoint: "192.168.100.200"

# k3s_token is required  masters can talk together securely
# this token should be alpha numeric only
k3s_token: "xxxxxxxxxxxxxxxx"

# change these to your liking, the only required one is--no-deploy servicelb
extra_server_args: "--no-deploy servicelb --no-deploy traefik --write-kubeconfig-mode 644 --kube-api-server-arg default-not-ready-toleration-seconds=30 --kube-apiserver-arg default-unreachable-toleration-seconds=30 --kube-controller-arg node-monitor-period=20s --kube-controller-arg node-monitor-grace-period=20s --kubelet-arg node-status-update-frequency=5s"
extra_agent_args: "--kubelet-arg node-status-update-frequency=5s"

# image tag for kube-vip
kube_vip_tag_version: "v0.4.3"

# image tag for metal lb
metal_lb_speaker_tag_version: "v0.12.1"
metal_lb_controller_tag_version: "v0.12.1"

# metallb ip range for load balancer
metal_lb_ip_range: "192.168.100.80-192.168.100.89"

@yankeeinlondon

Oh, I also went through all nodes -- master and worker -- and ran k3s check-config, and all came back with a "pass" status.

[screenshot]

@yankeeinlondon

Not surprisingly, trying to check on the nodes in the cluster failed, as the service is not running:

[screenshot]

@yankeeinlondon

I can, however, manually start the masters with sudo k3s server & and they do stay up, as this 404 error shows:
[screenshot]

In the long trail of logs that comes from starting these servers, the only errors I'm seeing are the following:

[screenshot]

Could this be indicative of a missing network dependency needed to support websockets?

@yankeeinlondon

Finally, note that in this current state I can reach the active node directly at 192.168.100.204, but the load-balancing VIP seems to be up too, as I get the same results hitting it on 192.168.100.200.

@yankeeinlondon

Now I find this very odd ... even though I set the configuration as you did in your video ... I am not able to run the kubectl commands without sudo:

[screenshot]

Even more concerning, each of the masters is aware of only itself rather than the cluster at large.

@svartis
Contributor

svartis commented Apr 4, 2022

Regarding my issue running on Raspberry Pi OS Lite 64-bit:
I had to manually add " cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory" to the cmdline.txt file, so it looks like Ansible is not adding it as expected.
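
For anyone doing this by hand: on Raspberry Pi OS (Bullseye) the file is typically /boot/cmdline.txt and everything must stay on one single line, so after backing the file up you can append the flags with something like:

sudo cp /boot/cmdline.txt /boot/cmdline.txt.bak
sudo sed -i '$ s/$/ cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory/' /boot/cmdline.txt
sudo reboot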

@timothystewart6
Contributor

@yankeeinlondon please try without all of the args, and be sure you have enough disk space on these nodes.

@timothystewart6
Contributor

Also, this is turning into more of a discussion than a bug report :)

@techno-tim techno-tim locked and limited conversation to collaborators Apr 4, 2022
@timothystewart6 timothystewart6 converted this issue into discussion #11 Apr 4, 2022
