Trying to connect to cluster with kubectl, does not complete #188

Open
nkhine opened this issue Jun 7, 2017 · 3 comments

nkhine (Contributor) commented Jun 7, 2017

I have installed the latest tack and have one issue:

➜ ~ kubectl get po
Unable to connect to the server: dial tcp: lookup kz8s-apiserver-test-xxx.eu-west-2.elb.amazonaws.com on 127.0.1.1:53: no such host
➜ ~

so the last step does not complete:

❤ Trying to connect to cluster with kubectl...............................................
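
Since the kubectl error above is a DNS lookup failure against the local stub resolver on 127.0.1.1:53, a first sanity check is sketched below using standard tools (the hostname is the redacted one from the error, and the last command assumes a classic ELB fronting the apiserver):

# does the apiserver ELB hostname resolve locally?
nslookup kz8s-apiserver-test-xxx.eu-west-2.elb.amazonaws.com

# bypass the local stub resolver on 127.0.1.1 to rule it out
dig +short kz8s-apiserver-test-xxx.eu-west-2.elb.amazonaws.com @8.8.8.8

# an ELB name only resolves once the ELB exists and is registered in DNS,
# so it can also help to list the ELBs AWS actually knows about in that region
aws elb describe-load-balancers --region eu-west-2 --query 'LoadBalancerDescriptions[].DNSName'

If the name resolves via 8.8.8.8 but not locally, the problem is the local resolver rather than the cluster.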

I am able to log in to the pki, bastion and etcd machines. On the pki machine, I see this error:

core@ip-10-0-10-9 ~ $ journalctl -fl 
-- Logs begin at Wed 2017-06-07 08:45:51 UTC. --
Jun 07 09:01:40 ip-10-0-10-9.eu-west-2.compute.internal systemd[1]: Started Session 2 of user core.
Jun 07 09:01:40 ip-10-0-10-9.eu-west-2.compute.internal systemd-logind[846]: New session 2 of user core.
Jun 07 09:01:40 ip-10-0-10-9.eu-west-2.compute.internal systemd[1412]: Reached target Paths.
Jun 07 09:01:40 ip-10-0-10-9.eu-west-2.compute.internal systemd[1412]: Reached target Sockets.
Jun 07 09:01:40 ip-10-0-10-9.eu-west-2.compute.internal systemd[1412]: Reached target Timers.
Jun 07 09:01:40 ip-10-0-10-9.eu-west-2.compute.internal systemd[1412]: Reached target Basic System.
Jun 07 09:01:40 ip-10-0-10-9.eu-west-2.compute.internal systemd[1412]: Reached target Default.
Jun 07 09:01:40 ip-10-0-10-9.eu-west-2.compute.internal systemd[1412]: Startup finished in 15ms.
Jun 07 09:01:40 ip-10-0-10-9.eu-west-2.compute.internal systemd[1]: Started User Manager for UID 500.
Jun 07 09:02:19 ip-10-0-10-9.eu-west-2.compute.internal locksmithd[1295]: Unlocking old locks failed: error setting up lock: Error initializing etcd client: client: etcd cluster is unavailable or misconfigured. Retrying in 5m0s.
core@ip-10-0-10-9 ~ $ systemctl status
● ip-10-0-10-9.eu-west-2.compute.internal
    State: running
     Jobs: 0 queued
   Failed: 0 units
    Since: Wed 2017-06-07 08:45:57 UTC; 25min ago
   CGroup: /
           ├─user.slice
           │ └─user-500.slice
           │   ├─session-2.scope
           │   │ ├─1410 sshd: core [priv]
           │   │ ├─1418 sshd: core@pts/0
           │   │ ├─1419 -bash
           │   │ ├─1430 systemctl status
           │   │ └─1431 /usr/bin/less
           │   └─[email protected]
           │     └─init.scope
           │       ├─1412 /usr/lib/systemd/systemd --user
           │       └─1413 (sd-pam)
           ├─init.scope
           │ └─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 15
           └─system.slice
             ├─systemd-machined.service
             │ └─1123 /usr/lib/systemd/systemd-machined
             ├─systemd-timesyncd.service
             │ └─734 /usr/lib/systemd/systemd-timesyncd
             ├─cfssl.service
             │ └─1279 /opt/bin/cfssl serve -address 0.0.0.0 -ca /etc/cfssl/ca.pem -ca-key /etc/cfssl/ca-key.pem -config /etc/cfssl/ca-config.json
             ├─dbus.service
             │ └─797 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
             ├─update-engine.service
             │ └─845 /usr/sbin/update_engine -foreground -logtostderr
             ├─system-serial\x2dgetty.slice
             │ └─[email protected]
             │   └─921 /sbin/agetty --keep-baud 115200 38400 9600 ttyS0 vt220
             ├─system-getty.slice
             │ └─[email protected]
             │   └─922 /sbin/agetty --noclear tty1 linux
             ├─systemd-logind.service
             │ └─846 /usr/lib/systemd/systemd-logind
             ├─locksmithd.service
             │ └─1295 /usr/lib/locksmith/locksmithd
             ├─systemd-resolved.service
             │ └─852 /usr/lib/systemd/systemd-resolved
             ├─polkit.service
             │ └─868 /usr/lib/polkit-1/polkitd --no-debug
             ├─systemd-udevd.service
             │ └─566 /usr/lib/systemd/systemd-udevd
             ├─systemd-journald.service
             │ └─532 /usr/lib/systemd/systemd-journald
             └─systemd-networkd.service
               └─779 /usr/lib/systemd/systemd-networkd
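
For what it's worth, the locksmithd line above is just locksmithd failing to reach etcd and retrying; it is separate from certificate signing. A minimal way to confirm that cfssl itself is answering is sketched below, assuming cfssl serve on its default port 8888 and the standard cfssl info endpoint (10.0.10.9 is the pki node address from the prompt above):

# ask cfssl for its CA info; a JSON response containing the CA certificate means it is serving
curl -s -d '{"label": ""}' -H 'Content-Type: application/json' http://10.0.10.9:8888/api/v1/cfssl/info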

On the etcd machine, I see this:

Container Linux by CoreOS stable (1353.8.0)
core@ip-10-0-10-10 ~ $  journalctl -fl 
-- Logs begin at Wed 2017-06-07 08:46:10 UTC. --
Jun 07 09:12:29 ip-10-0-10-10.eu-west-2.compute.internal systemd[3696]: Startup finished in 35ms.
Jun 07 09:12:29 ip-10-0-10-10.eu-west-2.compute.internal systemd[1]: Started User Manager for UID 500.
Jun 07 09:14:59 ip-10-0-10-10.eu-west-2.compute.internal kubelet-wrapper[1777]: E0607 09:14:59.073010    1777 container_manager_linux.go:638] error opening pid file /run/docker/libcontainerd/docker-containerd.pid: open /run/docker/libcontainerd/docker-containerd.pid: no such file or directory
Jun 07 09:15:07 ip-10-0-10-10.eu-west-2.compute.internal etcd-wrapper[1300]: 2017-06-07 09:15:07.478614 I | mvcc: store.index: compact 2323
Jun 07 09:15:07 ip-10-0-10-10.eu-west-2.compute.internal etcd-wrapper[1300]: 2017-06-07 09:15:07.480529 I | mvcc: finished scheduled compaction at 2323 (took 1.271578ms)
Jun 07 09:16:19 ip-10-0-10-10.eu-west-2.compute.internal systemd-timesyncd[726]: Network configuration changed, trying to establish connection.
Jun 07 09:16:20 ip-10-0-10-10.eu-west-2.compute.internal systemd-timesyncd[726]: Synchronized to time server 66.232.97.8:123 (2.coreos.pool.ntp.org).
Jun 07 09:19:59 ip-10-0-10-10.eu-west-2.compute.internal kubelet-wrapper[1777]: E0607 09:19:59.089258    1777 container_manager_linux.go:638] error opening pid file /run/docker/libcontainerd/docker-containerd.pid: open /run/docker/libcontainerd/docker-containerd.pid: no such file or directory
Jun 07 09:20:07 ip-10-0-10-10.eu-west-2.compute.internal etcd-wrapper[1300]: 2017-06-07 09:20:07.495803 I | mvcc: store.index: compact 2804
Jun 07 09:20:07 ip-10-0-10-10.eu-west-2.compute.internal etcd-wrapper[1300]: 2017-06-07 09:20:07.496958 I | mvcc: finished scheduled compaction at 2804 (took 874.22µs)
Jun 07 09:24:59 ip-10-0-10-10.eu-west-2.compute.internal kubelet-wrapper[1777]: E0607 09:24:59.104149    1777 container_manager_linux.go:638] error opening pid file /run/docker/libcontainerd/docker-containerd.pid: open /run/docker/libcontainerd/docker-containerd.pid: no such file or directory
Jun 07 09:25:07 ip-10-0-10-10.eu-west-2.compute.internal etcd-wrapper[1300]: 2017-06-07 09:25:07.503462 I | mvcc: store.index: compact 3283
Jun 07 09:25:07 ip-10-0-10-10.eu-west-2.compute.internal etcd-wrapper[1300]: 2017-06-07 09:25:07.504568 I | mvcc: finished scheduled compaction at 3283 (took 869.887µs)
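
The regular compactions above suggest etcd itself is running. A way to confirm from the etcd node is sketched below, assuming the etcd3 API with TLS on the standard client port 2379; the certificate paths are placeholders to adjust to wherever tack writes the etcd client certs, and if the client port is plaintext, drop the TLS flags and use http:// instead:

# report health of the local etcd member (cert paths below are placeholders)
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/ssl/certs/ca.pem \
  --cert=/etc/ssl/certs/etcd-client.pem \
  --key=/etc/ssl/certs/etcd-client-key.pem \
  endpoint health

The docker-containerd.pid errors from kubelet-wrapper are a separate symptom (the kubelet cannot find containerd's pid file) and worth checking on the docker side, but they are not an etcd issue.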

Any advice is much appreciated.

tomfotherby commented:

I'm a beginner. I had similar issues a few times when I started. Sometimes just waiting helped, e.g. waiting 15 minutes for the cluster to "settle". Other times I had to run make clean and then make all again. I would say 1 in 3 attempts succeeded. Once it does succeed it is stable (at least for the 2 weeks I've been trialing Kubernetes).

nkhine (Contributor, Author) commented Jun 7, 2017

I have tried several times and it is the same outcome.

The only difference in my setup is that I am using eu-west-2 and had to fix the get-ca script in relation to issue #170. I will send the patch, but I wanted to confirm it was all working first, which is why I opened this issue.

diff --git a/scripts/get-ca b/scripts/get-ca
index 15682a0..0e82329 100755
--- a/scripts/get-ca
+++ b/scripts/get-ca
@@ -2,10 +2,12 @@
 
 source ${0%/*}/retry
 
+echo $AWS_REGION
 echo $DIR_SSL
 PKI_S3_BUCKET=`terraform output s3-bucket`
 CA_PATH="s3:https://$PKI_S3_BUCKET/ca.pem"
 
 mkdir -p $DIR_SSL
 
-_retry "❤ Grabbing $CA_PATH" aws s3 cp $CA_PATH $DIR_SSL
+_retry "❤ Grabbing $CA_PATH" aws s3 cp $CA_PATH $DIR_SSL/ca.pem  --recursive --region $AWS_REGION
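One note on the patch above: aws s3 cp copies a single object fine without --recursive, and with --recursive the destination is treated as a directory prefix rather than a file path, so the region fix can probably stay simpler. A sketch of the same change, keeping the script's existing _retry helper and DIR_SSL variable:

# single-object copy: pinning the region is enough, --recursive is not needed
_retry "❤ Grabbing $CA_PATH" aws s3 cp "$CA_PATH" "$DIR_SSL/ca.pem" --region "$AWS_REGION"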

tomfotherby commented Jun 7, 2017

When I tried eu-west-2 it failed because that region didn't have the m3.large instance types that tack uses.
Maybe try with the default tack region just to see if it works. Then you will have some confidence that it can work in the London region with some "fixes".
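
If instance-type availability is the suspect, it can also be checked directly from the AWS CLI. A sketch (describe-instance-type-offerings is a newer subcommand than this thread, but it answers exactly this question):

# list the availability zones in eu-west-2 that offer m3.large; empty output means none do
aws ec2 describe-instance-type-offerings --region eu-west-2 \
  --location-type availability-zone \
  --filters Name=instance-type,Values=m3.large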

(I'm a real beginner; my words are really not something to rely on.)
