Minikube Unrecoverable Failure Accrual - potential State Issue? #1136

Open
jbkc85 opened this issue Mar 10, 2017 · 4 comments



jbkc85 commented Mar 10, 2017

While discussing Linkerd in Slack with @olix0r, I ran into what he identified as failure accrual; however, even after multiple restarts of the service, Linkerd never recovered the failed service.

This also occurs when a service/deployment is scheduled, then deleted and rescheduled on a 'broken' port while the container itself still exposes a working port. Linkerd continues to route to the working port even though the Kubernetes service now points elsewhere.

Maybe related to #1114? Seems similar in nature...
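
For reference, more recent linkerd releases expose failure accrual as a per-client router setting; below is a minimal sketch of tuning it for faster recovery (the kind and values are illustrative assumptions, not settings from this deployment):

routers:
- protocol: http
  client:
    # hypothetical tuning: mark a host dead after 5 consecutive
    # failures, then re-probe after a jittered 1-10s backoff
    failureAccrual:
      kind: io.l5d.consecutiveFailures
      failures: 5
      backoff:
        kind: jittered
        minMs: 1000
        maxMs: 10000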


Environment

$ minikube version
minikube version: v0.16.0
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.0", GitCommit:"a16c0a7f71a6f93c7e0f222d961f4675cd97a46b", GitTreeState:"clean", BuildDate:"2016-09-26T18:16:57Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2", GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"1970-01-01T00:00:00Z", GoVersion:"go1.7.1", Compiler:"gc", Platform:"linux/amd64"}

Kubernetes YAML files

ConfigMap for Linkerd Configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: test-linkerd-config
data:
  config.yaml: |-
    admin:
      port: 9990

    namers:
    - kind: io.l5d.k8s
      experimental: true
      host: 127.0.0.1
      port: 8001

    routers:
    - protocol: http
      servers:
      - port: 8080
        ip: 0.0.0.0
      dtab: |
        /iface      => /#/io.l5d.k8s/default;
        /svc        => /iface/http;
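
For context, with this dtab a request carrying Host: goapp-svc resolves through the io.l5d.k8s namer (path format /#/io.l5d.k8s/&lt;namespace&gt;/&lt;port-name&gt;/&lt;service&gt;) roughly as:

/svc/goapp-svc  =>  /iface/http/goapp-svc  =>  /#/io.l5d.k8s/default/http/goapp-svc

i.e. linkerd watches the endpoints of the port named "http" on service goapp-svc in the default namespace.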

Linkerd Deployment:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: linkerd-proxy-controller
  labels:
    k8s-app: linkerd-proxy-lb
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: linkerd-proxy-lb
  template:
    metadata:
      labels:
        k8s-app: linkerd-proxy-lb
        name: linkerd-proxy-lb
        proxy: v0.9.0
    spec:
      terminationGracePeriodSeconds: 60
      dnsPolicy: ClusterFirst
      volumes:
      - name: linkerd-config
        configMap:
          name: "test-linkerd-config"
      containers:
      - name: linkerd
        image: buoyantio/linkerd:latest
        args:
        - "/io.buoyant/linkerd/config/config.yaml"
        - "-log.level=DEBUG"
        ports:
        - name: ext
          containerPort: 8080
          hostPort: 9980
        - name: admin
          containerPort: 9990
          hostPort: 9990
        volumeMounts:
        - name: "linkerd-config"
          mountPath: "/io.buoyant/linkerd/config"
          readOnly: true

      - name: kubectl
        image: buoyantio/kubectl:1.2.3
        args:
        - "proxy"
        - "-p"
        - "8001"

Working service/deployment:

apiVersion: v1
kind: Service
metadata:
  name: goapp-svc
  labels:
    app: goapp
spec:
  selector:
    app: goapp
  ports:
    - name: http
      port: 80
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: goapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: goapp
  template:
    metadata:
      labels:
        name: goapp
        app: goapp
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: goapp
        image: kelseyhightower/app-healthz:1.0.0
        ports:
        - name: http
          containerPort: 80

Broken service/deployment:

apiVersion: v1
kind: Service
metadata:
  name: goapp-svc
  labels:
    app: goapp
spec:
  selector:
    app: goapp
  ports:
    - name: http
      port: 81
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: goapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: goapp
  template:
    metadata:
      labels:
        name: goapp
        app: goapp
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: goapp
        image: kelseyhightower/app-healthz:1.0.0
        ports:
        - name: http
          containerPort: 81
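
The only change from the working manifests is the port: both the Service port and the containerPort move from 80 to 81, while the app inside the container presumably still listens on 80, making the new port the 'broken' one.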

Reproducing Issue

The steps below reproduce the issue by deploying a working service, testing it, then deleting it and deploying a non-working one.

$ kubectl create -f test-configmap.yaml
configmap "test-linkerd-config" created
$ kubectl create -f test-deployment.yaml
deployment "linkerd-proxy-controller" created
$ kubectl get svc
NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   10.0.0.1     <none>        443/TCP   14h
$ kubectl get pods
NAME                                        READY     STATUS    RESTARTS   AGE
linkerd-proxy-controller-1173476374-r44fh   2/2       Running   0          9s
$ curl 192.168.99.100:9990/admin/ping
pong
$ kubectl create -f test-works.yaml
service "goapp-svc" created
deployment "goapp" created
$ kubectl get svc
NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
goapp-svc    10.0.0.188   <none>        80/TCP    11s
kubernetes   10.0.0.1     <none>        443/TCP   15h
$ curl -si -H "Host: goapp-svc" 192.168.99.100:9980 | head -n1
HTTP/1.1 200 OK
$ kubectl delete -f test-works.yaml
service "goapp-svc" deleted
deployment "goapp" deleted
$ kubectl create -f test-fails.yaml
service "goapp-svc" created
deployment "goapp" created
$ kubectl get svc
NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
goapp-svc    10.0.0.123   <none>        81/TCP    6s
kubernetes   10.0.0.1     <none>        443/TCP   15h
$ curl -si -H "Host: goapp-svc" 192.168.99.100:9980 | head -n 1
HTTP/1.1 200 OK
$ kubectl delete -f test-deployment.yaml
deployment "linkerd-proxy-controller" deleted
$ kubectl create -f test-deployment.yaml
deployment "linkerd-proxy-controller" created
$ curl -si -H "Host: goapp-svc" 192.168.99.100:9980 | head -n 1
HTTP/1.1 502 Bad Gateway
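
For anyone debugging this, a quick way to compare the API server's view with linkerd's (a sketch; it assumes the default namespace and the minikube IP used above):

# what Kubernetes currently reports for the service and its endpoints
kubectl get svc goapp-svc -o yaml
kubectl get endpoints goapp-svc -o yaml
# what linkerd's admin interface has accrued for this client
curl -s 192.168.99.100:9990/admin/metrics.json?pretty=1 | grep -i goapp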

Logs from Linkerd:

2017-03-10T17:52:03.895929402Z D 0310 17:52:03.894 UTC THREAD19: k8s ns default initial state: goapp-svc, kubernetes
2017-03-10T17:52:03.896535245Z D 0310 17:52:03.895 UTC THREAD19: k8s ns default service goapp-svc found
2017-03-10T17:52:03.897200021Z D 0310 17:52:03.895 UTC THREAD19: k8s ns default service goapp-svc port http found + /
2017-03-10T17:56:59.497863558Z D 0310 17:56:59.488 UTC THREAD19: k8s ns default deleted: goapp-svc
2017-03-10T17:56:59.510000788Z D 0310 17:56:59.489 UTC THREAD19: k8s ns default initial state: kubernetes
2017-03-10T17:56:59.524061588Z D 0310 17:56:59.502 UTC THREAD19: k8s ns default service goapp-svc missing
2017-03-10T17:57:25.472192860Z D 0310 17:57:25.471 UTC THREAD19: k8s ns default added: goapp-svc
2017-03-10T17:57:25.474783741Z D 0310 17:57:25.472 UTC THREAD19: k8s ns default initial state: kubernetes, goapp-svc
2017-03-10T17:57:25.475877991Z D 0310 17:57:25.474 UTC THREAD19: k8s ns default service goapp-svc found
2017-03-10T17:57:25.476743333Z D 0310 17:57:25.476 UTC THREAD19: k8s ns default service goapp-svc port http missing
2017-03-10T17:57:25.571866348Z D 0310 17:57:25.569 UTC THREAD19: k8s ns default modified: goapp-svc
2017-03-10T17:57:26.265959668Z D 0310 17:57:26.265 UTC THREAD19: k8s ns default modified: goapp-svc
2017-03-10T17:57:26.267886963Z D 0310 17:57:26.267 UTC THREAD19: k8s ns default service goapp-svc port http found + /

jbkc85 commented Mar 10, 2017

Updated the ConfigMap above; I forgot I had renamed it to 'test-linkerd-config' to avoid interrupting my other work on Linkerd.


siggy commented Apr 13, 2017

I am seeing a similar issue in minikube. Not sure if it's related, but the steps to reproduce are simpler.

Full output of the k8s endpoints API and the linkerd debug log are at:
https://gist.github.com/siggy/19c049a62bc8e9b65cac041c2921346b

Steps to repro

Deploy linkerd and app

kubectl apply -f https://raw.githubusercontent.com/linkerd/linkerd-examples/master/k8s-daemonset/k8s/linkerd.yml
kubectl apply -f https://raw.githubusercontent.com/linkerd/linkerd-examples/master/k8s-daemonset/k8s/hello-world-legacy.yml

Verify routing works

OUTGOING_PORT=$(kubectl get svc l5d -o jsonpath='{.spec.ports[?(@.name=="outgoing")].nodePort}')
L5D_INGRESS_LB=http://$(minikube ip):$OUTGOING_PORT
http_proxy=$L5D_INGRESS_LB curl -s http://world
world (172.17.0.9)!

Redeploy app

kubectl delete -f https://raw.githubusercontent.com/linkerd/linkerd-examples/master/k8s-daemonset/k8s/hello-world-legacy.yml
kubectl apply -f https://raw.githubusercontent.com/linkerd/linkerd-examples/master/k8s-daemonset/k8s/hello-world-legacy.yml

Observe routing failure

$ http_proxy=$L5D_INGRESS_LB curl -s http://world
No hosts are available for /svc/world, Dtab.base=[/srv=>/#/io.l5d.k8s/default/http;/host=>/srv;/svc=>/host;/host/world=>/srv/world-v1], Dtab.local=[]. Remote Info: Not Available

Observe the delegator API returns healthy endpoints

$ ADMIN_PORT=$(kubectl get svc l5d -o jsonpath='{.spec.ports[?(@.name=="admin")].nodePort}')
$ curl -H "Content-Type: application/json" -X POST -d '{"namespace":"incoming","dtab":"/srv=>/#/io.l5d.k8s/default/http;/host=>/srv;/svc=>/host;/host/world=>/srv/world-v1","path":"/svc/world"}' http://$(minikube ip):$ADMIN_PORT/delegator.json

{"type":"delegate","path":"/svc/world","delegate":{"type":"alt","path":"/host/world","dentry":{"prefix":"/svc","dst":"/host"},"alt":[{"type":"delegate","path":"/srv/world-v1","dentry":{"prefix":"/host/world","dst":"/srv/world-v1"},"delegate":{"type":"transformation","path":"/#/io.l5d.k8s/default/http/world-v1","name":"SubnetLocalTransformer","bound":{"addr":{"type":"bound","addrs":[{"ip":"172.17.0.13","port":7778,"meta":{"nodeName":"minikube"}},{"ip":"172.17.0.11","port":7778,"meta":{"nodeName":"minikube"}},{"ip":"172.17.0.12","port":7778,"meta":{"nodeName":"minikube"}}],"meta":{}},"id":"/#/io.l5d.k8s/default/http/world-v1","path":"/"},"tree":{"type":"leaf","path":"/%/io.l5d.k8s.localnode/172.17.0.3/#/io.l5d.k8s/default/http/world-v1","dentry":{"prefix":"/srv","dst":"/#/io.l5d.k8s/default/http"},"bound":{"addr":{"type":"bound","addrs":[{"ip":"172.17.0.13","port":7778,"meta":{"nodeName":"minikube"}},{"ip":"172.17.0.11","port":7778,"meta":{"nodeName":"minikube"}},{"ip":"172.17.0.12","port":7778,"meta":{"nodeName":"minikube"}}],"meta":{}},"id":"/%/io.l5d.k8s.localnode/172.17.0.3/#/io.l5d.k8s/default/http/world-v1","path":"/"}}}},{"type":"delegate","path":"/srv/world","dentry":{"prefix":"/host","dst":"/srv"},"delegate":{"type":"neg","path":"/#/io.l5d.k8s/default/http/world","dentry":{"prefix":"/srv","dst":"/#/io.l5d.k8s/default/http"}}}]}}

Observe a successful curl to the world service from inside the l5d container

$ kubectl exec -it l5d-r6fv5 -c l5d curl 172.17.0.11:7778
world (172.17.0.11)!
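
A useful cross-check (a sketch; the service name comes from the hello-world manifests above) is to diff the addresses linkerd has bound against what the endpoints API currently reports:

# addresses Kubernetes currently lists for world-v1
kubectl get endpoints world-v1 \
  -o jsonpath='{range .subsets[*].addresses[*]}{.ip}{"\n"}{end}'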


olix0r commented Apr 13, 2017

Can you share linkerd's metrics? (:9990/admin/metrics.json?pretty=1)


siggy commented Apr 13, 2017

Updated the gist with metrics.json: https://gist.github.com/siggy/19c049a62bc8e9b65cac041c2921346b
