Minikube Unrecoverable Failure Accrual - potential State Issue? #1136
Comments
Updated the ConfigMap; I forgot I had renamed it to 'test-linkerd-config' to avoid interrupting my other work on Linkerd.
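One way to confirm the DaemonSet is actually mounting the renamed ConfigMap (a quick sketch, assuming the DaemonSet is named l5d as in the linkerd-examples manifests):

# List the ConfigMaps referenced by the l5d DaemonSet's volumes
kubectl get ds l5d -o jsonpath='{.spec.template.spec.volumes[*].configMap.name}'

# Inspect the renamed ConfigMap itself
kubectl get configmap test-linkerd-config -o yaml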
I am seeing a similar issue in minikube. Not sure if it's related, but the steps to reproduce are simpler. Full output of the k8s endpoints API and the linkerd debug log are at: https://gist.github.com/siggy/19c049a62bc8e9b65cac041c2921346b

Steps to repro:

Deploy linkerd and app:

kubectl apply -f https://raw.githubusercontent.com/linkerd/linkerd-examples/master/k8s-daemonset/k8s/linkerd.yml
kubectl apply -f https://raw.githubusercontent.com/linkerd/linkerd-examples/master/k8s-daemonset/k8s/hello-world-legacy.yml

Verify routing works:

OUTGOING_PORT=$(kubectl get svc l5d -o jsonpath='{.spec.ports[?(@.name=="outgoing")].nodePort}')
L5D_INGRESS_LB=http://$(minikube ip):$OUTGOING_PORT
http_proxy=$L5D_INGRESS_LB curl -s http://world

world (172.17.0.9)!

Redeploy app:

kubectl delete -f https://raw.githubusercontent.com/linkerd/linkerd-examples/master/k8s-daemonset/k8s/hello-world-legacy.yml
kubectl apply -f https://raw.githubusercontent.com/linkerd/linkerd-examples/master/k8s-daemonset/k8s/hello-world-legacy.yml

Observe routing failure:

$ http_proxy=$L5D_INGRESS_LB curl -s http://world
No hosts are available for /svc/world, Dtab.base=[/srv=>/#/io.l5d.k8s/default/http;/host=>/srv;/svc=>/host;/host/world=>/srv/world-v1], Dtab.local=[]. Remote Info: Not Available

Observe delegator API returns healthy endpoints:

$ ADMIN_PORT=$(kubectl get svc l5d -o jsonpath='{.spec.ports[?(@.name=="admin")].nodePort}')
$ curl -H "Content-Type: application/json" -X POST -d '{"namespace":"incoming","dtab":"/srv=>/#/io.l5d.k8s/default/http;/host=>/srv;/svc=>/host;/host/world=>/srv/world-v1","path":"/svc/world"}' http://$(minikube ip):$ADMIN_PORT/delegator.json
{"type":"delegate","path":"/svc/world","delegate":{"type":"alt","path":"/host/world","dentry":{"prefix":"/svc","dst":"/host"},"alt":[{"type":"delegate","path":"/srv/world-v1","dentry":{"prefix":"/host/world","dst":"/srv/world-v1"},"delegate":{"type":"transformation","path":"/#/io.l5d.k8s/default/http/world-v1","name":"SubnetLocalTransformer","bound":{"addr":{"type":"bound","addrs":[{"ip":"172.17.0.13","port":7778,"meta":{"nodeName":"minikube"}},{"ip":"172.17.0.11","port":7778,"meta":{"nodeName":"minikube"}},{"ip":"172.17.0.12","port":7778,"meta":{"nodeName":"minikube"}}],"meta":{}},"id":"/#/io.l5d.k8s/default/http/world-v1","path":"/"},"tree":{"type":"leaf","path":"/%/io.l5d.k8s.localnode/172.17.0.3/#/io.l5d.k8s/default/http/world-v1","dentry":{"prefix":"/srv","dst":"/#/io.l5d.k8s/default/http"},"bound":{"addr":{"type":"bound","addrs":[{"ip":"172.17.0.13","port":7778,"meta":{"nodeName":"minikube"}},{"ip":"172.17.0.11","port":7778,"meta":{"nodeName":"minikube"}},{"ip":"172.17.0.12","port":7778,"meta":{"nodeName":"minikube"}}],"meta":{}},"id":"/%/io.l5d.k8s.localnode/172.17.0.3/#/io.l5d.k8s/default/http/world-v1","path":"/"}}}},{"type":"delegate","path":"/srv/world","dentry":{"prefix":"/host","dst":"/srv"},"delegate":{"type":"neg","path":"/#/io.l5d.k8s/default/http/world","dentry":{"prefix":"/srv","dst":"/#/io.l5d.k8s/default/http"}}}]}}

Observe successful curl to world service from inside the l5d container:

$ kubectl exec -it l5d-r6fv5 -c l5d curl 172.17.0.11:7778
world (172.17.0.11)!
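To compare what Kubernetes reports against what linkerd resolves after the redeploy, one rough approach (assuming the default namespace and the world-v1 service from the example app) is to dump the Endpoints object alongside the delegator output above:

# Pod IPs Kubernetes currently lists for world-v1
kubectl get endpoints world-v1 -o jsonpath='{.subsets[*].addresses[*].ip}'

# The same data via the raw endpoints API, as captured in the gist
kubectl get --raw /api/v1/namespaces/default/endpoints/world-v1

If the pod IPs here differ from the addrs in the delegator response, that would suggest linkerd is working from a stale view of the endpoints.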
Can you share linkerd's metrics?
Updated the gist with metrics.json: https://gist.github.com/siggy/19c049a62bc8e9b65cac041c2921346b
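For anyone collecting the same data, a rough way to pull metrics.json in this setup (assuming the l5d Service exposes an admin NodePort as in the repro above, and that the admin server serves metrics at /admin/metrics.json):

ADMIN_PORT=$(kubectl get svc l5d -o jsonpath='{.spec.ports[?(@.name=="admin")].nodePort}')
curl -s http://$(minikube ip):$ADMIN_PORT/admin/metrics.json > metrics.json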
While discussing Linkerd in Slack with @olix0r, I ran into what he identified as failure accrual kicking in; however, even after multiple restarts of the service, Linkerd never recovered the affected service from the failed state.
This also occurs when a service/deployment is scheduled, then later deleted and rescheduled on a 'broken' port while the container itself still exposes a working port. Linkerd continues to route to the working port, even though the Kubernetes Service now explicitly points elsewhere.
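A rough way to check for that mismatch (a sketch; substitute the affected Service's name for world-v1 and adjust the namespace as needed) is to compare the port the Service targets with the ports Kubernetes has resolved into its Endpoints:

# Port the Service claims to target on the pods
kubectl get svc world-v1 -o jsonpath='{.spec.ports[*].targetPort}'

# Ports actually resolved in the Endpoints object
kubectl get endpoints world-v1 -o jsonpath='{.subsets[*].ports[*].port}'

If linkerd keeps sending traffic to a port that no longer appears in the Endpoints output, that matches the behavior described above.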
Maybe related to #1114? Seems similar in nature...
Environment
Kubernetes YAML files
ConfigMap for Linkerd Configuration:
Linkerd Deployment:
Working service/deployment:
Broken service/deployment:
Reproducing the Issue
The directions below walk through reproducing the issue: deploy a working service, test it, then delete it and deploy a non-working service in its place.
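A minimal sketch of that sequence, using hypothetical manifest names working-service.yml and broken-service.yml in place of the files listed above, and reusing the $L5D_INGRESS_LB proxy variable and a placeholder hello service name from the minikube repro earlier in this thread:

# Deploy the known-good service/deployment and test it through linkerd
kubectl apply -f working-service.yml
http_proxy=$L5D_INGRESS_LB curl -s http://hello

# Swap in the variant whose Service points at the broken port, then test again
kubectl delete -f working-service.yml
kubectl apply -f broken-service.yml
http_proxy=$L5D_INGRESS_LB curl -s http://hello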
Logs from Linkerd: