Services and opaque ports #2426

Closed

migue opened this issue May 21, 2021 · 1 comment

Comments

@migue

migue commented May 21, 2021

Issue Type:

  • [x] Bug report
  • Feature request

What happened:

Defining a new service (ElasticSearch) and marking its default port (9200) as opaque makes the service unavailable, returning a 502 Bad Gateway on every request.

What you expected to happen:

I would expect the service to be available. If you remove the opaque-port definition, it works smoothly.
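
For reference, removing the annotation can be done with something like the following (assuming the port was marked opaque via the config.linkerd.io/opaque-ports Service annotation; the namespace is the one from the logs below):

kubectl -n ghae annotate service elasticsearch config.linkerd.io/opaque-ports-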

How to reproduce it (as minimally and precisely as possible):

The first step is to deploy an ElasticSearch cluster (I've deployed a single-node cluster) and define the corresponding service. I just happened to find this problem using ElasticSearch, but I don't think the type of service is relevant to this particular problem.

NAME            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
elasticsearch   ClusterIP   10.0.102.110   <none>        9200/TCP   18m
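
The Service itself is nothing special; a minimal sketch along these lines (the selector and port name are illustrative placeholders; the relevant part is the config.linkerd.io/opaque-ports annotation marking 9200 as opaque):

apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: ghae
  annotations:
    config.linkerd.io/opaque-ports: "9200"
spec:
  type: ClusterIP
  selector:
    app: elasticsearch    # placeholder selector
  ports:
    - name: http          # placeholder port name
      port: 9200
      targetPort: 9200
      protocol: TCP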

Then you can just go to any other pod in your cluster and try to perform a regular curl:

curl https://elasticsearch:9200

You will get a 502 Bad Gateway error. Looking at the logs in the Linkerd sidecar:

[  1472.597759s] DEBUG ThreadId(01) inbound:accept{client.addr=10.244.3.16:46202}:server{port=4143}:direct: linkerd_tls::server: Peeked bytes from TCP stream sz=301
[  1472.598753s] DEBUG ThreadId(01) inbound:accept{client.addr=10.244.3.16:46202}:server{port=4143}:direct: linkerd_tls::server: Accepted TLS connection client.id=Some(ClientId("unicorn.ghae.serviceaccount.identity.linkerd.cluster.local")) alpn=Some("transport.l5d.io/v1")
[  1472.598802s] DEBUG ThreadId(01) inbound:accept{client.addr=10.244.3.16:46202}:server{port=4143}:direct: linkerd_transport_header::server: Read transport header header=TransportHeader { port: 9200, name: None, protocol: Some(Http2) }
[  1472.598856s]  INFO ThreadId(01) inbound:accept{client.addr=10.244.3.16:46202}: linkerd_app_core::serve: Connection closed error=A named target must be provided on gateway connections
[  1476.432186s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{client.addr=10.244.3.1:32826}: linkerd_tls::server: Peeked bytes from TCP stream sz=118
[  1476.432228s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{client.addr=10.244.3.1:32826}: linkerd_detect: DetectResult protocol=Some(Http1) elapsed=6.4µs
[  1476.432238s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{client.addr=10.244.3.1:32826}: linkerd_proxy_http::server: Creating HTTP service version=Http1
[  1476.432251s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{client.addr=10.244.3.1:32826}: linkerd_proxy_http::server: Handling as HTTP version=Http1
[  1476.432371s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{client.addr=10.244.3.1:32826}: linkerd_proxy_http::server: The client is shutting down the connection res=Ok(())
[  1476.432400s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{client.addr=10.244.3.1:32826}: linkerd_app_core::serve: Connection closed
[  1480.556736s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{client.addr=10.244.3.1:32932}: linkerd_tls::server: Peeked bytes from TCP stream sz=117
[  1480.556779s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{client.addr=10.244.3.1:32932}: linkerd_detect: DetectResult protocol=Some(Http1) elapsed=5.601µs
[  1480.556791s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{client.addr=10.244.3.1:32932}: linkerd_proxy_http::server: Creating HTTP service version=Http1
[  1480.556801s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{client.addr=10.244.3.1:32932}: linkerd_proxy_http::server: Handling as HTTP version=Http1
[  1480.556908s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{client.addr=10.244.3.1:32932}: linkerd_proxy_http::server: The client is shutting down the connection res=Ok(())
[  1480.556939s] DEBUG ThreadId(02) daemon:admin{listen.addr=0.0.0.0:4191}:accept{client.addr=10.244.3.1:32932}: linkerd_app_core::serve: Connection closed
[  1482.284320s] DEBUG ThreadId(01) inbound:accept{client.addr=10.244.3.1:56880}:server{port=9200}: linkerd_app_inbound::prevent_loop: addr=10.244.3.18:9200 self.port=4143
[  1482.284420s] DEBUG ThreadId(01) inbound:accept{client.addr=10.244.3.1:56880}:server{port=9200}:forward:tcp: linkerd_proxy_transport::connect: Connecting server.addr=127.0.0.1:9200
[  1482.284546s] DEBUG ThreadId(01) inbound:accept{client.addr=10.244.3.1:56880}:server{port=9200}:forward:tcp: linkerd_proxy_transport::connect: Connected local.addr=127.0.0.1:56646 keepalive=None
[  1482.284570s] DEBUG ThreadId(01) inbound:accept{client.addr=10.244.3.1:56880}:server{port=9200}:forward:tcp: linkerd_proxy_transport::metrics: client connection open
[  1482.285240s] DEBUG ThreadId(01) inbound:accept{client.addr=10.244.3.1:56880}: linkerd_app_core::serve: Connection closed

Looking at the logs in the Linkerd sidecar of the container where the request is being initiated, we can find:

[  1837.792907s] DEBUG ThreadId(01) server{port=9100}:http{v=h2}:http1{name=nginx-unicorn.ghae.svc.cluster.local:9100}:profile:http1: linkerd_proxy_http::client: method=GET uri=https://miguelangelp.ghaekube.net/sessions/_auth_request_endpoint version=HTTP/1.0
[  1837.792928s] DEBUG ThreadId(01) server{port=9100}:http{v=h2}:http1{name=nginx-unicorn.ghae.svc.cluster.local:9100}:profile:http1: linkerd_proxy_http::client: headers={"host": "miguelangelp.ghaekube.net", "x-forwarded-for": "127.0.0.1", "l5d-dst-canonical": "nginx-unicorn.ghae.svc.cluster.local:9100"}
[  1837.792941s] DEBUG ThreadId(01) server{port=9100}:http{v=h2}:http1{name=nginx-unicorn.ghae.svc.cluster.local:9100}:profile:http1: linkerd_proxy_http::h1: Using one-off client use_absolute_form=false is_missing_host=false
[  1837.792990s] DEBUG ThreadId(01) server{port=9100}:http{v=h2}:http1{name=nginx-unicorn.ghae.svc.cluster.local:9100}:profile:http1: linkerd_proxy_transport::connect: Connecting server.addr=127.0.0.1:9100
[  1837.793128s] DEBUG ThreadId(01) server{port=9100}:http{v=h2}:http1{name=nginx-unicorn.ghae.svc.cluster.local:9100}:profile:http1: linkerd_proxy_transport::connect: Connected local.addr=127.0.0.1:46200 keepalive=None
[  1837.793148s] DEBUG ThreadId(01) server{port=9100}:http{v=h2}:http1{name=nginx-unicorn.ghae.svc.cluster.local:9100}:profile:http1: linkerd_proxy_transport::metrics: client connection open
[  1837.830592s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}: linkerd_cache: Caching new service target=Accept { orig_dst: OrigDstAddr(10.0.102.110:9200), protocol: () }
[  1837.830713s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}: linkerd_timeout::failfast: TCP Server service has become unavailable
[  1837.830781s] DEBUG ThreadId(01) server{orig_dst=10.0.89.66:11211}: linkerd_dns: resolve_srv name=linkerd-dst-headless.linkerd.svc.cluster.local.
[  1837.831946s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:poll_profile: linkerd_service_profiles::client: profile received: DestinationProfile { fully_qualified_name: "elasticsearch.ghae.svc.cluster.local", opaque_protocol: false, routes: [], retry_budget: Some(RetryBudget { retry_ratio: 0.2, min_retries_per_second: 10, ttl: Some(Duration { seconds: 10, nanos: 0 }) }), dst_overrides: [WeightedDst { authority: "elasticsearch.ghae.svc.cluster.local.:9200", weight: 10000 }], endpoint: None }
[  1837.831995s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:poll_profile: linkerd_service_profiles::client: profile received: DestinationProfile { fully_qualified_name: "elasticsearch.ghae.svc.cluster.local", opaque_protocol: false, routes: [], retry_budget: Some(RetryBudget { retry_ratio: 0.2, min_retries_per_second: 10, ttl: Some(Duration { seconds: 10, nanos: 0 }) }), dst_overrides: [WeightedDst { authority: "elasticsearch.ghae.svc.cluster.local.:9200", weight: 10000 }], endpoint: None }
[  1837.832010s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:poll_profile: linkerd_service_profiles::client: profile received: DestinationProfile { fully_qualified_name: "elasticsearch.ghae.svc.cluster.local", opaque_protocol: false, routes: [], retry_budget: Some(RetryBudget { retry_ratio: 0.2, min_retries_per_second: 10, ttl: Some(Duration { seconds: 10, nanos: 0 }) }), dst_overrides: [WeightedDst { authority: "elasticsearch.ghae.svc.cluster.local.:9200", weight: 10000 }], endpoint: None }
[  1837.832051s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}: linkerd_detect: DetectResult protocol=Some(Http1) elapsed=9.3µs
[  1837.832063s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}: linkerd_proxy_http::server: Creating HTTP service version=Http1
[  1837.832075s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}: linkerd_cache: Caching new service target=Logical { orig_dst: OrigDstAddr(10.0.102.110:9200), protocol: Http1, profile: Some(..), logical_addr: Some(LogicalAddr(NameAddr { name: "elasticsearch.ghae.svc.cluster.local", port: 9200 })) }
[  1837.832125s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}: linkerd_proxy_http::server: Handling as HTTP version=Http1
[  1837.832201s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}: linkerd_service_profiles::http::route_request: Updating HTTP routes routes=0
[  1837.832233s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}: linkerd_service_profiles::split: Updating targets=[Target { addr: Name(NameAddr { name: "elasticsearch.ghae.svc.cluster.local.", port: 9200 }), weight: 10000 }]
[  1837.832256s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}:concrete{addr=elasticsearch.ghae.svc.cluster.local:9200}: linkerd_proxy_api_resolve::resolve: Resolving dst=elasticsearch.ghae.svc.cluster.local:9200 context={"ns":"ghae", "nodeName":"aks-agentpool-32179278-vmss000004"}
[  1837.832283s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}: linkerd_timeout::failfast: HTTP Logical service has become unavailable
[  1837.832475s] DEBUG ThreadId(01) server{orig_dst=10.0.89.66:11211}: linkerd_dns: addrs=[10.244.0.4:8086, 10.244.2.3:8086, 10.244.5.28:8086]
[  1837.832489s] DEBUG ThreadId(01) server{orig_dst=10.0.89.66:11211}: linkerd_proxy_dns_resolve: addrs=[10.244.0.4:8086, 10.244.2.3:8086, 10.244.5.28:8086]
[  1837.833354s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}:concrete{addr=elasticsearch.ghae.svc.cluster.local:9200}: linkerd_timeout::failfast: HTTP Balancer service has become unavailable
[  1837.833393s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}:concrete{addr=elasticsearch.ghae.svc.cluster.local:9200}: linkerd_proxy_api_resolve::resolve: Add endpoints=1
[  1837.833456s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}:concrete{addr=elasticsearch.ghae.svc.cluster.local:9200}:endpoint{server.addr=10.244.3.18:9200}: linkerd_proxy_http::client: Building HTTP client settings=OrigProtoUpgrade
[  1837.833469s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}:concrete{addr=elasticsearch.ghae.svc.cluster.local:9200}:endpoint{server.addr=10.244.3.18:9200}: linkerd_app_outbound::tcp::opaque_transport: Using opaque transport target_port=9200 opaque_port=4143
[  1837.833481s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}:concrete{addr=elasticsearch.ghae.svc.cluster.local:9200}:endpoint{server.addr=10.244.3.18:9200}: linkerd_tls::client: Initiating TLS connection server.id=default.ghae.serviceaccount.identity.linkerd.cluster.local
[  1837.833489s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}:concrete{addr=elasticsearch.ghae.svc.cluster.local:9200}:endpoint{server.addr=10.244.3.18:9200}: linkerd_proxy_transport::connect: Connecting server.addr=10.244.3.18:4143
[  1837.833626s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}:concrete{addr=elasticsearch.ghae.svc.cluster.local:9200}:endpoint{server.addr=10.244.3.18:9200}:h2: linkerd_proxy_transport::connect: Connected local.addr=10.244.3.16:51070 keepalive=Some(10s)
[  1837.834643s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}:concrete{addr=elasticsearch.ghae.svc.cluster.local:9200}:endpoint{server.addr=10.244.3.18:9200}:h2: linkerd_tls::client: alpn=Ok("transport.l5d.io/v1")
[  1837.834681s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}:concrete{addr=elasticsearch.ghae.svc.cluster.local:9200}:endpoint{server.addr=10.244.3.18:9200}:h2: linkerd_app_outbound::tcp::opaque_transport: Wrote transport header sz=34
[  1837.834693s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}:concrete{addr=elasticsearch.ghae.svc.cluster.local:9200}:endpoint{server.addr=10.244.3.18:9200}:h2: linkerd_proxy_transport::metrics: client connection open
[  1837.834843s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}:concrete{addr=elasticsearch.ghae.svc.cluster.local:9200}:endpoint{server.addr=10.244.3.18:9200}:orig-proto-upgrade: linkerd_proxy_http::client: method=GET uri=https://elasticsearch:9200/ version=HTTP/1.1
[  1837.834859s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}:concrete{addr=elasticsearch.ghae.svc.cluster.local:9200}:endpoint{server.addr=10.244.3.18:9200}:orig-proto-upgrade: linkerd_proxy_http::client: headers={"host": "elasticsearch:9200", "user-agent": "curl/7.58.0", "accept": "*/*", "l5d-dst-canonical": "elasticsearch.ghae.svc.cluster.local:9200"}
[  1837.834869s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}:concrete{addr=elasticsearch.ghae.svc.cluster.local:9200}:endpoint{server.addr=10.244.3.18:9200}:orig-proto-upgrade: linkerd_proxy_http::orig_proto: Upgrading request version=HTTP/1.1 absolute_form=false
[  1837.835154s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}:logical{dst=elasticsearch.ghae.svc.cluster.local:9200}:concrete{addr=elasticsearch.ghae.svc.cluster.local:9200}:endpoint{server.addr=10.244.3.18:9200}:h2: linkerd_proxy_http::h2: failed error=connection error: broken pipe
[  1837.835193s]  WARN ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}: linkerd_app_core::errors: Failed to proxy request: connection error: broken pipe client.addr=10.244.3.16:59012
[  1837.835201s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}: linkerd_app_core::errors: Closing server-side connection
[  1837.835221s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}: linkerd_app_core::errors: Handling error with HTTP response status=502 Bad Gateway version=HTTP/1.1
[  1837.835262s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}:server{orig_dst=10.0.102.110:9200}:http{v=1.x}: linkerd_proxy_http::server: The stack is tearing down the connection
[  1837.835290s] DEBUG ThreadId(01) outbound:accept{client.addr=10.244.3.16:59012}: linkerd_app_core::serve: Connection closed
[  1837.835305s] DEBUG ThreadId(01) evict{target=Accept { orig_dst: OrigDstAddr(10.0.102.110:9200), protocol: () }}: linkerd_cache: Awaiting idleness
[  1837.835312s] DEBUG ThreadId(01) evict{target=Logical { orig_dst: OrigDstAddr(10.0.102.110:9200), protocol: Http1, profile: Some(..), logical_addr: Some(LogicalAddr(NameAddr { name: "elasticsearch.ghae.svc.cluster.local", port: 9200 })) }}: linkerd_cache: Awaiting idleness

Anything else we need to know?:

You can reach the pod directly at $POD_IP:9200 with no problem at all. I've already talked to @mateiidavid and he asked me to write the issue (for more context on our previous conversation: https://linkerd.slack.com/archives/C89RTCWJF/p1621515416063200).
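
For comparison, a rough sketch of the two requests from another meshed pod ($POD_IP is the ElasticSearch pod IP, 10.244.3.18 in the logs above):

curl "$POD_IP:9200"          # directly to the pod: works fine
curl "elasticsearch:9200"    # through the Service with the opaque port: 502 Bad Gateway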

Please let me know if there's anything I can do to help.

Environment:

  • linkerd/namerd version, config files: Linkerd 2.10.1
  • Platform, version, and config files (Kubernetes, DC/OS, etc.): Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.7", GitCommit:"8ab00ff68a1763b5a506a7073cb8e67b12dfbcd7", GitTreeState:"clean", BuildDate:"2021-03-10T23:40:01Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: Azure Kubernetes Service

Thank you very much!

@migue
Author

migue commented May 21, 2021

This should go into the linkerd2 repo, my bad

@migue migue closed this as completed May 21, 2021