enabling zipkin b3 propagation leads to possible incorrect spans #2333
Comments
Adding attachments: config files and captures for the 2 scenarios described above.
@adriancole Timing is interesting indeed. Thanks for mentioning spring-cloud/spring-cloud-sleuth#1452 in my other PR; I will take a look. In the meantime I have been investigating this issue. Is there any resource I could use to make sure spans are generated correctly, apart from reviewing the code and following traceIds?
Not a bug: the behavior reported in the issue description (my first comment) is expected. I have verified Linkerd tracing in two ways:
It seems Linkerd generates tracing spans in the same pattern in both cases, with and without Zipkin trace propagation. The catch is that you cannot see all generated spans using console logging alone; you have to check the Zipkin UI, which shows all of them.

Whenever a request reaches a router, Linkerd creates a server span, using the trace context from the request as the starting point. If the request carries a tracing context (l5d-trace-context or zipkin), the router's server span is created as a child of the request's original tracing context. If there is no tracing context, the router's server span has no parent span. Next, when the request is about to leave the router, Linkerd creates a client span, using the previous server span as its parent.

For a router, you can see both the server span and the client span in the Zipkin UI. With console logging you can print only the client span, when it is set in the request headers.

Here's a diagram of the logic generating server/client spans and where you have access to see them:

x--> packet enters router1
x-----> packet leaves router1
x-----> packet enters router2
x------------> packet enters router3
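The server/client span logic described above can be sketched as a small model. This is not Linkerd's code: `SpanContext`, `new_id`, `child_of` and `route` are hypothetical names, and the choice of traceid = spanid for a request without tracing context is an assumption borrowed from the first-span convention seen in the captures. It only illustrates why the parentspanid seen on the wire differs from the incoming spanid.

```python
import secrets
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SpanContext:
    trace_id: str
    span_id: str
    parent_id: Optional[str] = None


def new_id() -> str:
    # 8 random bytes -> 16 hex characters, the same shape as the ids in the captures.
    return secrets.token_hex(8)


def child_of(parent: SpanContext) -> SpanContext:
    # A child keeps the trace id, gets a fresh span id, and points at its parent.
    return SpanContext(parent.trace_id, new_id(), parent.span_id)


def route(incoming: Optional[SpanContext]) -> Tuple[SpanContext, SpanContext]:
    """Model one router hop and return (server_span, client_span)."""
    if incoming is None:
        # No tracing context in the request: the server span has no parent.
        # Assumption: the root span reuses its own id as the trace id.
        root = new_id()
        server = SpanContext(root, root)
    else:
        # Tracing context present: the server span is a child of it.
        server = child_of(incoming)
    # The client span, written to the outgoing headers, is a child of the server span.
    client = child_of(server)
    return server, client


# Incoming context taken from packet 44 of scenario 1.
request = SpanContext("74d95541b82adc24", "74d95541b82adc24")
server, client = route(request)

print("server span (Zipkin UI only):", server)
print("client span (outgoing headers):", client)
```

In this model the outgoing parentspanid equals the server span's id, not the caller's spanid, so a wire capture shows a gap of one span per router while the Zipkin UI shows the full chain.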
This could be a bug, not sure, was found while implementing and testing #2114.
What happened:
The problem is that linkerd generates a new spanid and a new parentspanid for each router configured with the property "tracePropagator: kind: io.l5d.zipkin", but the parentspanid differs from the previous spanid.
What you expected to happen:
I'd expect the parentspanid of a new span to be the same as the spanid of the previous span. We need to make sure the tracing spans are generated correctly with and without propagation enabled.
How to reproduce it (as minimally and precisely as possible):
Here is my setup:
Scenario 1 (router on port 9000 has tracePropagator io.l5d.zipkin enabled, the other routers do not have tracePropagator) (linkerd config: 1-linkerd-one-router-with-propagation.yaml, capture: 1-capture-one-router-with-propagation.pcapng)
Packet 44
- Client sends an HTTP request to l5d port 9000, containing (X-B3-SpanId: 74d95541b82adc24; X-B3-TraceId: 74d95541b82adc24)
- x-b3- headers present, first span, spanid=traceid
Packet 49
- l5d forwards the request to l5d 9001 port, (x-b3-spanid: 9f387655f61ca970; x-b3-parentspanid: 135fe09bbbd2206e; x-b3-traceid: 74d95541b82adc24)
- new spanid different from packet 44
- parentspanid != spanid of packet 44????
- keeps the traceid
Packet 54
- l5d forwards the request to l5d 9002 port, (x-b3-spanid: 9f387655f61ca970; x-b3-parentspanid: 135fe09bbbd2206e; x-b3-traceid: 74d95541b82adc24)
- keeps all b3 trace headers unchanged (same spanid, same parentid, same traceid)
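The comparisons made by hand for scenario 1 can be reproduced with a short script. This is a minimal sketch: the header values are copied from packets 44, 49 and 54 above, while `normalize` and `compare_hops` are hypothetical helper names and not part of any linkerd or Zipkin tooling.

```python
def normalize(headers):
    # HTTP header names are case-insensitive, so compare them in lowercase.
    return {name.lower(): value for name, value in headers.items()}


def compare_hops(prev, curr):
    """Compare the B3 headers of two consecutive captured requests."""
    prev, curr = normalize(prev), normalize(curr)
    return {
        "same_trace": prev["x-b3-traceid"] == curr["x-b3-traceid"],
        "parent_is_prev_span": curr.get("x-b3-parentspanid") == prev["x-b3-spanid"],
        "unchanged": prev == curr,
    }


# Header values as seen in the scenario 1 capture.
packet_44 = {
    "X-B3-SpanId": "74d95541b82adc24",
    "X-B3-TraceId": "74d95541b82adc24",
}
packet_49 = {
    "x-b3-spanid": "9f387655f61ca970",
    "x-b3-parentspanid": "135fe09bbbd2206e",
    "x-b3-traceid": "74d95541b82adc24",
}
packet_54 = dict(packet_49)  # the capture shows identical b3 headers

print("44 -> 49:", compare_hops(packet_44, packet_49))
print("49 -> 54:", compare_hops(packet_49, packet_54))
```

For 44 -> 49 the traceid is kept but the parentspanid does not match the previous spanid; for 49 -> 54 the headers pass through unchanged, which matches a router without a tracePropagator.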
Scenario 2: (all routers have tracePropagator io.l5d.zipkin enabled) (linkerd config: 2-linkerd-all-three-routers-with-propagation.yaml capture: 2-capture-all-three-routers-with-propagation.pcapng)
Packet 41
- request from client to port 9000 with traceid=spanid
Packet 47
- request fwd to port 9001
- new spanid, new parentspanid, same traceid
Packet 52
- request fwd to port 9002
- new spanid, new parentspanid, same traceid
Packet 57
- request fwd to port 9990
- new spanid, new parentspanid, same traceid
Anything else we need to know?:
Attached 2 scenarios of testing with linkerd config and captures.
Environment:
1.7.0
Ubuntu/localhost