AWS XRayExporter - Unable to transform Zipkin/Jaeger TraceIDs to AWS X Ray TraceIds #2396

Closed
vikrambe opened this issue Feb 22, 2021 · 12 comments
Labels
bug Something isn't working closed as inactive comp:aws AWS components comp:aws-xray AWS XRay related issues Stale

Comments

@vikrambe
Contributor

Describe the bug
Unable to ship Jaeger/Zipkin traces to AWS X-Ray. Trace IDs are not transformed to the X-Ray trace ID format, for the reasons below.

  1. Sample Jaeger/Zipkin trace ID: “00000000000000004548c0766ce4105a”. When the AWS X-Ray exporter receives an ID in this format, the following condition fails: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/awsxrayexporter/translator/segment.go#L307. The exporter expects the epoch time at the beginning of the trace ID so that it can check the age of the trace [AWS trace ID format: 1-57ff426a-80c11c39b0c928905eb0828d].

  2. Even if we skip the above condition, trace IDs are transformed to the “1-31363133-00000000c11ae214393aa65b” format. When this is posted to X-Ray, we get the following response:

{
  UnprocessedTraceSegments: [{
      ErrorCode: "InvalidTraceId",
      Id: "e44a79795d71c371",
      Message: "Invalid segment. ErrorCode: InvalidTraceId"
    },{
      ErrorCode: "InvalidTraceId",
      Id: "57b471a5e0bcd202",
      Message: "Invalid segment. ErrorCode: InvalidTraceId"
    }]
}
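
To make the two failure modes above concrete, here is a minimal Python sketch of how a 32-hex-digit trace ID maps onto the X-Ray layout (version, 8 hex digits of epoch seconds, 24 hex digits of unique ID) and how an age check on the embedded timestamp might look. The function names and the two window constants are assumptions for illustration only; this is not the exporter's actual code, and its real limits may differ.

```python
import time

# Assumed limits, chosen for illustration; the exporter's real constants may differ.
MAX_AGE_SECONDS = 28 * 24 * 60 * 60   # how old the embedded timestamp may be
MAX_SKEW_SECONDS = 5 * 60             # how far in the future it may be


def split_trace_id(trace_id_hex):
    """Split a 32-hex-digit trace ID into (epoch_seconds, unique_part)."""
    if len(trace_id_hex) != 32:
        raise ValueError("expected a 128-bit trace ID as 32 hex digits")
    return int(trace_id_hex[:8], 16), trace_id_hex[8:]


def epoch_in_window(epoch, now=None):
    """Return True if the embedded epoch is recent enough to be accepted."""
    now = int(time.time()) if now is None else now
    return now - MAX_AGE_SECONDS <= epoch <= now + MAX_SKEW_SECONDS


def to_xray_format(trace_id_hex):
    """Render the ID as version-epoch-unique, the layout X-Ray expects."""
    epoch, unique = split_trace_id(trace_id_hex)
    return f"1-{epoch:08x}-{unique}"


if __name__ == "__main__":
    sample = "00000000000000004548c0766ce4105a"
    epoch, _ = split_trace_id(sample)
    print(to_xray_format(sample), epoch_in_window(epoch))
```

For the sample ID, the first 8 hex digits are all zero, so the embedded epoch is 0 and any recency check of this kind rejects it. Similarly, the epoch field in “1-31363133-00000000c11ae214393aa65b” decodes to 825635123 seconds, a date in 1996, which is far outside any recent window.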

Steps to reproduce
Enable the Zipkin/Jaeger receivers and the X-Ray exporter in an OpenTelemetry Collector pipeline. Send Zipkin/Jaeger spans to the collector and look at the logs for "Invalid TraceId".

What did you expect to see?
Expect to see spans on AWS Xray

What did you see instead?
We see "Invalid TraceID" logs.

What version did you use?
v0.20.0

What config did you use?
service:
  extensions: "health_check"
  pipelines:
    traces/abc:
      processors: [memory_limiter, batch/2]
      exporters: [awsxray/customname]
      receivers: [jaeger, opencensus/cors, zipkin, otlp/2]
extensions:
  health_check:
    # Specifies the port in which the HTTP endpoint is going to be opened. The
    # default value is 13133. logging/2,
    port: 13133
processors:
  memory_limiter:
    ballast_size_mib: 2000
    check_interval: 5s
    limit_mib: 4000
    spike_limit_mib: 500
  batch/2:
    timeout: 200ms
    send_batch_size: 50
    send_batch_max_size: 50
  batch:
receivers:
  otlp/2:
    protocols:
      http:
        endpoint: 0.0.0.0:9098
        cors_allowed_origins:
          - https://.abc.com
        # tls_settings:
        #   cert_file: /usr/secrets/tls/server.crt
        #   key_file: /usr/secrets/tls/server.key # path to private key
  zipkin:
    endpoint: 0.0.0.0:9412
  jaeger:
    protocols:
      thrift_compact:
        endpoint: "0.0.0.0:6842"
      thrift_http:
        endpoint: "0.0.0.0:14269"
      grpc:
        # tls_settings:
        #   key_file: /usr/secrets/tls/server.key # path to private key
        #   cert_file: /usr/secrets/tls/server.crt
        endpoint: "0.0.0.0:14251"
  opencensus/cors:
    endpoint: ":55678"
    cors_allowed_origins:
      - https://.net
      - https://.com
      - https://.net
      - https://*.com
exporters:
  logging:
  logging/2:
    loglevel: debug
    sampling_initial: 100
    sampling_thereafter: 100
  awsxray/customname:
    region: us-west-2
    resource_arn: "arn:aws:ec2:us-west2:<account_number>:instance/i-"
    role_arn: "arn:aws:iam::<account_number>:role/xray-poc"

Environment
OS: Alpine
Compiler(if manually compiled): go 14.2


@vikrambe vikrambe added the bug Something isn't working label Feb 22, 2021
@anuraaga
Contributor

Hi @vikrambe - what are you using to instrument your app? opentracing or zipkin? And which language?

For Zipkin instrumentation, you can generally configure 128-bit trace IDs, and in most languages (but not all) they are compatible with X-Ray. I don't think this is true for opentracing, though.

We provide a java agent with an ID generator packed in so you can use OpenTelemetry with x-ray, is this something you can use?

https://github.com/aws-observability/aws-otel-java-instrumentation

@vikrambe
Contributor Author

@anuraaga We use the opentracing implementations of Zipkin and Jaeger. We also have some services in Python and NodeJS, and not all the libraries we use support 128-bit trace IDs. Can AWS X-Ray support shorter trace IDs?
We have many services that already use Jaeger/Zipkin libraries, so it will not be possible for us to use aws-otel-java-instrumentation at the moment.

@anuraaga
Contributor

Unfortunately, X-Ray can't support 64-bit trace IDs, and there is a more specific requirement on the trace ID format. That format is handled out of the box only by Zipkin instrumentation (brave, zipkin-go, py-zipkin), not by opentracing. With OpenTelemetry, it is available as an extension in some languages and will be supported in more. For example, the Java one is

https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk-extensions/aws/src/main/java/io/opentelemetry/sdk/extension/aws/trace/AwsXrayIdGenerator.java

X-Ray currently only supports the X-Ray SDK and OpenTelemetry for instrumentation, and we don't have any plans currently for supporting opentracing directly. You may be able to customize the ID generation in your apps though in an approach similar to how we do so for OpenTelemetry apps.
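
As a rough illustration of what customized ID generation could look like in an application, the Python sketch below builds a 128-bit trace ID whose first 32 bits are the current epoch seconds and whose remaining 96 bits are random, which is the layout described earlier in this thread. The function name is hypothetical, and wiring it into a particular Jaeger, Zipkin, or opentracing client is left out because the hook differs per library.

```python
import secrets
import time


def generate_xray_compatible_trace_id(now=None):
    """Return a 32-hex-digit trace ID: 8 digits of epoch seconds + 24 random digits.

    `now` is an optional epoch-seconds override, included only to make the
    sketch testable; a real generator would use the current time.
    """
    epoch = int(time.time() if now is None else now)
    random_part = secrets.randbits(96)  # 96 random bits for the unique portion
    return f"{epoch:08x}{random_part:024x}"


if __name__ == "__main__":
    trace_id = generate_xray_compatible_trace_id()
    # Rendered in the X-Ray style for comparison with the format shown above.
    print(f"1-{trace_id[:8]}-{trace_id[8:]}")
```

Because the result is still an ordinary 128-bit hex ID, it can in principle travel through tracing libraries that support 128-bit trace IDs; services limited to 64-bit IDs would not be helped by this approach.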

@jkowall
Contributor

jkowall commented Feb 22, 2021

Sorry, I had to chime in here: do we have observability vendor code in the SDKs now? That seems like an antipattern compared with using the collector to transform OTLP to the vendor format.

@anuraaga
Contributor

anuraaga commented Feb 22, 2021

@jkowall Unfortunately IDs need to be propagated so they are at a layer the collector can't get to. Because of that the otel spec requires ID customization.
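
To illustrate why the ID has to be right before the collector ever sees a span, the Python sketch below formats the value of an X-Ray style tracing header (`X-Amzn-Trace-Id`) from a trace ID and a parent span ID. The helper name is hypothetical; the point is only that the trace ID travels in-band between services on each request, so it is fixed at the moment the first service creates it.

```python
def xray_trace_header(trace_id_hex, parent_span_id_hex, sampled=True):
    """Build an X-Amzn-Trace-Id header value from hex-encoded IDs.

    trace_id_hex is 32 hex digits (epoch seconds + unique part);
    parent_span_id_hex is 16 hex digits.
    """
    root = f"1-{trace_id_hex[:8]}-{trace_id_hex[8:]}"
    return f"Root={root};Parent={parent_span_id_hex};Sampled={1 if sampled else 0}"


if __name__ == "__main__":
    print(xray_trace_header("5759e988bd862e3fe1be46a994272793", "53995c3f42cd8ad8"))
```

A collector-side rewrite would change the ID only in exported data, while downstream services would already have received and recorded the original value.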

@jkowall
Contributor

jkowall commented Feb 22, 2021

@anuraaga Ah yes, makes sense. That's the downside of manual instrumentation :(

@anuraaga
Contributor

@jkowall It's also sort of a downside of X-Ray for sure though. I hope the ID restriction can be lifted at some point but I don't expect it for a few years.

@bogdandrutu
Member

@jkowall We don't have any vendor code in the SDK. We offer a plugin to customize the ID generator, and by default we generate W3C-compatible IDs.

@alolita alolita added the comp:aws AWS components label Sep 2, 2021
@alolita alolita added the comp:aws-xray AWS XRay related issues label Sep 30, 2021
@willarmiros
Contributor

Hi @bogdandrutu @vikrambe can we close this as a duplicate of both aws-observability/aws-otel-collector#492 and #1646?

@github-actions
Contributor

github-actions bot commented Nov 8, 2022

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Mar 15, 2023