
XRay exporter not sending traces #1646

Closed
CeeBeeCee opened this issue Nov 19, 2020 · 18 comments

@CeeBeeCee

Describe the bug
I am running OTel Collector v0.14 on a us-east-1 EC2 instance. The collector is configured to receive traces in OTLP format and export to X-Ray in us-east-1. The logging exporter tells me that it is receiving traces, and zPages shows an incrementing count of traces exported to X-Ray. But the traces do not show up in X-Ray (us-east-1), and the logs don't show any errors. Is this working for anyone, and is there a way to debug what's going on? This used to work until 0.8, if I remember correctly.

awsxray:
awsxray/custom:
  region: us-east-1
  resource_arn: "arn:aws:ec2:us-east1:xxxxx:instance/xxxxx"
  role_arn: "arn:aws:iam::xxxxx:role/xxxxx"

pipelines:
  traces:
    receivers: [jaeger, zipkin, otlp]
    processors: [batch, queued_retry]
    exporters: [awsxray, logging]
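
For reference, a minimal end-to-end config matching the setup described above might look like the sketch below. Only the exporter and pipeline fragments were posted, so the receiver, processor, and service sections here are assumed defaults:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
  jaeger:
    protocols:
      grpc:
  zipkin:

processors:
  batch:
  queued_retry:

exporters:
  logging:
    loglevel: debug
  awsxray:
    region: us-east-1

service:
  pipelines:
    traces:
      receivers: [jaeger, zipkin, otlp]
      processors: [batch, queued_retry]
      exporters: [awsxray, logging]
```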

ZPages

| Span Name | Running | Latency Samples | Error Samples |
| --- | --- | --- | --- |
| ExportMetrics | 0 | 0 1 | 0 |
| exporter/awsxray/TraceDataExported | 0 | 0 1 | 4 |
@CeeBeeCee added the bug label on Nov 19, 2020
@CeeBeeCee

Interestingly, the same config works on the same EC2 instance if I use the AWS distro docker.io/amazon/aws-otel-collector image instead of otel/opentelemetry-collector-contrib.

I can't use the distro because I need Zipkin and Jaeger, and the distro doesn't seem to support these. Please advise.

@CeeBeeCee

@anuraaga

@bogdandrutu

/cc @kbrockhoff

@anuraaga

anuraaga commented Dec 2, 2020

Sorry for missing this issue @CeeBeeCee - are you still seeing problems? I'm surprised by the difference with the AWS distro since we don't change that much there.

/cc @mxiamxia

@CeeBeeCee

No worries @anuraaga. I just pulled v0.16.0 and I see the same thing. The exporter seems to have exported the traces to X-Ray, but they don't show up.

docker pull otel/opentelemetry-collector-contrib:0.16.0

REPOSITORY TAG
docker.io/otel/opentelemetry-collector-contrib 0.16.0

Log sample:
Dec 02 15:27:27 2020-12-02T20:27:27.835Z debug awsxrayexporter/awsxray.go:52 TraceExporter {"component_kind": "exporter", "component_type": "awsxray", "component_name": "awsxray", "type": "awsxray", "name": "awsxray", "#spans": 1}

@CeeBeeCee

I have the collector running in debug mode (--log-level DEBUG) but I don't see any debug logs for the AWS exporter. Is there another way to turn on debug logging?

@zl4bv

zl4bv commented Dec 2, 2020

Hiya, I'm seeing similar symptoms; however, I'm running the aws-otel-collector distro v0.4.0. Multiple spans/segments are reported to have been successfully exported to X-Ray from the collector, but they never show up in X-Ray.

I'm running the default config from the AWS distro with the metrics processors and exporters removed.

With the collector log level set to DEBUG, I also see similar logs and no errors:

2020-12-02T07:10:06.158Z DEBUG [email protected]/awsxray.go:50 TraceExporter {"component_kind": "exporter", "component_type": "awsxray", "component_name": "awsxray", "type": "awsxray", "name": "awsxray", "#spans": 2}

For additional debugging purposes I also enabled the logging exporter and have used it to confirm spans are being received from the instrumented applications.

Collector config snippet with logging exporter enabled
exporters:
  awsxray:
  logging:
    loglevel: debug
    sampling_initial: 5
    sampling_thereafter: 200
Log sample of a span via the logging exporter
2020-12-02T07:10:06.158Z INFO loggingexporter/logging_exporter.go:339 TracesExporter {"#spans": 2}
2020-12-02T07:10:06.158Z DEBUG loggingexporter/logging_exporter.go:394
ResourceSpans #0 REMOVED
ResourceSpans #1
Resource labels:
-> host.name: STRING(cfac9e0bb7bb)
-> service.name: STRING(REDACTED)
-> telemetry.sdk.language: STRING(go)
-> telemetry.sdk.name: STRING(opentelemetry)
-> telemetry.sdk.version: STRING(0.14.0)
InstrumentationLibrarySpans #0
InstrumentationLibrary go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin semver:0.14.0
Span #0
Trace ID : ffd7678994c4b630231c3e2b17ac6ae2
Parent ID :
ID : ed59c3653144d3b0
Name : /
Kind : SPAN_KIND_SERVER
Start time : 2020-12-02 07:10:06.003542515 +0000 UTC
End time : 2020-12-02 07:10:06.003624156 +0000 UTC
Status code : STATUS_CODE_OK
Status message : HTTP status code: 200
Attributes:
-> net.transport: STRING(IP.TCP)
-> net.peer.ip: STRING(REDACTED)
-> net.peer.port: INT(56302)
-> net.host.name: STRING(REDACTED)
-> http.method: STRING(GET)
-> http.target: STRING(/)
-> http.server_name: STRING(REDACTED)
-> http.route: STRING(/)
-> http.client_ip: STRING(REDACTED)
-> http.user_agent: STRING(Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:83.0) Gecko/20100101 Firefox/83.0)
-> http.scheme: STRING(http)
-> http.host: STRING(REDACTED)
-> http.flavor: STRING(1.1)
-> http.status_code: INT(200)
Log from stdout exporter in the application
[GIN] 2020/12/02 - 07:10:06 | 200 | 7.459us | REDACTED | GET "/"
[
    {
    "SpanContext": {
      "TraceID": "ffd7678994c4b630231c3e2b17ac6ae2",
      "SpanID": "ed59c3653144d3b0",
      "TraceFlags": 1
    },
    "ParentSpanID": "0000000000000000",
    "SpanKind": 2,
    "Name": "/",
    "StartTime": "2020-12-02T07:10:06.003542515Z",
    "EndTime": "2020-12-02T07:10:06.003624156Z",
    "Attributes": [
      {
        "Key": "net.transport",
        "Value": {
          "Type": "STRING",
          "Value": "IP.TCP"
        }
      },
      {
        "Key": "net.peer.ip",
        "Value": {
          "Type": "STRING",
          "Value": "REDACTED"
        }
      },
      {
        "Key": "net.peer.port",
        "Value": {
          "Type": "INT64",
          "Value": 56302
        }
      },
      {
        "Key": "net.host.name",
        "Value": {
          "Type": "STRING",
          "Value": "REDACTED"
        }
      },
      {
        "Key": "http.method",
        "Value": {
          "Type": "STRING",
          "Value": "GET"
        }
      },
      {
        "Key": "http.target",
        "Value": {
          "Type": "STRING",
          "Value": "/"
        }
      },
      {
        "Key": "http.server_name",
        "Value": {
          "Type": "STRING",
          "Value": "REDACTED"
        }
      },
      {
        "Key": "http.route",
        "Value": {
          "Type": "STRING",
          "Value": "/"
        }
      },
      {
        "Key": "http.client_ip",
        "Value": {
          "Type": "STRING",
          "Value": "REDACTED"
        }
      },
      {
        "Key": "http.user_agent",
        "Value": {
          "Type": "STRING",
          "Value": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:83.0) Gecko/20100101 Firefox/83.0"
        }
      },
      {
        "Key": "http.scheme",
        "Value": {
          "Type": "STRING",
          "Value": "http"
        }
      },
      {
        "Key": "http.host",
        "Value": {
          "Type": "STRING",
          "Value": "REDACTED"
        }
      },
      {
      "Key": "http.flavor",
      "Value": {
        "Type": "STRING",
        "Value": "1.1"
        }
      },
      {
      "Key": "http.status_code",
      "Value": {
        "Type": "INT64",
        "Value": 200
        }
      }
    ],
    "MessageEvents": null,
    "Links": null,
    "StatusCode": "Unset",
    "StatusMessage": "HTTP status code: 200",
    "HasRemoteParent": false,
    "DroppedAttributeCount": 0,
    "DroppedMessageEventCount": 0,
    "DroppedLinkCount": 0,
    "ChildSpanCount": 0,
    "Resource": [
    {
      "Key": "host.name",
      "Value": {
        "Type": "STRING",
        "Value": "cfac9e0bb7bb"
      }
    },
    {
      "Key": "service.name",
      "Value": {
        "Type": "STRING",
        "Value": "REDACTED"
      }
    },
    {
      "Key": "telemetry.sdk.language",
      "Value": {
        "Type": "STRING",
        "Value": "go"
      }
    },
    {
      "Key": "telemetry.sdk.name",
      "Value": {
        "Type": "STRING",
        "Value": "opentelemetry"
      }
    },
    {
      "Key": "telemetry.sdk.version",
      "Value": {
        "Type": "STRING",
        "Value": "0.14.0"
      }
    }
    ],
    "InstrumentationLibrary": {
    "Name": "go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin",
    "Version": "semver:0.14.0"
    }
  }
]

I see a few AWS folks here so if it helps there's more details of the environment where I'm experiencing this in AWS support case 7681890941.

@anuraaga

anuraaga commented Dec 3, 2020

@zl4bv Thanks for the logs. It looks like you are using opentelemetry-go (based on the go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin semver in there), and the trace ID does not start with a Unix timestamp, which is required for X-Ray. We're currently working on submitting a custom ID generator for opentelemetry-go, which would then allow you to use it with X-Ray. /cc @wilguo At the moment, Java and JS are the only languages we support, with Python and Go coming very soon.
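
To make the trace ID requirement concrete: X-Ray expects the first 4 bytes of the 16-byte trace ID to encode the epoch seconds at which the trace started, with the remaining 12 bytes random; fully random IDs fall outside the allowed time range and are dropped. A rough Python illustration of the format (not the actual SDK extension):

```python
import random
import time

def xray_compatible_trace_id() -> str:
    """Illustrative only: 32 hex chars where the first 8 encode epoch seconds."""
    epoch_hex = format(int(time.time()), "08x")           # first 4 bytes: start time
    random_hex = format(random.getrandbits(96), "024x")   # remaining 12 bytes: random
    return epoch_hex + random_hex
```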

@CeeBeeCee Can you also confirm which instrumentation you're using? I wonder if when using the AWS Distro, you used the AWS Distro for instrumentation as well, where we automatically enable the correct ID generation, but by default instrumentation would not be doing that. If you're able to post anything from the logging exporter, that would be great!

@CeeBeeCee

@anuraaga: I have 3 apps (Python, Java, and JS) instrumented with OpenTelemetry (not the AWS Distro) which are emitting traces. Traces for all three apps seem to be processed by the collector but don't make it to AWS. They do show up in Jaeger, which I'm using as a second exporter.

@CeeBeeCee

I will send some sample traces from the logging exporter tomorrow. Meanwhile, if there is a way to turn on debug-level logs for the AWS exporter, please let me know.

@anuraaga

anuraaga commented Dec 3, 2020

Thanks @CeeBeeCee - you helped me find this spot that is clearly lacking an important debug log statement:

https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/master/exporter/awsxrayexporter/awsxray.go#L64

For the issue, it seems to be the same: you will need to either use the AWS Distro for instrumentation, which is preconfigured for use with X-Ray, or manually enable the X-Ray ID generator.

Java's is here - https://github.com/open-telemetry/opentelemetry-java/blob/master/sdk-extensions/aws/src/main/java/io/opentelemetry/sdk/extension/aws/trace/AwsXrayIdGenerator.java

(we don't have a doc for it since currently we figure users will use the aws distro)

JS's is here - https://aws-otel.github.io/docs/getting-started/js-sdk/trace-manual-instr

Python's is here - https://github.com/open-telemetry/opentelemetry-python-contrib/blob/master/sdk-extension/opentelemetry-sdk-extension-aws/src/opentelemetry/sdk/extension/aws/trace/aws_xray_ids_generator.py
(we are working on the docs for this as we prepare to launch official support for it)

You can use the normal opentelemetry-collector-contrib just fine with the AWS Distro for Java instrumentation, so for Java I recommend that. The others do require some configuration.

Let me know if this works for you. Thanks!

@zl4bv

zl4bv commented Dec 3, 2020

@anuraaga thanks for the clarification about trace IDs not starting with a UNIX timestamp.

I have 2 apps (Go and JS) - I'll wait for the ID generator changes and try the Go app again later. As for the JavaScript app it appears to be working now with your suggestion to manually enable the X-Ray ID generator in the JS SDK. Thanks heaps!

@CeeBeeCee

@anuraaga: I updated my Python code to use AwsXRayIdsGenerator and I can now see the traces in X-Ray using opentelemetry-collector-contrib. Thanks so much for your help!

trace.set_tracer_provider(TracerProvider(resource=resource,ids_generator=AwsXRayIdsGenerator()))
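
For anyone landing here, a fuller sketch of that Python setup with the surrounding imports. The exact import path and the ids_generator keyword depend on the opentelemetry-sdk-extension-aws and SDK versions in use (newer releases renamed these to AwsXRayIdGenerator and id_generator), and the service name is a placeholder:

```python
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.extension.aws.trace import AwsXRayIdsGenerator

# Placeholder resource; swap in your real service name.
resource = Resource.create({"service.name": "my-service"})

# The X-Ray ID generator prefixes each trace ID with the epoch seconds,
# which is what lets X-Ray accept the exported segments.
trace.set_tracer_provider(
    TracerProvider(resource=resource, ids_generator=AwsXRayIdsGenerator())
)
```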

Side question: I have a team asking if we can use OTel with AWS Step Functions instead of X-Ray. What's your take on this? The use case here is that we need to send traces to a third party.

@deki

deki commented May 5, 2021

Just hit the same issue with opentelemetry-php. https://github.com/open-telemetry/opentelemetry-php/blob/main/sdk/Trace/RandomIdGenerator.php needs to be replaced with an X-Ray generator. However, this is currently blocked by open-telemetry/opentelemetry-php#281.

@alolita added the comp:aws label on Sep 2, 2021
@alolita added the comp:aws-xray label on Sep 30, 2021
@tmarszal

tmarszal commented Feb 14, 2022

I just hit the same issue and I am going to use an X-Ray-compatible generator for my app. However, I feel README.md may need an improvement. I read there:

(...) X-Ray only allows submission of Trace IDs from the past 30 days, received Trace IDs are checked. If outside the allowed range, a replacement is generated by the exporter using the current time.

So my interpretation was that, given an incorrect trace ID, the exporter would generate a valid one for me. I don't necessarily think that it should be the exporter's responsibility, but at least the doc should be made clearer. Or am I wrong?

@willarmiros

willarmiros commented May 23, 2022

Hi @tmarszal, thank you for chiming in. I've opened #10207 to address the README changes, PTAL!

Otherwise, this issue can stay open as the primary tracking issue in this repo for the X-Ray backend not supporting fully-random trace IDs. We are also tracking it downstream in aws-observability/aws-otel-collector#492

@CeeBeeCee could you possibly update the title of this issue to more closely match the root cause/actual feature request, as is done in aws-observability/aws-otel-collector#492 so other customers looking at this can know it's for the same issue?

@github-actions

github-actions bot commented Nov 8, 2022

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions

This issue has been closed as inactive because it has been stale for 120 days with no activity.

github-actions bot closed this as not planned on Mar 12, 2023