
Rely solely on the trace provider to get the tracer #115

Closed
wants to merge 4 commits

Conversation

jhchabran

Hi! We've recently been using otelsql in our codebase to modernise some old SQL tracing code, and it's been working well :)

One thing that got in the way is that otelsql caches the tracer when it's initialised, which creates a chicken-and-egg problem for us: our tracing code requires reading some setting from the database, but setting up the database driver requires tracing, leading otelsql to use the wrong tracer if we didn't change anything.

This PR instead relies fully on the otel code to handle caching the tracer, and simply calls the tracer provider whenever it needs one.
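
Roughly, the change amounts to the following (a minimal sketch of the idea only, not the actual otelsql code; the `config` fields and helper names are illustrative):

package otelsqlsketch

import (
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/trace"
)

const instrumentationName = "github.com/XSAM/otelsql"

type config struct {
	TracerProvider trace.TracerProvider
	Tracer         trace.Tracer // before: resolved once at setup time and kept forever
}

// Before: the tracer is fixed when the config is built, so a provider that
// only becomes fully configured later is never picked up.
func newConfig() config {
	cfg := config{TracerProvider: otel.GetTracerProvider()}
	cfg.Tracer = cfg.TracerProvider.Tracer(instrumentationName)
	return cfg
}

// After: ask the provider for a tracer whenever one is needed, and let the
// otel SDK handle caching and configuration propagation.
func (c config) tracer() trace.Tracer {
	return c.TracerProvider.Tracer(instrumentationName)
}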

I think this class of problem may be common in the context of legacy apps whose init layer is a bit more complicated than usual.

We have ran this code in production for a few weeks now, without observing any issue. Please tell me if you think there is a better way of handling this or if I can modify the code more to your liking.

@codecov

codecov bot commented Sep 5, 2022

Codecov Report

Base: 80.2% // Head: 79.6% // Decreases project coverage by -0.6% ⚠️

Coverage data is based on head (38f4a6e) compared to base (dd2cb66).
Patch coverage: 54.5% of modified lines in pull request are covered.

Additional details and impacted files
@@           Coverage Diff           @@
##            main    #115     +/-   ##
=======================================
- Coverage   80.2%   79.6%   -0.6%     
=======================================
  Files         13      13             
  Lines        652     658      +6     
=======================================
+ Hits         523     524      +1     
- Misses       105     110      +5     
  Partials      24      24             
Impacted Files Coverage Δ
config.go 75.7% <50.0%> (-13.2%) ⬇️
utils.go 100.0% <100.0%> (ø)


@XSAM
Owner

XSAM commented Sep 10, 2022

Hi @jhchabran, thanks for submitting this PR!

our tracing code requires reading some setting from the database, but setting up the database driver requires tracing, leading otelsql to use the wrong tracer if we didn't change anything.

I would like to leave end users the choice between a cached tracer (better performance) and a dynamic tracer provider (more flexibility). I wonder whether we can solve this issue by retrieving the TracerProvider from context.

Like this code:
https://github.com/open-telemetry/opentelemetry-go-contrib/blob/9f3760a3b2683e30b44990ebf5e5fefcb296ecca/instrumentation/net/http/otelhttp/transport.go#L96-L102
https://github.com/open-telemetry/opentelemetry-go-contrib/blob/9f3760a3b2683e30b44990ebf5e5fefcb296ecca/instrumentation/net/http/otelhttp/config.go#L73-L76

I think retrieving the TracerProvider from context can solve this dependency issue once and for all.
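
For reference, the pattern in those otelhttp snippets is roughly the following (a paraphrased sketch from memory, not the exact upstream code; `instrumentationName` is illustrative here):

package otelsqlsketch

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/trace"
)

const instrumentationName = "github.com/XSAM/otelsql"

// tracerFromContext prefers the TracerProvider carried by an active span in
// the incoming context and falls back to the globally registered provider.
func tracerFromContext(ctx context.Context) trace.Tracer {
	if span := trace.SpanFromContext(ctx); span.SpanContext().IsValid() {
		return span.TracerProvider().Tracer(instrumentationName)
	}
	return otel.GetTracerProvider().Tracer(instrumentationName)
}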


Or maybe we could add an option like WithLiveTracer to disable the tracer cache, but I am not sure about this; no existing otel instrumentation has such an option.

@XSAM
Owner

XSAM commented Sep 17, 2022

I wonder, is it possible to create a new DB instance after obtaining the tracing setting from the database? If so, then we might not need this PR.

@jhchabran
Author

I would like to leave end users the choice between a cached tracer (better performance) and a dynamic tracer provider (more flexibility). I wonder whether we can solve this issue by retrieving the TracerProvider from context.

That makes sense, I'll take a look!

I wonder, is it possible to create a new DB instance after obtaining the tracing setting from the database? If so, then we might not need this PR.

In our case that would be doable, but it would make that code convoluted. Since my understanding so far is that dynamically getting the tracer from the provider is idiomatic, it sounds like the best approach, especially if it's possible to toggle it as you mentioned. Please tell me if I'm wrong to assume so!

@bobheadxi

hey @XSAM, also note that internally Tracer() already handles caching of repeated requests:

https://github.com/open-telemetry/opentelemetry-go/blob/5e8eb855bf3c6507715edd12ded5c6a950dd6104/sdk/trace/provider.go#L141-L160

The cache is guarded by a lock, but in the context of a database query the impact should be negligible.

@bobheadxi

bobheadxi commented Feb 2, 2023

That said, @jhchabran, upon a closer look at Tracer():

If the same name and options are passed multiple times, the same Tracer will be returned (it is up to the implementation if this will be the same underlying instance of that Tracer or not). It is not necessary to call this multiple times with the same name and options to get an up-to-date Tracer. All implementations will ensure any TracerProvider configuration changes are propagated to all provided Tracers.

This might be an implementation error on our end as well; maybe we need to configure a zero-value tracer that can be swapped out after initialization.

@XSAM
Owner

XSAM commented Feb 4, 2023

All implementations will ensure any TracerProvider configuration changes are propagated to all provided Tracers.

I am not sure what kind of dynamic config you need to fetch from the database. But TracerProvider allows changing the SpanProcessor on the fly, though not other things like the Sampler. You can register a new SpanProcessor with the tracer provider after getting a tracer, and this change affects existing tracers.

// New tracer provider
tp = sdktrace.NewTracerProvider(
	sdktrace.WithSampler(sdktrace.AlwaysSample()),
	sdktrace.WithResource(resource.NewWithAttributes(
		semconv.SchemaURL,
		serviceName,
	)),
)

// New tracer
tracer := tp.Tracer(instrumentationName)

// Fetch config from the database to build a `SpanExporter`
var exporter sdktrace.SpanExporter

// Wrap the exporter in a new span processor and register it
tp.RegisterSpanProcessor(sdktrace.NewSimpleSpanProcessor(exporter))

// And this would affect existing tracers
ctx, span := tracer.Start(context.Background(), "example")

If you only want to change the SpanProcessor on the fly, then there is no need for this PR.

bobheadxi added a commit to sourcegraph/sourcegraph-public-snapshot that referenced this pull request Feb 21, 2023
…#47428)

The new architecture of `internal/tracer`:

1. Creates a `TracerProvider` immediately. This provider is
preconfigured with service metadata, but does not have a
`SpanProcessor`.
2. Creates an OpenTracing bridge using `TracerProvider`, such that we
have an OpenTracing Tracer and an OpenTelemetry TracerProvider that
bridges the two APIs. We register both as global defaults.
3. On site config initialization, we register a new `SpanProcessor` onto
the global `TracerProvider`.
4. From this point on, site config updates (see the sketch after this list):
   a. Create a new `SpanProcessor` based on site config
   b. Register the new `SpanProcessor`, which starts processing
   c. Remove the previous `SpanProcessor`, which flushes it and shuts it
      down as well
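
A minimal sketch of what step 4 could look like (illustrative only; `siteConfig`, `newSpanProcessorFromConfig`, and the other names are placeholders, not the actual Sourcegraph code):

package tracersketch

import (
	"sync"

	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// siteConfig stands in for whatever configuration drives tracing; it is a
// placeholder for this sketch.
type siteConfig struct{}

// newSpanProcessorFromConfig is assumed to build a processor from the site
// configuration, e.g. a batch processor wrapping a freshly built exporter.
func newSpanProcessorFromConfig(_ siteConfig) sdktrace.SpanProcessor {
	return sdktrace.NewSimpleSpanProcessor(nil) // placeholder
}

var (
	mu        sync.Mutex
	currentSP sdktrace.SpanProcessor
)

// onSiteConfigUpdate swaps the registered SpanProcessor instead of swapping
// the whole TracerProvider.
func onSiteConfigUpdate(tp *sdktrace.TracerProvider, conf siteConfig) {
	mu.Lock()
	defer mu.Unlock()

	// a. + b. Build a processor from the new configuration and register it;
	// new spans start flowing through it immediately.
	sp := newSpanProcessorFromConfig(conf)
	tp.RegisterSpanProcessor(sp)

	// c. Unregister the previous processor; the SDK flushes and shuts it down.
	if currentSP != nil {
		tp.UnregisterSpanProcessor(currentSP)
	}
	currentSP = sp
}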

Previously, we mimicked the old opentracing setup where we swapped the
global tracer, using a similar approach in OpenTelemetry to set up the
global TracerProvider. In OpenTelemetry, registering and unregistering
`SpanProcessors` seems to be the correct way to do this; the swapping
approach causes configuration propagation issues such as
XSAM/otelsql#115 (allowing us to revert back to
upstream: https://github.com/sourcegraph/sourcegraph/pull/47429) and
possibly https://github.com/sourcegraph/sourcegraph/issues/42515. This
correct approach was pointed out in
XSAM/otelsql#115 (comment), and
makes the setup easier to reason about:

- the previously swappable tracers no longer need a locked underlying
implementation; they just add logging now
- we don't need to update two implementations; we make the update at a
lower level so that both ends of the OT/OTel bridge are updated
- debug mode is managed through a mutable atomic bool, instead of
through recreating entire tracers

## Test plan

Unit tests assert that span processor updates are propagated. Exporter
setup is functionally unchanged.

---------

Co-authored-by: Jean-Hadrien Chabran <[email protected]>
bobheadxi added a commit to sourcegraph/sourcegraph-public-snapshot that referenced this pull request Feb 21, 2023
This is enabled by https://github.com/sourcegraph/sourcegraph/pull/47428
- swapping `SpanProcessor` instead of `TracerProvider` means that we can
correctly propagate configuration changes to `Tracer` instances, so that
we no longer need XSAM/otelsql#115 and can
revert back to upstream. Switching to upstream, which has moved
considerably since we forked it, necessitates a few other
OpenTelemetry-related dependency upgrades as well (split out from
https://github.com/sourcegraph/sourcegraph/pull/47126)

Whether updates work is tested in
https://github.com/sourcegraph/sourcegraph/pull/47428

## Test plan

Search with `?trace=1` still has SQL traces with args:

<img width="1649" alt="Screenshot 2023-02-07 at 4 27 57 PM"
src="https://user-images.githubusercontent.com/23356519/217288396-3069e947-1841-4108-8ab6-02a384a98a54.png">

<img width="1649" alt="Screenshot 2023-02-07 at 4 27 57 PM"
src="https://user-images.githubusercontent.com/23356519/217288612-f5091180-a2f3-4144-ab6f-3456cc187612.png">
@XSAM
Owner

XSAM commented Mar 13, 2023

@jhchabran, I see the related PRs merged. I am closing this PR.

Please feel free to reopen this if you need to.

XSAM closed this Mar 13, 2023