Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configure second-stage exporters #1609

Closed
jpkrohling opened this issue Aug 21, 2020 · 11 comments
Closed

Configure second-stage exporters #1609

jpkrohling opened this issue Aug 21, 2020 · 11 comments
Labels
enhancement New feature or request

Comments

@jpkrohling
Copy link
Member

jpkrohling commented Aug 21, 2020

As part of #1260, we need a way to add second-stage exporters to the GetExporters(). I can see two possible solutions for it:

  1. allow pipelines to have no receivers, so that they can "declare" the exporters that are used as second-stage exporters
  2. change the contract for GetExporters() so that it returns all exporters from the configuration, even if they aren't declared as part of any pipelines.

I think the first suggestion is cleaner, as it requires users to declare any second-stage exporters explicitly, differentiating them from unused exporters.

If we decide to go with the second suggestion, one additional aspect to solve is that exporters are tied to their types, which is determined based on the exporter's declaration in the pipeline:

exporters:
  exampleexporter:
  exampleexporter/2:
pipelines:
  traces:
  - exampleexporter

  logs:
  - exampleexporter

In the case above, Exporters#ToMapByDataType will not return exampleexporter/2 even when it's part of the list of exporters.

jpkrohling added a commit to jpkrohling/opentelemetry-collector that referenced this issue Aug 21, 2020
@jpkrohling
Copy link
Member Author

cc @tigrannajaryan, @bogdandrutu, @pjanotti , would you have a few minutes to share your opinions?

@nilebox
Copy link
Member

nilebox commented Aug 25, 2020

allow pipelines to have no receivers, so that they can "declare" the exporters that are used as second-stage exporters

What about other aspects of pipelines? They support processors and multiple exporters, for example. If you just need exporters then declaring a pipeline seems redundant.

it requires users to declare any second-stage exporters explicitly, differentiating them from unused exporters

Your use case seems specific enough to have explicit dedicated section in the config. Also, the "second stage" term needs to be documented IMO.

@jpkrohling
Copy link
Member Author

What about other aspects of pipelines? They support processors and multiple exporters, for example.

Good point about the processors. Multiple exporters are indeed part of the design.

If you just need exporters then declaring a pipeline seems redundant.

To be honest, this was the least intrusive way I found to get the exporter instances ready and shared across the components. If we are going with a dedicated construct, it might have a bigger impact in the current code base, but it could look like this:

receivers:
  examplereceiver:

processors:
  exampleprocessor:

exporters:
  exampleexporter:
  exampleexporter/second-stage:
  exampleexporter/unused:

service:
  pipelines:
    traces:
      receivers: [examplereceiver]
      processors: [exampleprocessor]
      exporters: [exampleexporter]

extra: # what could be a better name here?
    traces:
      exporters: [exampleexporter/second-stage]

@tigrannajaryan
Copy link
Member

Unfortunately I am not sure I like either of the suggested approaches.

allow pipelines to have no receivers, so that they can "declare" the exporters that are used as second-stage exporters

I think it will be confusing to have a pipelines that consist of just exporters.

change the contract for GetExporters() so that it returns all exporters from the configuration, even if they aren't declared as part of any pipelines.

I am not sure how you can do this. Exporters are not created at all unless they are part of a pipeline. The Collector will not know how to create an exporter because the pipeline type dictates how the exporter is created.

Ideally the Collector should be able to figure out which exporters need to be created by looking at the routing exporter configuration. However, obviously we will not want to have a custom logic in the Collector config loader just for routing exporter. We will need some sort of generic mechanism for this but I do not know from the top of my head what's the best way to do this.

@jpkrohling
Copy link
Member Author

jpkrohling commented Aug 27, 2020

We will need some sort of generic mechanism for this but I do not know from the top of my head what's the best way to do this.

How about a new interface, like:

type WithDependentExporters interface {
    DependentExportersNames() []string // Like: {traces/exampleexporter/second-stage}
}

The config loader, once it loads the exporter, would check whether the exporter implements the interface and would add the returned dependencies to the list of exporters to instantiate. This would allow us to have a config without the extra part from the previous example:

receivers:
  examplereceiver:

processors:
  exampleprocessor:

exporters:
  exampleexporter:
  exampleexporter/second-stage:
  exampleexporter/unused:
  routing:
    default_exporters: 
      traces: [exampleexporter/second-stage]

service:
  pipelines:
    traces:
      receivers: [routing]
      processors: [exampleprocessor]
      exporters: [exampleexporter]

@nilebox
Copy link
Member

nilebox commented Aug 27, 2020

Alternatively, you may just embed "second stage" exporters into the routing exporter:

exporters:
  routing:
    embedded_exporters:
      # duplicates the top-level exporters config
      logging:
      otlp:
      jaeger/globex:
        endpoint: "https://jaeger.globex.example.com:14250"
      jaeger/ecorp:
        endpoint: "https://jaeger.ecorp.example.com:14250"
      saas/ecorp:
        endpoint: "https://saas.globex.example.com:8443"
        license: xyz
    # references to exporters from the embedded_exporters, not top-level exporters
    default_exporters: [otlp]
    table:
    - value: acme
      exporters: [logging]
    - value: globex
      exporters: [jaeger/globex, saas/globex]
    - value: ecorp
      exporters: [jaeger/ecorp]

This will require reusing code from the exporters builder

// NewExportersBuilder creates a new ExportersBuilder. Call BuildExporters() on the returned value.
func NewExportersBuilder(

but will give you a complete freedom on the preferred configuration format / flexibility.

In that case you don't need any changes in the core collector code at all.

@jpkrohling
Copy link
Member Author

One of the previous concerns was about reusing instances from the main collector. In your example, the routing exporter would have second-stage instances of its own, right?

I don't think it would be a problem, but perhaps @tigrannajaryan has a specific concern about it?

@tigrannajaryan
Copy link
Member

Alternatively, you may just embed "second stage" exporters into the routing exporter:

This is an option but makes the "second stage" exporters special. For example they will be no longer visible from GetExporters() functions to other components. I don't think this is desirable.

The config loader, once it loads the exporter, would check whether the exporter implements the interface and would add the returned dependencies to the list of exporters to instantiate.

This is also a possibility but requires changes to the core which I am not sure we want to do yet.

I would try something else for now. Instead of making this an exporter make it a routing processor, like this:

processors:
  routing:
    from_attribute: X-Tenant
    default_exporters: jaeger/default
    table:
    - value: acme
      exporters: [jaeger/acme]

service:
  pipelines:
    traces/routing:
      receivers: [otlp]
      processors: [routing]
      exporters:
      - jaeger/default
      - jaeger/acme

This way no changes are need in the core. This requires that exporters are listed twice: once in the "service.exporters" and again in the routing table of "processors.routing", which is a downside, but limits the awkwardness too routing component only.

The routing processor will obviously have to not propagate anything to its default consumer in the pipeline (which would be all of the exporters) and will only need to send the data to the exporter matching the routing table.

If the routing exporter see a lot of usage we can look into adding more support to the core to remove config redundancy. Long term we may want to have more flexible ways of chaining pipelines which may also handle the routing use case.

What do you think?

@jpkrohling
Copy link
Member Author

which is a downside, but limits the awkwardness too routing component only

We can catch bugs related to this in the routing processor and log accordingly.

The routing processor will obviously have to not propagate anything to its default consumer in the pipeline (which would be all of the exporters) and will only need to send the data to the exporter matching the routing table.

This could be confusing to users at first, as it would basically mean that this processor has exporter semantics. But sounds good. Can we determine whether the next consumer is a processor? If we detect that this is the case, we might want to log a warning as well (like: "the next consumer is a processor, but it won't ever receive any data").

That said, I'm +1 to this idea, especially because it allows us to start small and iterate based on feedback.

@tigrannajaryan
Copy link
Member

Can we determine whether the next consumer is a processor?

Yes, we can by checking that the next consumer implements the Processor interface (which is distinct from Exporter interface).

If we detect that this is the case, we might want to log a warning as well (like: "the next consumer is a processor, but it won't ever receive any data").

Good idea.

@tigrannajaryan
Copy link
Member

tigrannajaryan commented Sep 2, 2020

Closing this since we decided to go with a processor instead of an exporter for now.

jpkrohling added a commit to jpkrohling/opentelemetry-collector-contrib that referenced this issue Sep 3, 2020
jpkrohling added a commit to jpkrohling/opentelemetry-collector-contrib that referenced this issue Sep 3, 2020
jpkrohling added a commit to jpkrohling/opentelemetry-collector-contrib that referenced this issue Sep 7, 2020
jpkrohling added a commit to jpkrohling/opentelemetry-collector-contrib that referenced this issue Sep 7, 2020
@andrewhsu andrewhsu added the enhancement New feature or request label Jan 6, 2021
Troels51 pushed a commit to Troels51/opentelemetry-collector that referenced this issue Jul 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants