Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce telegrafreceiver #2845

Conversation

pmalek-sumo
Copy link
Contributor

Description: Introduce telegraf receiver in unstable components set

This PR introduces telegrafreceiver that can use telegraf input plugins to ingest metrics and then forwards it to otc pipeline for processing and export.

An exemplar config that can be used for testing:

receivers:
  telegraf:
    separate_field: false
    agent_config: |
      [agent]
        interval = "3s"
        flush_interval = "3s"
      [[inputs.mem]]
      [[inputs.disk]]

exporters:
  file:
    path: ./out.json

service:
  pipelines:
    metrics:
      receivers: [telegraf]
      exporters: [file]

For now only telegraf.Gauge metric type is support. If the receiver gets accepted or there's need to add other types of metrics in this PR then that surely can be added.

Testing: Added unit tests for the receiver.

Documentation: Added a README explaining how to configure the receiver.

@pmalek-sumo pmalek-sumo requested a review from a team as a code owner March 25, 2021 09:23
@pmalek-sumo pmalek-sumo changed the title Introduce telegrafreceiver upstream pr Introduce telegrafreceiver Mar 25, 2021
@pmalek-sumo
Copy link
Contributor Author

This PR doesn't yet reflect changes from https://github.com/open-telemetry/opentelemetry-proto/releases/tag/v0.8.0 (which will impact this PR) because those changes from proto definitions were not yet propagated to https://github.com/open-telemetry/opentelemetry-collector/blob/4ae0fc11491d5dff0fa54c79ec2bacc3beeeaf62/consumer/pdata/metric.go#L151-L160

@pmm-sumo
Copy link
Contributor

pmm-sumo commented Mar 25, 2021

To give more context - we've been wondering if there's a way to use Telegraf with OpenTelemetry Collector within a single process (for ease of deployment and management), since they are both developed in Go. There were some changes required to the main tree (kept in a fork for the purpose of the receiver), so it's not a very clean solution, but if the community agrees that it's something worth having, we could work on making it more robust.

I personally think this is an interesting idea, however at the same time I think that long term we should have a matching set of capabilities available in OpenTelemetry Collector via dedicated receivers and integrations. Until that happens, reusing Telegraf's work opens up the ability to use its 200+ integrations. Note: this adds ~55 MB to the executable size

@rmfitzpatrick
Copy link
Contributor

This is an interesting idea and I can easily see its value. I am curious about the reliance on your fork, and in general the integration with telegraf's substantial dependency requirements, which may introduce nontrivial maintenance requirements and subtle incompatibilities that may need to be addressed regularly.

Though not utilizing the same go executable, would it be less work with similar gains to add an OTLP output plugin and run the Collector side by side?

@pmalek-sumo
Copy link
Contributor Author

I am curious about the reliance on your fork

This is intentionally there to mitigate the dependency incompatibilities. We are aware that this PR introduces with it a hefty baggage of dependencies which might be seen as a price to pay for having telegraf agent run in the same process.

Though not utilizing the same go executable, would it be less work with similar gains to add an OTLP output plugin and run the Collector side by side?

This is another approach we've briefly looked at but we decided to put it aside because it would require serialization and deserialization of metrics (to/from OTLP or any other format chosen) and also it would require some non insignificant amount of work to orchestrate running telegraf agent alongside otc.

@tigrannajaryan
Copy link
Member

@pmalek-sumo @pmm-sumo I think this is an interesting approach, but I have a few concerns:

  • Telegraf is a very large dependency, which has a lot of implications when bringing it to the project, including potential dependency conflicts, significant increase in executable size. Generally such large dependencies have high cost of maintenance.

  • The fact that you had to fork Telegraf to make this work reinforces the above point. Once forked maintaining it is costly and there is a chance that it will fall behind and will make Collector vulnerable since we may not be able to update the dependency quickly enough when security vulnerabilities are discovered in Telegraf.

Given the above I am reluctant with having this receiver as something that I will need to maintain and be responsible for in this repository as a maintainer.

We have a few possible ways forward:

  1. Keep this receiver outside Otel github org. We are happy to link to it and https://github.com/open-telemetry/opentelemetry-collector-builder can be used to create custom builds of Collector that include this receiver. This removes the need for Collector maintainers to be responsible for it (which is a responsibility I am not comfortable with given the above issues I referred to).

  2. Explore the possibility to contribute OTLP output plugin to Telegraf. This way Telegraf and Otelcol can be used as a pair. This would be more similar to the Fluentbit use-case that you referred to. The extra serialization/deserialization is a downside but may not be a dealbreaker for many, since OTLP is a pretty efficient protocol. The management of the additional executable remains a concern, though perhaps it can be handled similarly to how Fluentbit extension does it. Obviously this is more cumbersome for users, but it is a possible approach.

  3. Look at what dependencies are conflicting and see if it is possible to avoid forking and instead depend on original Telegraf. This would significantly reduce the maintenance costs. The executable size problem will still remain though so I am not sure even in that case we can make this receiver part of the default build of the Collector. However in this situation maintainability concerns are lower and we can consider hosting the source code of the receiver in this repo (but still exclude from the default build).

We can discuss this live. I am open to other possibilities. I understand the significant value that this receiver brings, so if we can find ways around the concerns I posted maybe we can have a better solution.

cc @bogdandrutu

@bogdandrutu
Copy link
Member

@pmalek-sumo at one point you may want to move this to your repo :)))

@frankreno
Copy link

@tigrannajaryan Completely understand the points above. I believe there is a quite a lot of value that this adds but recognize the challenges you have called out. I think a discussion live is a great idea to see if we can overcome those hurdles.

@jacobmarble
Copy link
Contributor

jacobmarble commented Mar 31, 2021

@pmalek-sumo @frankreno @pmm-sumo I caught the last 1 minute or so of the conversation in the Metrics SIG this morning. I work at InfluxData, and have some work in progress that you should probably be aware of (I just became aware of this PR, haven't looked over it yet).

https://github.com/influxdata/influxdb-observability

So far, I have:

There are rough edges. For example, otelcol builds its own Protobuf objects, so otel2influx generates its own, and passes unsafe pointers. It would be great if otelcol instead depended on opentelemetry-proto-go.

Ideally, your Telegraf receiver internals would be hosted in the same repository (influx2otel?) so that one canonical implementation exists from the beginning. I would love to chat with anyone about this over zoom.


metrics := ilm.Metrics()

switch t := m.Type(); t {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thinking about this in the context of our Zoom conversation earlier.

The Prometheus Telegraf input plugin is the only Telegraf input plugin that sets this type field. Everything else defaults to "untyped". To be sure, that Telegraf plugin is very popular, but otelcol already has a Prometheus receiver (or two), so those users won't benefit much from this switch block.

I talked with someone on the Telegraf team, and we agreed that Gauge is probably the best we can do here.

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@tigrannajaryan
Copy link
Member

Closing as we agreed this approach is not what we want to do right now.

alexperez52 referenced this pull request in open-o11y/opentelemetry-collector-contrib Aug 18, 2021
* Replace InitEmptyWithCapacity with EnsureCapacity and Clear

Signed-off-by: Bogdan Drutu <[email protected]>

* Call Clear always

Signed-off-by: Bogdan Drutu <[email protected]>
@sumo-drosiek sumo-drosiek deleted the introduce-telegrafreceiver-upstream-pr branch March 1, 2024 14:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants