Add a Kubernetes Distribution #507

Merged

@TylerHelmuth TylerHelmuth commented Mar 21, 2024

This PR adds a new distribution specifically for monitoring Kubernetes. For full details see #357.

Tested in https://github.com/TylerHelmuth/opentelemetry-collector-releases:

Example helm chart values.yaml files using this distro (a daemonset, followed by a deployment):

mode: daemonset

clusterRole:
  create: true
  rules:
    - apiGroups: 
        - ""
      resources:
        - nodes/proxy
      verbs:
        - get

resources:
  limits:
    cpu: 200m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 256Mi

image:
  repository: thelmuth34/opentelemetry-collector-k8s
  tag: 0.97.7

command:
  name: otelcol-k8s

presets:
  logsCollection:
    enabled: true
  kubernetesAttributes:
    enabled: true
    extractAllPodLabels: true
    extractAllPodAnnotations: true
  kubeletMetrics:
    enabled: true

config:
  receivers:
    jaeger: null
    zipkin: null
    kubeletstats:
      collection_interval: 30s
      metric_groups:
        - node
        - pod
      metrics:
        k8s.node.uptime:
          enabled: true
        k8s.pod.uptime:
          enabled: true
        k8s.pod.cpu_limit_utilization:
          enabled: true
        k8s.pod.cpu_request_utilization:
          enabled: true
        k8s.pod.memory_limit_utilization:
          enabled: true
        k8s.pod.memory_request_utilization:
          enabled: true

  service:
    pipelines:
      traces:
        receivers: [otlp]
        exporters: [debug]
      metrics:
        exporters: [debug]
      logs:
        exporters: [debug]

ports:
  jaeger-compact:
    enabled: false
  jaeger-thrift:
    enabled: false
  jaeger-grpc:
    enabled: false
  zipkin:
    enabled: false

mode: deployment

image:
  repository: thelmuth34/opentelemetry-collector-k8s
  tag: 0.97.7

command:
  name: otelcol-k8s

resources:
  limits:
    cpu: 200m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 256Mi

# We only want one of these collectors - any more and we'd produce duplicate data
replicaCount: 1

presets:
  # enables the k8sclusterreceiver and adds it to the metrics pipelines
  clusterMetrics:
    enabled: true
  # enables the k8sobjectsreceiver to collect events only and adds it to the logs pipelines
  kubernetesEvents:
    enabled: true

config:
  receivers:
    k8s_cluster:
      collection_interval: 30s
    jaeger: null
    zipkin: null
  processors:
    transform/events:
      error_mode: ignore
      log_statements:
        - context: log
          statements:
            # adds a new watch-type attribute from the body if it exists
            - set(attributes["watch-type"], body["type"]) where IsMap(body) and body["type"] != nil

            # create new attributes from the body if the body is an object
            - merge_maps(attributes, body, "upsert") where IsMap(body) and body["object"] == nil
            - merge_maps(attributes, body["object"], "upsert") where IsMap(body) and body["object"] != nil

            # Transform the attributes so that the log events use the k8s.* semantic conventions
            - merge_maps(attributes, attributes[ "metadata"], "upsert") where IsMap(attributes[ "metadata"])
            - set(attributes["k8s.pod.name"], attributes["regarding"]["name"]) where attributes["regarding"]["kind"] == "Pod"
            - set(attributes["k8s.node.name"], attributes["regarding"]["name"]) where attributes["regarding"]["kind"] == "Node"
            - set(attributes["k8s.job.name"], attributes["regarding"]["name"]) where attributes["regarding"]["kind"] == "Job"
            - set(attributes["k8s.cronjob.name"], attributes["regarding"]["name"]) where attributes["regarding"]["kind"] == "CronJob"
            - set(attributes["k8s.namespace.name"], attributes["regarding"]["namespace"]) where attributes["regarding"]["kind"] == "Pod" or attributes["regarding"]["kind"] == "Job" or attributes["regarding"]["kind"] == "CronJob"

            # Transform the type attributes into OpenTelemetry severity types.
            - set(severity_text, attributes["type"]) where attributes["type"] == "Normal" or attributes["type"] == "Warning"
            - set(severity_number, SEVERITY_NUMBER_INFO) where attributes["type"] == "Normal"
            - set(severity_number, SEVERITY_NUMBER_WARN) where attributes["type"] == "Warning"


  service:
    pipelines:
      traces: null
      metrics:
        exporters: [ debug ]
      logs:
        processors: [ memory_limiter, transform/events, batch ]
        exporters: [ debug ]
        
ports:
  jaeger-compact:
    enabled: false
  jaeger-thrift:
    enabled: false
  jaeger-grpc:
    enabled: false
  zipkin:
    enabled: false
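
To make the effect of the transform/events statements in the deployment values concrete, here is a hedged walk-through on a made-up event. The input body and all of its values are assumptions for illustration; only the field names (type, object, metadata, regarding) come from the statements themselves. The expected block is what the statements should produce when read top to bottom. It is not captured collector output.

```yaml
# Hypothetical log record body, as a watch on Kubernetes Events might deliver it.
input:
  body:
    type: ADDED
    object:
      kind: Event
      type: Warning
      reason: BackOff
      metadata:
        name: my-pod.example-event
        namespace: default
      regarding:
        kind: Pod
        name: my-pod
        namespace: default

# Expected log record fields after transform/events (a sketch, not real output).
expected:
  attributes:
    watch-type: ADDED            # set from body["type"]
    kind: Event                  # merged from body["object"]
    type: Warning
    reason: BackOff
    metadata:                    # nested maps are kept as-is by the merge
      name: my-pod.example-event
      namespace: default
    regarding:
      kind: Pod
      name: my-pod
      namespace: default
    name: my-pod.example-event   # merged from attributes["metadata"]
    namespace: default
    k8s.pod.name: my-pod         # regarding.kind == "Pod"
    k8s.namespace.name: default
  severity_text: Warning
  severity_number: SEVERITY_NUMBER_WARN
```

Because body["object"] is present here, the first merge_maps statement (the one guarded by body["object"] == nil) is skipped and only the second one applies.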

Closes #357

@TylerHelmuth TylerHelmuth force-pushed the add-kubernetes-distribution branch 2 times, most recently from c24535a to 7322e3f on March 21, 2024 18:10
@TylerHelmuth TylerHelmuth marked this pull request as ready for review March 21, 2024 18:17
@TylerHelmuth TylerHelmuth requested a review from a team as a code owner March 21, 2024 18:17
@dmitryax dmitryax left a comment

The set of components LGTM assuming we remove the deprecated logging exporter
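
For context on the deprecated logging exporter: the Collector's `logging` exporter was deprecated in favor of `debug`, which is what the example values.yaml files in the description already use. A minimal sketch of a pipeline that exports with `debug`, assuming the otlp receiver is available in the distro; the verbosity setting is optional and shown only for illustration.

```yaml
receivers:
  otlp:
    protocols:
      grpc: {}

exporters:
  # `debug` replaces the deprecated `logging` exporter
  debug:
    verbosity: detailed   # optional; omit to use the default verbosity

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [debug]
```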

@TylerHelmuth

/cc @open-telemetry/operator-approvers

@jpkrohling jpkrohling left a comment

It would be great to have the operator folks review this as well.

@avillela , you had a list of components during your talk yesterday and while I believe they are all here already, I'd appreciate it if you could double check.

@jpkrohling jpkrohling left a comment

Thank you for making this work, @TylerHelmuth !

@swiatekm swiatekm left a comment

🚀

@avillela

> It would be great to have the operator folks review this as well.
>
> @avillela , you had a list of components during your talk yesterday and while I believe they are all here already, I'd appreciate it if you could double check.

Are you referring to the Collector components for k8s @jpkrohling?

@mx-psi mx-psi left a comment

My general comment is "let's start smaller", by releasing for fewer OSs, architectures and types of artifacts
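
As a rough illustration of what "starting smaller" could look like in a GoReleaser config, here is a hedged sketch that builds for one OS and two architectures and defines no deb/rpm packaging. The build id, directory, and the exact OS/architecture list are assumptions, not what this PR ended up shipping, and the real .goreleaser.yaml in this repository may be generated rather than hand-written.

```yaml
# Hypothetical, trimmed-down .goreleaser.yaml: one OS, two architectures,
# and no `nfpms` section (so no deb/rpm packages are produced).
project_name: otelcol-k8s
builds:
  - id: otelcol-k8s
    binary: otelcol-k8s
    dir: _build            # assumed build directory
    env:
      - CGO_ENABLED=0
    goos:
      - linux
    goarch:
      - amd64
      - arm64
```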

@TylerHelmuth

@mx-psi @swiatekm-sumo your comments on reducing the produced artifacts are valid, and annoying lol. I will play around with what changes are necessary in cmd/goreleaser to reduce the artifact scope for this, and future, distros.

@@ -13,3 +13,5 @@
#

* @open-telemetry/collector-contrib-approvers

distributions/otelcol-k8s/ @open-telemetry/collector-contrib-approvers @open-telemetry/helm-approvers @open-telemetry/operator-approvers

Shouldn't the operator maintainers be here as well?

Doesn't @open-telemetry/operator-approvers include the maintainers?

Correct. I went with -approvers instead of maintainers because that's how we do our own code owners file and it's how the otel site handles adding other teams as code owners. In my opinion helm and operator approvers are totally qualified to review PRs for this distro.

@TylerHelmuth is there anything we need to do to resolve the error message GitHub is giving us here?
[Screenshot 2024-04-04 at 11 06 11 AM]

Ya, we need to give those groups access to this repo. I'll open a community issue for that.

@TylerHelmuth

@open-telemetry/helm-approvers please review

@dmitryax dmitryax left a comment

LGTM

@andrzej-stencel

Shouldn't we add this distro to the main README.md file?

Current list of distributions:

@TylerHelmuth

@astencel-sumo I updated the README and removed some statements that are no longer universally true.

@evan-bradley evan-bradley changed the title from "Add a Kubernetes Distributions" to "Add a Kubernetes Distribution" on Apr 5, 2024
@TylerHelmuth

Ran another test against the latest commits in this branch (plus 1 extra commit to update the references to my repo/Docker Hub for the artifacts):

Artifacts: https://github.com/TylerHelmuth/opentelemetry-collector-releases/releases/tag/v0.97.9

Tested locally using the values.yaml files from the description with the updated tag, and I can see the collector running as expected.

@TylerHelmuth TylerHelmuth merged commit cc6bd2d into open-telemetry:main Apr 10, 2024
31 checks passed
@TylerHelmuth TylerHelmuth deleted the add-kubernetes-distribution branch April 10, 2024 17:20
@jaronoff97

🎉

Successfully merging this pull request may close these issues.

Distribution Proposal: Kubernetes-specific distribution