K8S processor extensions #142

Closed
pmm-sumo wants to merge 27 commits

Conversation

pmm-sumo
Contributor

Description:

Extends the K8S processor to tag more attributes.

Motivation:

  1. The available set of tags is very helpful, but for many environments it is not enough. Support for more tags was added, which required traversing the owner tree and introduced caching and some additional complexity into the code.
  2. Frequently, there is already a taxonomy of K8S tags in use. Hence, a capability to provide custom tag names for each of the extracted fields was added.
  3. While limiting the tagging of pod labels and annotations to specified ones is very useful, there are cases where all of them should be extracted. A capability to do so was included.

The changes were applied in a way that stays compatible with existing configs.

Link to tracking Issue:

#141

Also, extracted the documentation from various sections into README.md, which is covered by
#97

Testing:

Unit tests were added for the new fields, and the collector was tested manually in several K8S environments.

Documentation:

Descriptions of the changes are included in the relevant sections. Also, README.md has been updated.

Contributor

@owais owais left a comment

Thanks. I think this or parts of it can be useful but I wish we had discussed the proposal before the implementation. My biggest concern is whether the additional API calls are worth fetching a few extra metadata tags. My opinion is that it's not worth it. We need to be very careful with using K8S API. In general we should minimize the amount of RBAC access and the number of API calls. Watch API works very well for this for large clusters but even that needs to be re-evaluated when it comes to performance of the collector and load it might put on the k8s master.

- statefulSetName
tags:
# It is possible to provide your custom key names for each of the extracted metadata:
containerId: k8s.pod.containerId
Contributor

It is not clear whether the key here refers to the tag that should be extracted from pod metadata or to the attribute key that will be added to the span. I think we should mention what the end result of this operation will be exactly. For example, this will extract field X and add it as attribute Y to the span.

Contributor Author

I updated the docs to hopefully make this clearer.

}
}
case "ReplicaSet":
replicaSetsAPI := op.clientset.ExtensionsV1beta1().ReplicaSets(namespace)
Contributor

Looks like we only extract replicaSetName. Not sure if the additional API call is worth it.

Contributor Author

On the other hand, in large clusters the number of ReplicaSets or Services will be much smaller than the number of pods, and with caching the number of API calls will not be that high.

I added an owner_lookup_enabled flag, which is disabled by default; those calls are only made when it is enabled.
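
For illustration only, here is a minimal sketch of such a gated, cached lookup; the names (ownerKey, ownerLookupEnabled, fetch) are placeholders and not the PR's actual code:

```go
// Hypothetical sketch: owner lookups are opt-in via a flag and cached so each
// distinct ReplicaSet/Service is fetched from the K8S API at most once.
package kube

import "sync"

type ownerKey struct{ namespace, kind, name string }

type ownerCache struct {
	ownerLookupEnabled bool                  // mirrors the owner_lookup_enabled config flag
	mu                 sync.RWMutex
	entries            map[ownerKey]string   // cached owner metadata (here just a name)
	fetch              func(ownerKey) string // placeholder for the actual API call
}

func newOwnerCache(enabled bool, fetch func(ownerKey) string) *ownerCache {
	return &ownerCache{ownerLookupEnabled: enabled, entries: map[ownerKey]string{}, fetch: fetch}
}

func (c *ownerCache) ownerName(k ownerKey) (string, bool) {
	if !c.ownerLookupEnabled {
		return "", false // expensive lookups are disabled by default
	}
	c.mu.RLock()
	name, ok := c.entries[k]
	c.mu.RUnlock()
	if ok {
		return name, true
	}
	name = c.fetch(k) // one API call per distinct owner, then cached
	c.mu.Lock()
	c.entries[k] = name
	c.mu.Unlock()
	return name, true
}
```

A read lock on the hot path keeps repeated lookups cheap; only a cache miss pays for the API call and the write lock.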

}
}
case "Service":
servicesAPI := op.clientset.CoreV1().Services(namespace)
Contributor

How important is this? Can multiple services point to the same pod? If so, can we tell which service was responsible for routing the request to the pod that resulted in the span?

Contributor Author

That's a very fair question. I think this is a viable scenario (though probably infrequent); I need to research it a bit more. The endpoint address would change in such a case, but it is not being used for assigning the metadata anyway...

Contributor Author

I looked into it and ran a quick test, and it seems there is really no easy way to attribute the proper Service name when several of them point to the same pod. In some cases it could be extracted from the URL (e.g. for HTTP), but I believe adding such a capability would be somewhat complex and not worth the cost. Hence, maybe we should leave it as a current limitation?

Contributor Author

I refactored this significantly and changed the way services are retrieved. If several of them point to the pod, their names are now comma-separated.
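
As a rough illustration of that behaviour (not the PR's actual code), the attribute value could be built along these lines:

```go
// Illustrative sketch: join the names of all Services selecting the pod into a
// single, deterministically ordered, comma-separated attribute value.
package kube

import (
	"sort"
	"strings"
)

func servicesAttribute(serviceNames []string) string {
	names := append([]string(nil), serviceNames...) // copy so the caller's slice is untouched
	sort.Strings(names)                             // deterministic order
	return strings.Join(names, ",")                 // e.g. "svc-auth,svc-auth-canary"
}
```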

}

// GetOwners fetches deep tree of owners for a given pod
func (op *OwnerCache) GetOwners(pod *api_v1.Pod) []*ObjectOwner {
Contributor

Looks like these calls ignore any filters users might have applied?

Contributor Author

Yeah, my way of thinking was that filters impact the choice of pods only. If a pod went through filtering, it's better to be consistent and pull all of its owner data. Do you think we should still apply the filters further up the stream?

Contributor

If we do implement the feature then yes, I think filtering should be respected, as that's what users would expect. It also helps when Otel is deployed as an agent, since the agent can then retrieve/watch only the resources running on the same node.

Contributor Author

Thanks @owais! BTW, I gave the general solution some thought - perhaps we could add a flag that controls whether the expensive calls to the K8S API are made at all? (This is the owners flag, but it is definitely in the wrong place right now.)

@pmm-sumo
Contributor Author

pmm-sumo commented Mar 21, 2020

Thanks. I think this or parts of it can be useful but I wish we had discussed the proposal before the implementation. My biggest concern is whether the additional API calls are worth fetching a few extra metadata tags. My opinion is that it's not worth it. We need to be very careful with using K8S API. In general we should minimize the amount of RBAC access and the number of API calls. Watch API works very well for this for large clusters but even that needs to be re-evaluated when it comes to performance of the collector and load it might put on the k8s master.

Thank you @owais! I agree - I had those concerns myself. Unfortunately, in my setting I had to pull this information anyway. On a cluster with thousands of pods it didn't seem to have a significant impact. I added extra caching to make it easier on the cluster, but of course it all comes at a price.

I am sorry for bringing this up without earlier discussion - I was curious how hard it would be to pull the tags I needed, ended up with something that worked, and brought it here. I will be happy to follow any suggestions on what to incorporate and how.

@pmm-sumo pmm-sumo force-pushed the k8s-extensions branch 2 times, most recently from 327f1cd to ce06837 Compare April 2, 2020 19:12
@pmm-sumo pmm-sumo changed the title [Proposal] K8s extensions [WIP] K8s extensions Apr 2, 2020
@pmm-sumo
Contributor Author

pmm-sumo commented Apr 2, 2020

@owais I refactored the code to use informers and, at the same time, reduced the complexity by a fair amount. I want to keep testing it in a real-life environment for a bit, but I think it is getting close.
One limitation I am aware of is that in some cases, when information about a Pod arrives earlier than, e.g., information about its ReplicaSet or a Service, the Pod will not be updated. I am wondering whether tracking this or refreshing all pods on each update would be too expensive, but perhaps it should be added (at least for Services). What do you think?
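
For context, a minimal sketch of the kind of client-go shared informer this refactoring refers to (the handler body and resync period are placeholders, not the PR's code):

```go
// Illustrative sketch: watch Services through a shared informer instead of
// issuing per-pod API calls; events keep a local cache up to date.
package main

import (
	"time"

	api_v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

func watchServices(clientset kubernetes.Interface, stopCh <-chan struct{}) cache.SharedIndexInformer {
	factory := informers.NewSharedInformerFactory(clientset, 10*time.Minute)
	informer := factory.Core().V1().Services().Informer()
	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			if svc, ok := obj.(*api_v1.Service); ok {
				_ = svc // placeholder: update the pod -> service mapping here
			}
		},
	})
	factory.Start(stopCh) // starts the watch goroutines in the background
	return informer
}
```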

@pmm-sumo pmm-sumo changed the title [WIP] K8s extensions K8S processor extensions Apr 3, 2020
@pmm-sumo
Contributor Author

pmm-sumo commented Apr 7, 2020

I believe it's ready for the next review iteration, @owais. I am keeping it up to date with recent changes in the master tree. Should I squash the commits?

@owais
Contributor

owais commented Apr 16, 2020

@pmm-sumo Thanks so much for contributing this, but I'm still not convinced it's worth adding more API calls and so much more code and so many new concepts just for a few additional tags. The new tags are useful but IMO don't outweigh the cost. I think we should see if we can extract some of these fields from the events we already get from the existing watch API calls.

Also, there is a plan to split the k8s API client out of this package into an extension. That extension would then interact with the k8s cluster and cache interesting information. All otel components could then query this extension to fetch information about any resource. I know of at least one other planned component that will need to watch resources like deployments. Once that is ready, we'll remove all client code from this package and take a dependency on the extension instead.

Considering everything I said above, I don't think we should merge this, at least at this point. If you'd be interested in the k8s extension, I'll keep you in the loop when the effort starts so you can share your input. Until then, if you really need this feature, I'd suggest forking this processor and including it in a custom build from the contrib repo.

@owais
Contributor

owais commented Apr 16, 2020

@pmm-sumo some of these tags can already be extracted without adding any new calls. We watch pod resources, and the Pod type has an owner references field. We can fetch all owners from that field for every pod. It tells us the type/kind of each owner, whether the pod is owned by a controller or not, and the name of the owner. I think this is good enough and gets us the name of the job, replicaset/deployment, daemonset, or controller for zero extra cost. We should try to leverage this as a first step.

https://pkg.go.dev/k8s.io/api/core/v1?tab=doc#Pod
https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1?tab=doc#ObjectMeta
https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1?tab=doc#OwnerReference
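
For illustration (not code from this PR), a minimal sketch of reading the owner references straight off a watched Pod, using the types linked above; the attribute key names are made up:

```go
// Illustrative sketch: collect owner kind/name pairs from a watched Pod's
// metadata, with no additional API calls.
package main

import api_v1 "k8s.io/api/core/v1"

func ownerTags(pod *api_v1.Pod) map[string]string {
	tags := map[string]string{}
	for _, ref := range pod.OwnerReferences { // promoted from the embedded ObjectMeta
		// ref.Kind is e.g. "ReplicaSet", "DaemonSet", "Job", "StatefulSet"
		tags["k8s."+ref.Kind+".name"] = ref.Name
		if ref.Controller != nil && *ref.Controller {
			tags["k8s.owner.controller"] = ref.Kind + "/" + ref.Name
		}
	}
	return tags
}
```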

Owner references contain names that look like this:

svc-auth-86bbdf5695 // replicaset
otel-agent // daemonset
job-cassandra-schema-v4879 // job

We can extract two pieces of information from this:

  1. The generation name of the job or replicaset (deployment). So 86bbdf5695 for svc-auth, v4879 for the cassandra schema job.
  2. The resource name itself i.e svc-auth, otel-agent, job-cassandra-schema.

Sidenote: the Kubernetes resource naming scheme might not be covered by the API stability policy, but it's probably more reliable, and we already rely on it to extract info using regex. I suggest we use this to extract owner names instead.
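
A hedged sketch of the regex-based extraction suggested here; the suffix patterns are assumptions about ReplicaSet hashes and "v"-numbered job suffixes, not the processor's actual rules:

```go
// Illustrative sketch: strip a trailing generation suffix from an owner name,
// e.g. "svc-auth-86bbdf5695" -> ("svc-auth", "86bbdf5695") and
// "job-cassandra-schema-v4879" -> ("job-cassandra-schema", "v4879").
package main

import "regexp"

// Assumed patterns: a ReplicaSet-style hash or a "v"-prefixed numeric suffix.
var generationSuffix = regexp.MustCompile(`-([0-9a-z]{8,10}|v\d+)$`)

func splitOwnerName(ownerName string) (resource, generation string) {
	if loc := generationSuffix.FindStringSubmatchIndex(ownerName); loc != nil {
		return ownerName[:loc[0]], ownerName[loc[2]:loc[3]]
	}
	return ownerName, "" // e.g. a DaemonSet name like "otel-agent" has no suffix
}
```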

@pmm-sumo
Contributor Author

Thanks @owais, I very much appreciate your input and all the comments along the way.

We watch pod resources, and the Pod type has an owner references field. We can fetch all owners from that field for every pod. It tells us the type/kind of each owner, whether the pod is owned by a controller or not, and the name of the owner. I think this is good enough and gets us the name of the job, replicaset/deployment, daemonset, or controller for zero extra cost. We should try to leverage this as a first step.

I think I might be missing something. This is what I am doing in this PR too. It's just that I am going deeper than just the first level of owner references. I.e. it has this: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/142/files#diff-2098aba65eeda8d53f8640d5bd56b512R254 , which goes through the owner references: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/142/files#diff-15f0aed614da53b657cbf849d575988eR232-R259 - leveraging the information collected through informers: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/142/files#diff-15f0aed614da53b657cbf849d575988eR103-R121
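
To make the "deeper than the first level" idea concrete, here is a hedged sketch (simplified stand-in types, not the PR's actual code) of a transitive walk over owner references backed by an informer-fed cache:

```go
// Illustrative sketch: resolve owners transitively (Pod -> ReplicaSet ->
// Deployment, ...) using only metadata already cached from informers.
package kube

import meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// ObjectOwner is a simplified stand-in for the cached owner metadata.
type ObjectOwner struct {
	Kind      string
	Name      string
	OwnerRefs []meta_v1.OwnerReference
}

// lookupFunc resolves an owner reference from the local cache (assumed helper).
type lookupFunc func(ref meta_v1.OwnerReference) (*ObjectOwner, bool)

func deepOwners(refs []meta_v1.OwnerReference, lookup lookupFunc) []*ObjectOwner {
	var owners []*ObjectOwner
	visited := map[string]bool{} // avoid cycles and duplicates, keyed by UID
	queue := append([]meta_v1.OwnerReference(nil), refs...)
	for len(queue) > 0 {
		ref := queue[0]
		queue = queue[1:]
		if visited[string(ref.UID)] {
			continue
		}
		visited[string(ref.UID)] = true
		owner, ok := lookup(ref)
		if !ok {
			continue // owner not (yet) present in the cache
		}
		owners = append(owners, owner)
		queue = append(queue, owner.OwnerRefs...) // follow the next level up
	}
	return owners
}
```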

Considering everything I said above, I don't think we should merge this, at least at this point. If you'd be interested in the k8s extension, I'll keep you in the loop when the effort starts so you can share your input. Until then, if you really need this feature, I'd suggest forking this processor and including it in a custom build from the contrib repo.

That sounds OK. So, per the suggestion, I am going to close this PR and keep using the changes in my fork.

I have noticed that several larger contributions were added recently, and I realised they cover a similar area, so it is natural that the common code should be extracted. I would love to be a part of that. If there are any side discussions or events, feel free to ping me at [email protected]. Thank you!

@pmm-sumo pmm-sumo closed this Apr 17, 2020
@pmm-sumo pmm-sumo deleted the k8s-extensions branch May 27, 2020 20:09
mxiamxia referenced this pull request in mxiamxia/opentelemetry-collector-contrib Jul 22, 2020
ljmsc referenced this pull request in ljmsc/opentelemetry-collector-contrib Feb 21, 2022
bogdandrutu pushed a commit that referenced this pull request May 12, 2022
bjrara pushed a commit to bjrara/opentelemetry-collector-contrib that referenced this pull request Apr 3, 2024