Transformer v5.7.0 is utilizing much more memory and cpu since version 5.3.0 #1308

Open
dkbrkjni opened this issue Aug 7, 2023 · 2 comments

Comments

@dkbrkjni
Contributor

dkbrkjni commented Aug 7, 2023

I just upgraded transformer-kinesis and databricks-loader from version 5.3.0 to version 5.7.0. Before and after the upgrade I ran a stress test using Taurus. After both test runs I looked at CPU and memory utilization during the run, and the result was a little surprising. CPU utilization went from approximately 13 percent on v5.3.0 to 60 percent (peaking at 90 percent), and memory went from 23 percent to 65 percent (peaking at 95 percent).

I have been looking at release notes between version 5.3.0 and 5.7.0 but cannot find anything that should justify, from the outside, this huge increase in resource utilization.

I would like confirmation that this is intentional and not a bug in transformer-kinesis.

@istreeter
Contributor

Hi @dkbrkjni, thanks for the report.

In between 5.3.0 and 5.7.0 we updated a core runtime library (#1192) so I'm not surprised to hear there are differences according to some measurements. But I had not heard about any performance regression for the measurements we were looking at.

Regarding memory usage: we normally configure JVM apps using heap settings like -Xmx2g -Xms2g. The process typically uses ~500MB extra for non-heap memory, so I typically set the -Xmx flag to ~500MB less than the total memory available. I observe that the process quickly uses all the memory assigned to it and then stays at a steady value.
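To make the sizing rule above concrete, here is a minimal sketch of the arithmetic. The function name `heap_flags` and the 2548MB total are hypothetical, chosen only for illustration. The ~500MB non-heap allowance is the rough figure mentioned above, not a measured value for any particular deployment.

```python
def heap_flags(total_mb: int, non_heap_mb: int = 500) -> str:
    """Build -Xmx/-Xms flags that leave non_heap_mb of memory outside the heap.

    total_mb is the memory available to the process (hypothetical input);
    non_heap_mb defaults to the ~500MB rule of thumb from this thread.
    """
    heap_mb = total_mb - non_heap_mb
    if heap_mb <= 0:
        raise ValueError("total memory must exceed the non-heap allowance")
    # Setting -Xms equal to -Xmx mirrors the "-Xmx2g -Xms2g" style shown above.
    return f"-Xmx{heap_mb}m -Xms{heap_mb}m"


# Hypothetical machine with 2548MB available to the process.
print(heap_flags(2548))
```

With these assumed numbers the heap comes out at 2048MB, which matches the 2g example in the comment.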

Do you configure the heap size when you run the transformer and loader? To understand the problem you describe, it would be helpful to know the max size of your heap, and then how much total memory the transformer/loader uses.

Regarding CPU utilization: CPU usage is closely linked to how many events the transformer is processing per second. If the event volume goes up, then CPU usage also goes up. The metric we normally measure is "how many events per second can the transformer process when we drive CPU up to 100%". In other words, we feed the transformer with a very large number of events so that CPU goes to 100% and then check the throughput.

As far as I'm aware, there was no decrease in that metric since 5.3.0. I know this is slightly different to the question you were asking, which was about CPU usage at less-than-full capacity.
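One way to compare CPU usage at less-than-full capacity is to normalize it by throughput. The sketch below does that with the average CPU figures from the original report (13 percent and 60 percent). The event rate is not given anywhere in this thread, so `EVENTS_PER_SEC` is a hypothetical placeholder, and the helper name is likewise an illustration only.

```python
def cpu_per_thousand_events(cpu_percent: float, events_per_sec: float) -> float:
    """CPU percent consumed per 1000 events/sec of throughput."""
    if events_per_sec <= 0:
        raise ValueError("events_per_sec must be positive")
    return cpu_percent / (events_per_sec / 1000.0)


# Hypothetical event rate, assumed identical in both stress-test runs.
EVENTS_PER_SEC = 1000.0

before = cpu_per_thousand_events(13.0, EVENTS_PER_SEC)  # v5.3.0 average CPU
after = cpu_per_thousand_events(60.0, EVENTS_PER_SEC)   # v5.7.0 average CPU

print(f"relative CPU cost per event: {after / before:.1f}x")
```

If the event rate really was the same in both runs, the ratio does not depend on the placeholder value, only on the two CPU percentages.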

@dkbrkjni
Contributor Author

Hi @istreeter, thanks for your response.

I have configured the heap for the container that runs the enricher using the MaxRAMPercentage parameter: -XX:MaxRAMPercentage=85.0.
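As a rough illustration of what this setting implies: the JVM caps the heap at the given percentage of the memory it sees, so the remainder is what is left for non-heap memory. The container sizes below are hypothetical, since the actual container size is not stated in this thread, and the helper name is made up for the sketch.

```python
def non_heap_headroom_mb(container_mb: float, max_ram_percentage: float) -> float:
    """Memory left outside the max heap for a given MaxRAMPercentage."""
    heap_mb = container_mb * max_ram_percentage / 100.0
    return container_mb - heap_mb


# Hypothetical container sizes; the real one is not given in the issue.
for container_mb in (1024, 2048, 4096):
    headroom = non_heap_headroom_mb(container_mb, 85.0)
    print(f"{container_mb}MB container -> {headroom:.1f}MB outside the heap")
```

Under these assumptions, an 85 percent setting leaves less than the ~500MB non-heap allowance mentioned earlier in the thread for any container smaller than roughly 3.3GB.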

Regarding CPU: it makes sense that the number of events affects CPU usage; however, the number of events processed in both Taurus runs was roughly the same.

2 participants