Benchmark for scaling in N of independent resources #1041
cs-many-resources/Dummy.cs
@@ -0,0 +1,15 @@
using Pulumi;
I'm a bit unsure about this being in the examples repo - that repo is a user-facing set of examples, and this is not a real example.
If we do decide we really need this to live in examples, I'd suggest nesting it inside some "infrastructure" or "misc" like folder so it's not as user-facing.
@stack72 thoughts on where to find a home for this?
- new pulumi-benchmarks repo
- put into pulumi/pulumi
Moving the code is trivial, but I'm a little unsure about cloning the YAML GitHub Actions definitions that run the code. Paul, I think you mentioned that those are really managed from some central code-generating repository; was it https://github.com/pulumi/ci-mgmt? Could you give me a few pointers so I get it done the way we like it?
For the immediate need right now I can definitely nest this so as not to distract the users looking for actual examples.
Note that benchmarking using component resources will not give the same results as benchmarking using custom resources. A single component resource requires 2 calls to the engine--one call to |
@pgavlin that's great - I think I'd like to check this in as-is, but I'm stuffing these ideas into:
And more generally https://github.com/pulumi/home/issues/1499. I'll specifically need your input on one of these; I will tag you.
* Initial sketch for Go
* Fix test expectations
* Make sure we are actually scaling resource counts
* Fix test compilation
* Int of course
* Avoid loop var closure capture
* Add configurations with smaller deadweight, more resources
* 4096 was too slow, try lower settings first
* 4096 timed out in 4h but can we have 512?
* Python version
* TypeScript version
* Add C# version
* Add all language versions
* Eliminate N stack outputs as that seems to skew the benchmark
* Fix asserts
* Fix the real reason C# came out different
* Commit to these settings
* Move from root to misc/benchmarks
* Remove redundant gitignores
* Remove env vars in favor of Pulumi config
* Remove IDE files
Fixes https://github.com/pulumi/home/issues/1467
Here are some numbers from the CI run:
The findings that stand out:
- at worst linear scaling observed
- for most slow numbers, the time is dominated by time_pulumi_api_ms - strongly indicating that we're simply being too chatty here, perhaps the deadweight diff handling and saving to service state is dominating this
- preview-initial is 38s with very small time_pulumi_api_ms - why? Hypothesis: installing networked dependencies and compiling, limited visibility into this
- destroy is massively slow, why would that be
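One way to sanity-check the "at worst linear scaling" finding is to fit the slope of log(time) against log(resource count): a slope near 1 suggests linear scaling, near 2 quadratic. The sketch below is only an illustration; the function name, the resource counts, and the timings are all hypothetical and are not taken from the CI run.

```python
import math


def scaling_exponent(counts, times_ms):
    """Least-squares slope of log(time) against log(resource count).

    A result near 1.0 suggests linear scaling in N, near 2.0 quadratic.
    """
    xs = [math.log(n) for n in counts]
    ys = [math.log(t) for t in times_ms]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Hypothetical data for illustration only (not measured numbers).
counts = [64, 128, 256, 512]
linear_ms = [1000.0 * n for n in counts]      # time grows like N
quadratic_ms = [2.0 * n * n for n in counts]  # time grows like N^2

if __name__ == "__main__":
    print("linear fit exponent:", scaling_exponent(counts, linear_ms))
    print("quadratic fit exponent:", scaling_exponent(counts, quadratic_ms))
```

With real measurements, a fixed per-run overhead (for example dependency installation) would pull the fitted slope below 1 at small N, so it is probably worth fitting only the larger resource counts.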
In brainstorming with @lukehoban earlier we wanted to focus on pulumi-update-empty numbers for the definitive benchmark with alert thresholds.
Does this PR sound like a reasonable benchmark? Should we remove deadweight and increase resource counts further? Or at least experiment with that?
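To make the alert-threshold idea concrete, here is a minimal sketch of one possible check: compare the latest pulumi-update-empty timing against the median of recent runs and flag it when it exceeds that baseline by some tolerance. The function name, the 25% tolerance, and the sample numbers are assumptions for illustration; nothing here reflects how alerting is actually wired up.

```python
from statistics import median


def exceeds_threshold(history_ms, latest_ms, tolerance=0.25):
    """Return True if latest_ms is more than `tolerance` above the median
    of history_ms. The 25% default is an arbitrary, assumed value."""
    baseline = median(history_ms)
    return latest_ms > baseline * (1.0 + tolerance)


if __name__ == "__main__":
    # Hypothetical timings in milliseconds, not real benchmark data.
    recent_runs = [100.0, 110.0, 90.0]
    print(exceeds_threshold(recent_runs, 130.0))
    print(exceeds_threshold(recent_runs, 120.0))
```

Using the median rather than the mean keeps a single slow CI run from shifting the baseline much, which seems useful given how noisy shared runners can be.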
How to run these benchmarks to update data
- Kick off https://github.com/pulumi/examples/actions/workflows/performance_metrics_cron.yml from the right branch; this puts data in S3
- Manually invoke a lambda function
- Alternatively, wait 2h and it will sync