
Disable useDefineForClassFields for ~10% memory and 10-16% performance improvement #3884

Open · wants to merge 3 commits into base: main
Conversation


@taj-p taj-p commented May 31, 2024

Intent

By disabling useDefineForClassFields in Mobx build artifacts, we can:

  • save up to ~10% memory
  • improve performance by ~10-15%
  • reduce bundle size by ~1.7% (in uncompressed cjs)

(Consumer code still requires useDefineForClassFields to be enabled in order to use decorators - see the "Won't this break decorators?" heading below.)

With useDefineForClassFields: true, the build outputs code that looks like the following:

var ObservableValue = /*#__PURE__*/function (_Atom) {
  _inheritsLoose(ObservableValue, _Atom);
  function ObservableValue(value, enhancer, name_, notifySpy, equals) {
    ...
    _this = _Atom.call(this, name_) || this;
    _this.enhancer = void 0;
    _this.name_ = void 0;
    _this.equals = void 0;
    _this.hasUnreportedChange_ = false;
    _this.interceptors_ = void 0;
    _this.changeListeners_ = void 0;
    _this.value_ = void 0;
    _this.dehancer = void 0;
    _this.enhancer = enhancer;
    _this.name_ = name_;
    _this.equals = equals;
    _this.value_ = enhancer(value, undefined, name_);
    ...

This results in instances that look like:

[screenshot: an ObservableValue instance with all 16 members allocated, several of them undefined]

Of ObservableValue's 16 members, 5 are initially undefined but are still allocated as keys on the instance. In many codebases (including ours), most MobX class instances have at least a couple of unused members (such as undefined listeners/interceptors) whose memory we pay for hundreds of thousands of times.

When we disable useDefineForClassFields, those unused members aren't allocated on the class instance until they're used, because we no longer need to define them ahead of time in the constructor:

[screenshot: an ObservableValue instance with the unused members absent]
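For comparison, here is roughly what the same constructor excerpt compiles to with the flag disabled (a simplified sketch, not the exact build output) - the void 0 placeholder assignments disappear, and rarely-used fields are only created when first assigned:

var ObservableValue = /*#__PURE__*/function (_Atom) {
  _inheritsLoose(ObservableValue, _Atom);
  function ObservableValue(value, enhancer, name_, notifySpy, equals) {
    ...
    _this = _Atom.call(this, name_) || this;
    // Only fields that are actually assigned appear here; interceptors_,
    // changeListeners_, dehancer, etc. are added later, on first use.
    _this.hasUnreportedChange_ = false;
    _this.enhancer = enhancer;
    _this.name_ = name_;
    _this.equals = equals;
    _this.value_ = enhancer(value, undefined, name_);
    ...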

The reason is that when useDefineForClassFields is enabled, all potentially used fields are defined ahead of time (as in the example below) - this costs both memory (for the unused allocations) and performance (for initialising unused properties).

class C {
  constructor() {
    // Every declared field is defined on the instance up-front - even one whose initial value is undefined - and we pay the cost of each defineProperty call
    Object.defineProperty(this, "foo", {
      enumerable: true,
      configurable: true,
      writable: true,
      value: 100,
    });
    ...
  }
}
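For contrast, a simplified sketch (not actual compiler output) of what the same class compiles to with useDefineForClassFields disabled - plain assignment ("set") semantics instead of defineProperty:

class C {
  constructor() {
    // Plain assignment instead of Object.defineProperty
    this.foo = 100;
    // A field declared without an initialiser produces no constructor code,
    // so no key is allocated on the instance until it is first assigned.
  }
}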

Won't this break decorators?

So long as consumers enable useDefineForClassFields, there is no loss of functionality. Skipping these defineProperty calls in every MobX constructor has no bearing on whether consumer classes use defineProperty. So the proposed changes don't affect the decorator bookkeeping that is performed on consumer code (where useDefineForClassFields is enabled).

To emphasize: disabling useDefineForClassFields only changes the behaviour of our build outputs - consumers still need to enable useDefineForClassFields to use decorators (and for MobX decorator enrichment/bookkeeping/checking to work).

Although this departs from how consumer code may be built, disabling useDefineForClassFields gives consumers appreciable performance and memory improvements, which I think may be more important.
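To illustrate the consumer side, here is a hypothetical consumer class (annotation style as in the MobX 6 docs; not code from this PR). It is compiled by the consumer's own toolchain with useDefineForClassFields enabled, so decorator bookkeeping keeps working regardless of how the MobX package itself was compiled:

import { makeObservable, observable, action } from "mobx"

// Consumer code, compiled with useDefineForClassFields: true
class Todo {
  @observable title = ""
  @observable done = false

  constructor() {
    makeObservable(this)
  }

  @action
  toggle() {
    this.done = !this.done
  }
}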

Note: the test suite has been amended to enable useDefineForClassFields in order to correctly test decorators. Disabling the flag there would only break the consumer-style generated JavaScript used by those tests. I'm happy to manually transpile class constructors in the test suite to simulate useDefineForClassFields.

For reference, I am able to use this branch without issue in my large codebase that uses thousands of decorators.

Memory Savings: ~10-20%

Each undefined reference consumes 28 bytes according to this memory heap snapshot. But, to be honest, I don't know if I'm reading this right. In any case, there's some cost to having an undefined value and some cost attributed to the name of the property.

[screenshot: heap snapshot showing the size of an undefined property reference]

To test the memory improvements, I created the test program below, which allocates a million boxes and observes them all in a single autorun. We achieved a 10% reduction in total heap size, a 23% reduction in ObservableValue shallow size, and an 11% reduction in ObservableValue retained size.

Test program
<html lang="en">

<head>
  <meta charset="UTF-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <title>MobX Example</title>
  <script src="../../dist/mobx.umd.production.min.js"></script>
</head>

<body>
  <h1>MobX Testing</h1>
  <script>
    const {observable, autorun} = mobx

    const boxes = []
    // Prevent boxes gc
    window["boxes"] = boxes

    for (let i = 0; i < 1e6; i++) {
      const box = observable({
        width: 100,
        height: 100,
        color: Math.random() > 0.5 ? "blue" : "red"
      })
      boxes.push(box)
    }

    autorun(() => {
      let widths = 0
      let heights = 0
      let blues = 0
      boxes.forEach(box => {
        widths += box.width
        heights += box.height
        blues += box.color === "blue" ? 1 : 0
      })

      console.log(
        `Total width: ${widths}, total height: ${heights}, total blue boxes: ${blues}`
      )
    })

    setTimeout(() => {
      boxes.forEach(box => {
        box.width = Math.random() * 100
        box.height = Math.random() * 100
        box.color = "red"
      })
    }, 500)
  </script>
</body>

</html>

Results

Metric                          Before (MB)   After (MB)   %
Total Heap Size                 803           719          -10.46%
ObservableValue shallow size    104           80           -23.08%
ObservableValue retained size   752           668          -11.17%

Screenshots

Note: "t" in these screenshots is the minified name of ObservableValue (I used a production build for testing).

Before: Total 803 MB, ObservableValue: 104 MB

[screenshots: heap snapshot before the change]

After: Total 719 MB, ObservableValue: 80 MB

[screenshots: heap snapshot after the change]

Performance Improvements: ~12-15%

Running the performance suite a few times shows a significant improvement when disabling useDefineForClassFields: somewhere between 12-16%. By not declaring these fields upfront, instances will go through a few more shape transitions; but it seems that defining those properties ahead of time has a larger performance cost.

I think the performance improvement comes from no longer running a batch of Object.defineProperty calls in each class constructor.
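As a rough, hypothetical micro-benchmark of where that cost comes from (names made up; numbers will vary by engine), compare defining a single unused field up-front via Object.defineProperty against skipping the definition entirely:

class WithDefine {
  constructor() {
    // Mimics the useDefineForClassFields: true output for one unused field
    Object.defineProperty(this, "interceptors_", {
      enumerable: true,
      configurable: true,
      writable: true,
      value: undefined
    })
  }
}

class WithoutDefine {
  constructor() {
    // Field left undeclared; it would only be assigned when first used
  }
}

let keep
function bench(Ctor, label) {
  const start = performance.now()
  for (let i = 0; i < 1e6; i++) keep = new Ctor()
  console.log(label, (performance.now() - start).toFixed(1), "ms")
}

bench(WithDefine, "defineProperty per instance:")
bench(WithoutDefine, "no up-front field:")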

Proxy mode tests: ~12% improvement

Run       Before (s)   After (s)   %
1         6.922        5.865
2         6.892        5.875
3         6.348        5.972
4         6.638        5.906
5         6.752        5.861
Average   6.7104       5.8958      -12.14%

Legacy mode tests: ~15% improvement

Run       Before (s)   After (s)   %
1         6.882        6.166
2         6.77         5.873
3         7.633        6.111
4         7.203        5.93
5         7.042        5.849
Average   7.106        5.9858      -15.76%

Extra: Bundle Size falls by 1.7%

As you'd expect with fewer Object.defineProperty calls, bundle size falls by 1.7% (uncompressed production CJS: 52,953 B vs 52,020 B).

Code change checklist

  • [-] Added/updated unit tests
  • [-] Updated /docs. For new functionality, at least API.md should be updated
  • Verified that there is no significant performance drop (yarn mobx test:performance)


changeset-bot bot commented May 31, 2024

🦋 Changeset detected

Latest commit: 3d89415

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 5 packages
Name Type
mobx Patch
eslint-plugin-mobx Patch
mobx-react Patch
mobx-react-lite Patch
mobx-undecorate Patch


@taj-p taj-p changed the title Disable useDefineForClassFields for 10-20% memory and 12-16% performance improvements Disable useDefineForClassFields for 10-20% memory and 12-16% performance improvement May 31, 2024
@taj-p taj-p marked this pull request as draft May 31, 2024 23:00
@taj-p taj-p closed this May 31, 2024
@taj-p taj-p reopened this Jun 1, 2024
@taj-p taj-p marked this pull request as ready for review June 1, 2024 01:45
@taj-p taj-p changed the title Disable useDefineForClassFields for 10-20% memory and 12-16% performance improvement Disable useDefineForClassFields for ~10% memory and 12-16% performance improvement Jun 1, 2024
@taj-p taj-p changed the title Disable useDefineForClassFields for ~10% memory and 12-16% performance improvement Disable useDefineForClassFields for ~5-10% memory and 10-16% performance improvement Jun 1, 2024
@taj-p taj-p changed the title Disable useDefineForClassFields for ~5-10% memory and 10-16% performance improvement Disable useDefineForClassFields for ~10% memory and 10-16% performance improvement Jun 1, 2024
.changeset/tiny-boxes-play.md (review comment - outdated, resolved)
@iChenLei iChenLei requested a review from mweststrate June 2, 2024 01:50
@mweststrate
Member

Interesting! I think to get a proper read on the performance impact, we should actually update the perf suite to use ESM builds without downcompiled classes. Generally speaking, not declaring fields upfront in the constructor will hit deoptimized paths in the engine as soon as you use such a field, since the fields declared in the constructor are used by the JIT to generate so-called inner classes that optimize member access (roughly: instead of a map lookup, all members declared in the constructor can be looked up using constant offsets). So my suspicion would be that the above is maybe faster for downcompiled CJS, but likely slower in environments where native classes are idiomatically supported.

@taj-p
Author

taj-p commented Jun 2, 2024

(Quoting @mweststrate's comment above.)

Thank you @mweststrate and nice catch - I think being exhaustive is preferable - I didn't think to test non-downcompiled ESM outputs. I migrated the tests to use non-downcompiled classes and ESM in taj-p#3. I generated the below results.

I think that maps / shape classes (the term the JS engine uses for representing the "shape" of some object) extend to plain objects as well as classes, so I'm not sure it would affect native classes significantly differently from class-like downcompiled objects. I'll confer with a colleague to check my understanding.
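As a side note, here is a minimal sketch of the shape/hidden-class tradeoff under discussion (illustrative only; exact behaviour is engine-specific):

class Eager {
  constructor() {
    // Shape is fixed up-front: every instance starts with the same hidden class
    this.listeners = undefined
  }
}

class Lazy {
  constructor() {
    // No up-front allocation; the field is added later
  }
}

const a = new Eager()  // one shape from the start
const b = new Lazy()
b.listeners = []       // shape transition happens here, on first use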

Results

Proxies

Run       Before (s)   After (s)   %
1         6.864        5.889
2         6.929        6.103
3         7.321        5.903
4         6.137        5.687
5         5.95         6.756
Average   6.6402       6.0676      -8.62%

I also ran a "proxy"-only performance test 100 times consecutively, for both before and after, measured with the time command. I ran this because my original 5 runs in the above table didn't seem right (since the legacy test results were so different!). Sure enough, I realised my mistake in the changes and have since updated the above.

          Before (s)   After (s)   %
real      724.184      658.253     -9.10%
user      869.662      800.861     -7.91%
sys       22.071       21.357      -3.24%

Legacy

Run       Before (s)   After (s)   %
1         5.894        5.863
2         7.065        5.813
3         5.835        5.764
4         6.841        5.605
5         5.925        5.79
Average   6.312        5.767       -8.63%

Analysis

Performance seems better when disabling useDefineForClassFields - the test suite indicates the improvement may be on the order of 5-10% for non-downcompiled ESM. Note that we also get ~10% better memory efficiency, and today's build artefacts see a ~12% performance improvement.

Please let me know if you'd like me to create a PR/include in this PR:

  • to update perf tests to use non-downcompiled classes
  • or, update ESM build to use non-downcompiled classes in this (or separate PR)
  • or any other relevant changes

Extra

Wow! Non-downcompiled classes are a bit faster than their downcompiled counterparts. Is there an opportunity here for us to target ES6 without class downcompilation? I can open a discussion if you'd like.

Compilation Changes

We can see how properties are not added on instantiation in the "After ObservableValue" screenshot. (I've shown the development outputs below, but did use the production output for testing).

Before ObservableValue

[screenshot: compiled constructor with all fields defined up-front]

After ObservableValue

[screenshot: compiled constructor without the up-front field definitions]

Extra: Results

The below results are taken from using the downcompiled-class ESM outputs (i.e., our current ESM build outputs) in the perf suite:

Proxy mode

Run       Before (s)   After (s)   %
1         6.656        5.651
2         6.677        5.78
3         6.875        5.897
4         6.848        5.775
5         6.843        5.678
Average   6.7798       5.7562      -15.10%

Legacy mode

Run       Before (s)   After (s)   %
1         6.59         6.113
2         6.629        5.842
3         7.419        5.872
4         6.255        6.994
5         7.441        5.852
Average   6.8668       6.1346      -10.66%

@mweststrate
Member

Thanks for the detailed follow up @taj-p!

tagging @peterm-canva who is likely interested in the results here! Happy to create a beta build to make validation easier.

@peterm-canva
Contributor

The microbenchmarks from this repo don't really exercise the code in a way that would detect regressions due to deopts from shape changes. That's totally fine. What we'd need is larger macrobenchmarks where a whole application is running, to simulate a more realistic usage pattern.

At Canva, I see that this change has a slightly positive (roughly 0.5%) effect on page load metrics. That only indicates how we use MobX, though. There are plenty of other valid usage patterns, and I think it would be possible to construct a scenario with a regression. That said, the memory and initialization-time improvements of this change could end up cancelling those out in a lot of cases too.

It's hard to say what the overall effect would be without more macrobenchmarks. If there are open-source projects that use mobx and have macrobenchmarks, it'd be great long-term to set those up so we can measure changes like this more readily.

It's true this approach of not adding fields to an object until they are used does go against typical performance advice for JS. But it also gets nice memory wins for very heavily used objects. It can also be undone fairly easily if there are issues reported, or undone partially by setting an explicit foo = undefined for class fields we want to always exist.
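A sketch of the partial opt-out mentioned above (hypothetical field names; TypeScript source compiled with useDefineForClassFields: false): an explicit = undefined initialiser still compiles to an assignment, so a field we want to always exist keeps its up-front slot, while declaration-only fields stay lazy.

class ObservableValueLike {
  // Explicit initialiser: compiles to `this.interceptors_ = undefined`,
  // so this field always exists on every instance
  interceptors_: unknown = undefined

  // Declaration only: emits no constructor code, so the key is added to the
  // instance only when first assigned
  changeListeners_?: unknown
}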

@mweststrate
Member

mweststrate commented Jul 17, 2024

Ok, happy to give this a shot; will likely revert if noteworthy regressions are reported. Let's release #3901 first in a separate version, however, to be better able to pinpoint any potential regression.

@taj-p
Author

taj-p commented Jul 17, 2024

@mweststrate - do you mind holding off on the merge for about a week? There are 2 actions I'd like to complete first to ensure we don't introduce a regression:

  1. Measure page load memory and latency for a variety of document types (Peter's solid analysis above measures the opening of an empty document; I'd like to also test medium and large documents)
  2. Conduct an internal testing party with the team on the changes to exercise our application through various scenarios

Sorry for the delay, I've been caught up in other work.

@mweststrate
Member

mweststrate commented Jul 18, 2024 via email
