Default memoize hasher returns invalid key for multiple arguments #575

lxe · 2014-07-07T20:38:20Z

Currently, the default memoize hasher takes only the first argument to create the memo key:

https://github.com/caolan/async/blob/master/lib/async.js#L1000-L1002

This will cause memoized functions that take multiple arguments to call back with invalid results:

    var fn = async.memoize(function (a, b, c, cb) {
      cb(a, b, c);
    });

    fn(1, 2, 3, function (a, b, c) {
      console.log(a, b, c);
      // 1, 2, 3 - correct
    });

    fn(1, 10, 20, function (a, b, c) {
      console.log(a, b, c);
      // 1, 2, 3 - incorrect -- should be "1, 10, 20"
    });
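
To make the failure reproducible without the library, here is a minimal sketch of a memoizer that keys on the first argument only. `memoizeSketch` is a hypothetical stand-in, not the async.js source: it runs synchronously and leaves out the queueing of concurrent calls.

```javascript
// Hypothetical, simplified stand-in for async.memoize (not the library source).
function memoizeSketch(fn, hasher) {
  var memo = {};
  // Assumed default: key on the first argument only, as described above.
  hasher = hasher || function (x) { return x; };
  return function () {
    var args = Array.prototype.slice.call(arguments);
    var callback = args.pop();
    var key = hasher.apply(null, args);
    if (key in memo) {
      // Cache hit: replay the stored callback arguments.
      return callback.apply(null, memo[key]);
    }
    fn.apply(null, args.concat([function () {
      memo[key] = arguments;
      callback.apply(null, arguments);
    }]));
  };
}

var results = [];
var fn = memoizeSketch(function (a, b, c, cb) { cb(a, b, c); });
fn(1, 2, 3, function (a, b, c) { results.push([a, b, c]); });
fn(1, 10, 20, function (a, b, c) { results.push([a, b, c]); });
console.log(results); // both entries are [1, 2, 3]
```

The second call shares the key `1` with the first, so it replays the cached result instead of running the function again.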

lxe · 2014-07-07T20:40:52Z

I'm testing performance of an alternative hasher that would just do

    function(x) {
        if (arguments.length <= 1) {
            return x;
        } else {
            return Array.prototype.slice.call(arguments);
        }
    };

with hope that the check for arguments.length will limit the exposure to the slow(?) Array.prototype.slice.call invocation.

lxe · 2014-07-07T21:51:21Z

https://jsperf.com/async-js-memoize-hashers

Doing what I described in #575 (comment) seems to be about 5% slower on Chrome. I think the correctness justifies the performance hit.

aearly · 2014-07-07T22:27:00Z

When the arguments are easily stringified, using all of them won't be that much slower. The real performance hit comes when you have to stringify large object arguments.

It is also extremely common in other memoize implementations (such as Underscore/Lodash) to only use the first argument by default. That's what the custom hasher is for.

aearly · 2014-07-07T22:28:15Z

There is also a ton of existing code that relies on async.memoize only using the first arg. This would cause a huge regression. It would have to be a 1.0.0 type of change.

lxe · 2014-07-07T22:41:12Z

@aearly I've been solving this by passing a custom hasher, but I think it's important that it's at least documented that the hash key is the first argument only. (as it is in lodash)
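
For reference, the custom-hasher workaround might look like the sketch below. `numericHasher` is a hypothetical name, and it assumes every argument is a number, so the `|` separator cannot appear inside an argument. The hasher receives the same arguments as the memoized function, minus the callback.

```javascript
// Hypothetical hasher for a function whose arguments are all numbers.
// Joining with a separator is only safe because numbers never contain "|".
function numericHasher() {
  return Array.prototype.slice.call(arguments).join("|");
}

// Intended use (async.memoize takes the hasher as its second argument):
//   var fn = async.memoize(worker, numericHasher);

console.log(numericHasher(1, 2, 3));   // "1|2|3"
console.log(numericHasher(1, 10, 20)); // "1|10|20"
```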

owenallenaz · 2014-11-06T01:53:54Z

@lxe We just ran into this same problem. One issue with your solution is that it effectively calls toString() on the args array; that is what happens when you set data[key] with key = ["arg1", "arg2"]. This opens the door to more unintended consequences: for example, func("foo", "bar", cb) and func("foo,bar", cb) will produce the same hash. You could also count the args, but that has similar drawbacks: ["foo,bar", "baz", "qux"] would hash to the same key as ["foo", "bar,baz", "qux"].
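
A short sketch of that collision, using the hasher proposed earlier in the thread. `sliceHasher` is just a label chosen here for it, and `String()` stands in for the implicit coercion that happens when an array is used as an object key.

```javascript
// The hasher proposed earlier in this thread, under a hypothetical name.
function sliceHasher(x) {
  if (arguments.length <= 1) {
    return x;
  }
  return Array.prototype.slice.call(arguments);
}

// Object keys are strings, so an array key is coerced via toString().
var keyA = String(sliceHasher("foo", "bar"));
var keyB = String(sliceHasher("foo,bar"));
console.log(keyA === keyB); // true, both are "foo,bar"

// Same argument count, different arguments, same key.
var keyC = String(sliceHasher("foo,bar", "baz", "qux"));
var keyD = String(sliceHasher("foo", "bar,baz", "qux"));
console.log(keyC === keyD); // true, both are "foo,bar,baz,qux"
```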

What if we just used JSON.stringify, falling back to Array.prototype.slice.call(arguments).toString() when JSON isn't available (if we have to worry about IE6/7 support)?
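
Roughly, the idea could be sketched as follows. `jsonHasher` is a hypothetical name, and this is one possible shape rather than a proposed patch.

```javascript
// Hypothetical hasher: serialize all arguments, with a toString() fallback
// for environments that have no JSON object.
function jsonHasher() {
  var args = Array.prototype.slice.call(arguments);
  if (typeof JSON !== "undefined" && typeof JSON.stringify === "function") {
    return JSON.stringify(args);
  }
  return args.toString();
}

console.log(jsonHasher("foo", "bar")); // ["foo","bar"]
console.log(jsonHasher("foo,bar"));    // ["foo,bar"]
```

The quoting keeps the two calls apart. One caveat: inside an array, JSON.stringify turns undefined and function arguments into null, so arguments of those kinds would still share a key.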

As proof, I altered @lxe's jsPerf to show an example of what I mean: https://jsperf.com/async-js-memoize-hashers/2. I also changed some of the loops because they previously had some minor async errors. And is the performance of the hasher really that important? The entire intent of async.memoize is to save the cost of an expensive async operation, so the cost of toString() or JSON.stringify() is moot in comparison, right?

@aearly, can we really expect anyone out there to have written code such that they want func("foo", "bar", cb) to return the same value as func("foo", "baz", cb) for all calls but the first? For example, if func("foo", "baz", cb) was called first, it would forever return that result, but if func("foo", "bar", cb) was called first, it would forever return that result instead. That feels like programming on top of a bug and then not wanting that bug to get fixed. Is that a realistic scenario?

aearly · 2015-05-19T22:19:03Z

Fixed the docs through #631

lxe pushed a commit to lxe/async that referenced this issue Jul 7, 2014

use all of the arguments in default memoize hasher. fixes caolan#575

779e938

lxe mentioned this issue Jul 7, 2014

Use all of the arguments in default memoize hasher. Fixes #575 #576

Closed

aearly added the docs label May 19, 2015

aearly closed this as completed May 19, 2015
