three-argument [i]permute! methods #22815

Sacha0 · 2017-07-14T17:53:08Z

Being able to permute one vector with the result stored in another --- without allocation --- is now and again quite handy. For example, this functionality could improve #22774, and I had use for such methods last week in some research code. Naive implementations are as simple as

function permute!(dest, src, perm)
    for (i, pi) in enumerate(perm)
        dest[i] = src[pi]
    end
    return dest
end
function ipermute!(dest, src, perm)
    for (i, pi) in enumerate(perm)
        dest[pi] = src[i]
    end
    return dest
end

Perhaps such methods have a place alongside the two-argument [i]permute! methods existing in base? Best!

The text was updated successfully, but these errors were encountered:

stevengj · 2017-07-14T18:47:20Z

In the second case, wouldn't dest[perm] = src do what you want?

Sacha0 · 2017-07-14T18:57:05Z

In the second case, wouldn't dest[perm] = src do what you want?

Yes, that does the trick! Good catch! :)

TsurHerman · 2017-07-17T22:30:27Z

And in the first case, wouldn't dest .= @view src[perm] do the trick?

Sacha0 · 2017-07-18T00:25:00Z

And in the first case, wouldn't dest .= @view src[perm] do the trick?

Clever! :)

This issue then reduces to two questions: (1) Are three-argument [i]permute! methods worthwhile as alternative, explicit spellings for these operations? (2) Do the indexing approaches perform adequately?

Regarding (2):

For not-small containers, dest[perm] = src and the above naive implementation of ipermute!(dest, src, perm) perform roughly identically. For small containers (e.g. n ~ 10, see below), dest[perm] = src is ~50% slower than ipermute!(dest, src, perm) locally. Chances are the checks necessary to make the naive ipermute!(dest, src, perm) implementation above as safe as the indexing approach would erase that discrepancy. So cheers.

For small containers (e.g. n ~ 10, see below), dest .= @view src[perm] is also ~50% slower than the above naive implementation of permute!(dest, src, perm). Expected. The discrepancy persists for larger containers though:

julia> function permute!(dest, src, perm)
           for (i, pi) in enumerate(perm)
               dest[i] = src[pi]
           end
           return dest
       end
permute! (generic function with 1 method)

julia> function ipermute!(dest, src, perm)
           for (i, pi) in enumerate(perm)
               dest[pi] = src[i]
           end
           return dest
       end
ipermute! (generic function with 1 method)

julia> using BenchmarkTools

julia> begin
           n = 10
           foo = rand(n)
           bar = similar(foo)
           perm = randperm(n)
       end;

julia> @benchmark ipermute!($bar, $foo, $perm)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     12.966 ns (0.00% GC)
  median time:      13.803 ns (0.00% GC)
  mean time:        15.357 ns (0.00% GC)
  maximum time:     236.890 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     998

julia> @benchmark $bar[$perm] = $foo
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     18.628 ns (0.00% GC)
  median time:      18.683 ns (0.00% GC)
  mean time:        21.302 ns (0.00% GC)
  maximum time:     291.829 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     997

  julia> @benchmark permute!($bar, $foo, $perm)
  BenchmarkTools.Trial:
    memory estimate:  0 bytes
    allocs estimate:  0
    --------------
    minimum time:     13.031 ns (0.00% GC)
    median time:      14.247 ns (0.00% GC)
    mean time:        15.802 ns (0.00% GC)
    maximum time:     285.616 ns (0.00% GC)
    --------------
    samples:          10000
    evals/sample:     998

  julia> @benchmark $bar .= @view $foo[$perm]
  BenchmarkTools.Trial:
    memory estimate:  0 bytes
    allocs estimate:  0
    --------------
    minimum time:     21.225 ns (0.00% GC)
    median time:      22.537 ns (0.00% GC)
    mean time:        24.698 ns (0.00% GC)
    maximum time:     140.972 ns (0.00% GC)
    --------------
    samples:          10000
    evals/sample:     997

begin
    n = 100
    foo = rand(n)
    bar = similar(foo)
    perm = randperm(n)
end;

  julia> @benchmark permute!($bar, $foo, $perm)
  BenchmarkTools.Trial:
    memory estimate:  0 bytes
    allocs estimate:  0
    --------------
    minimum time:     129.426 ns (0.00% GC)
    median time:      129.462 ns (0.00% GC)
    mean time:        146.328 ns (0.00% GC)
    maximum time:     541.591 ns (0.00% GC)
    --------------
    samples:          10000
    evals/sample:     886

  julia> @benchmark $bar .= @view $foo[$perm]
  BenchmarkTools.Trial:
    memory estimate:  0 bytes
    allocs estimate:  0
    --------------
    minimum time:     185.797 ns (0.00% GC)
    median time:      190.965 ns (0.00% GC)
    mean time:        213.428 ns (0.00% GC)
    maximum time:     1.447 μs (0.00% GC)
    --------------
    samples:          10000
    evals/sample:     671

Best!

Sacha0 added domain:collections Data structures holding multiple items, e.g. sets design Design of APIs or of the language itself labels Jul 14, 2017

Sacha0 mentioned this issue Jul 14, 2017

Fix LU factorization in-place operations #22774

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

three-argument [i]permute! methods #22815

three-argument [i]permute! methods #22815

Sacha0 commented Jul 14, 2017

stevengj commented Jul 14, 2017

Sacha0 commented Jul 14, 2017

TsurHerman commented Jul 17, 2017

Sacha0 commented Jul 18, 2017

three-argument [i]permute! methods #22815

three-argument [i]permute! methods #22815

Comments

Sacha0 commented Jul 14, 2017

stevengj commented Jul 14, 2017

Sacha0 commented Jul 14, 2017

TsurHerman commented Jul 17, 2017

Sacha0 commented Jul 18, 2017