Add FixedPointDecimal benchmark. #42

NHDaly · 2018-12-06T16:23:05Z

Opening this PR to discuss merging benchmarks into this repo, so that we can track performance across commits/versions.

I'm not sure whether there's a usual structure to follow for putting benchmarks into a Julia repo. Do other repos besides the main Julia repo use Nanosoldier?

Also, in its current form the benchmark compares against raw Int and Float types. For every operation except division, those types execute the operation in about a single clock tick, so it's hardly worth spending the computation to measure them. Maybe we can simplify this file to measure only FixedDecimals.

Anyway, looking forward to figuring out the best way to do this with you! :)

Adds a benchmark file that produces performance comparisons across various types and operations.

TotalVerb · 2018-12-07T21:48:07Z

I think this makes sense. We also keep benchmarks in bench/ in JSON.jl: https://github.com/JuliaIO/JSON.jl/tree/master/bench, so there's precedent.

omus · 2018-12-08T02:26:24Z

It isn't widely used yet, but there is: https://github.com/JuliaCI/PkgBenchmark.jl

NHDaly · 2018-12-10T16:16:47Z

@omus that PkgBenchmark seems nice, thanks for the link. (I'm sending a couple PRs now to clean it up for 1.0 so we can tag a version there. 😄)

Do you want me to try setting that up before merging this PR? That seems reasonable to me.

Use custom branch of `PkgBenchmark.jl` to support post-processing, which we need.

coveralls · 2018-12-20T17:44:05Z

Coverage remained the same at 98.837% when pulling ed2db17 on NHDaly:bench into 483325a on JuliaMath:master.

NHDaly · 2018-12-20T18:29:36Z

Okay! I think I've got the benchmarks working via PkgBenchmark.jl, assuming JuliaCI/PkgBenchmark.jl#75 goes through. (I added a Project.toml pointing at that branch for now, so that we can demo it and see if it makes sense.)

I'll post the results.md file generated here in the next post! :)

There are other things we might want to change, such as:

Truncating the timings to a minimum, like I was doing before, to limit noise when judging between commits.
Removing most or all of the non-FixedDecimal types/ops (do we really need to measure the time for Int64 multiplication? It's going to be the same every time!).
Reducing the number of iterations. With N set to 1, the FixedDecimal timings seem pretty consistent, but other things swing more (like BigInt and the regular Integers). With N set to 1000 the results are more consistent, and the reported "noise tolerance" drops from 5% to 1%, but the run is much slower (about 8 min on my machine vs. ~2 min).
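
The two post-processing ideas in this list (dividing a batch timing by N, and truncating very small timings to a minimum) can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `per_op_ns` and `floor_ns` are hypothetical names, and the 1 ns floor is an assumed value picked only for the example.

```julia
# Hypothetical post-processing helper (not taken from this PR).
# `batch_ns` is the time measured for a batch of `N` repeated operations.
# The result is the per-operation time, clamped so that every timing below
# `floor_ns` is reported as exactly `floor_ns`.
per_op_ns(batch_ns::Real, N::Integer; floor_ns::Real = 1.0) =
    max(batch_ns / N, floor_ns)

# A batch of N = 1000 operations that took 15_891 ns in total.
println(per_op_ns(15_891.0, 1000))

# A near-zero per-operation timing is reported as the floor instead.
println(per_op_ns(7.0, 1000))
```

The trade-off is that anything faster than the floor becomes indistinguishable, so a regression in such an operation would only show up once it crosses the floor.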

Okay, here are the results, generated by running `julia benchmark/runbench.jl`:

NHDaly · 2018-12-20T18:29:44Z

Benchmark Report for FixedPointDecimals

Job Properties

Time of benchmark: 20 Dec 2018 - 13:15
Package commit: 0dbf53
Julia commit: d78923
Julia command flags: None
Environment variables: None

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

ID	time	GC time	memory	allocations
`["*", " Int32"]`	0.086 ns (1%)
`["*", " Int64"]`	0.235 ns (1%)
`["*", " Int128"]`	0.285 ns (1%)
`["*", "BigFloat"]`	49.889 ns (1%)	2.421 ns	112 bytes (1%)	2
`["*", "BigInt"]`	261.839 ns (1%)	62.843 ns	48 bytes (1%)	3
`["*", "FD{ Int32,2}"]`	1.550 ns (1%)
`["*", "FD{ Int64,2}"]`	15.891 ns (1%)
`["*", "FD{Int128,2}"]`	2.070 μs (1%)	475.085 ns	456 bytes (1%)	24
`["*", "Float32"]`	0.339 ns (1%)
`["*", "Float64"]`	0.215 ns (1%)
`["+", " Int32"]`	0.076 ns (1%)
`["+", " Int64"]`	0.007 ns (1%)
`["+", " Int128"]`	0.180 ns (1%)
`["+", "BigFloat"]`	56.617 ns (1%)	4.689 ns	112 bytes (1%)	2
`["+", "BigInt"]`	258.110 ns (1%)	70.857 ns	48 bytes (1%)	3
`["+", "FD{ Int32,2}"]`	-0.027 ns (1%)
`["+", "FD{ Int64,2}"]`	-0.217 ns (1%)
`["+", "FD{Int128,2}"]`	-37.671 ns (1%)	-18.925 ns
`["+", "Float32"]`	0.238 ns (1%)
`["+", "Float64"]`	0.231 ns (1%)
`["/", " Int32"]`	3.837 ns (1%)
`["/", " Int64"]`	5.173 ns (1%)
`["/", " Int128"]`	13.514 ns (1%)
`["/", "BigFloat"]`	142.886 ns (1%)	2.393 ns	112 bytes (1%)	2
`["/", "BigInt"]`	421.201 ns (1%)	9.256 ns	464 bytes (1%)	10
`["/", "FD{ Int32,2}"]`	5.627 ns (1%)
`["/", "FD{ Int64,2}"]`	20.864 ns (1%)
`["/", "FD{Int128,2}"]`	2.093 μs (1%)	505.571 ns	456 bytes (1%)	24
`["/", "Float32"]`	0.413 ns (1%)
`["/", "Float64"]`	0.216 ns (1%)
`["div", " Int32"]`	0.000 ns (1%)
`["div", " Int64"]`	0.032 ns (1%)
`["div", " Int128"]`	0.100 ns (1%)
`["div", "BigFloat"]`	117.968 ns (1%)	2.174 ns	112 bytes (1%)	2
`["div", "BigInt"]`	263.183 ns (1%)	70.385 ns	40 bytes (1%)	2
`["div", "FD{ Int32,2}"]`	-0.125 ns (1%)
`["div", "FD{ Int64,2}"]`	2.543 ns (1%)
`["div", "FD{Int128,2}"]`	484.794 ns (1%)	102.801 ns	128 bytes (1%)	7
`["div", "Float32"]`	2.540 ns (1%)
`["div", "Float64"]`	2.390 ns (1%)
`["identity", " Int32"]`	0.265 ns (1%)
`["identity", " Int64"]`	0.323 ns (1%)
`["identity", " Int128"]`	0.525 ns (1%)
`["identity", "BigFloat"]`	151.618 ns (1%)	13.535 ns	336 bytes (1%)	6
`["identity", "BigInt"]`	695.563 ns (1%)	157.238 ns	136 bytes (1%)	8
`["identity", "FD{ Int32,2}"]`	1.293 ns (1%)
`["identity", "FD{ Int64,2}"]`	1.266 ns (1%)
`["identity", "FD{Int128,2}"]`	604.572 ns (1%)	137.165 ns	128 bytes (1%)	7
`["identity", "Float32"]`	0.978 ns (1%)
`["identity", "Float64"]`	1.043 ns (1%)

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

["*"]
["+"]
["/"]
["div"]
["identity"]

Julia versioninfo

Julia Version 1.0.2
Commit d789231e99* (2018-11-08 20:11 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.2.0)
  uname: Darwin 18.2.0 Darwin Kernel Version 18.2.0: Mon Nov 12 20:24:46 PST 2018; root:xnu-4903.231.4~2/RELEASE_X86_64 x86_64 i386
  CPU: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz: 
                 speed         user         nice          sys         idle          irq
       #1-12  2900 MHz     768878 s          0 s     389078 s    9840166 s          0 s
       
  Memory: 32.0 GB (3027.5859375 MB free)
  Uptime: 269534.0 sec
  Load Avg:  3.58056640625  3.8349609375  3.80224609375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)

codecov-io · 2018-12-20T18:57:13Z

Codecov Report

Merging #42 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master      #42   +/-   ##
=======================================
  Coverage   98.83%   98.83%           
=======================================
  Files           1        1           
  Lines         172      172           
=======================================
  Hits          170      170           
  Misses          2        2

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 483325a...ed2db17.

NHDaly · 2019-02-11T20:59:53Z

So I just want to leave a status update here.

I think this basically works. The benchmarks run, and after the changes merged in JuliaCI/PkgBenchmark.jl#75 they should measure only the time for each operation itself (not copying the value, reading from an array, etc.).

The remaining blocker to merging is that the results are extremely variable, so much so that I don't think they're useful. Even when running on a single computer and comparing a single commit against itself, PkgBenchmark consistently reports statistically significant variance. If anyone has advice on how to diagnose this, it would be much appreciated!

I've tried several things trying to pinpoint the source of the variance, but haven't had any luck:

I tried simplifying the benchmarks to just `@benchmarkable $op($x, $x)`, but saw the same variance there.
I tried statically compiling a sysimg containing FixedPointDecimals and using that to run the benchmarks, which didn't help.
One of my coworkers tried disabling the GC (I'm not sure what steps they took), but said that didn't help either.
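
For anyone who wants to reproduce the self-comparison outside of PkgBenchmark, something like the following might be a starting point. It is only a sketch under several assumptions: it uses BenchmarkTools directly, a `Float64` stands in for the FixedDecimal operands, `GC.enable(false)` is just one possible way to take the GC out of the picture (not necessarily what was tried above), and the 1-second budget and 5% tolerance are arbitrary choices.

```julia
using BenchmarkTools

# Stand-in operands: the thread benchmarks FixedDecimal values, but a Float64
# keeps this sketch free of any package other than BenchmarkTools.
op, x = +, 1.25

# Interpolating with `$` passes the values into the benchmark directly, so
# global-variable lookup is not part of what gets timed.
b = @benchmarkable $op($x, $x)
tune!(b)    # choose the number of evaluations per sample for this machine

# Run the identical benchmark twice, as if the runs came from two commits.
GC.enable(false)    # assumption: one way to rule out GC pauses
trial_a, trial_b = try
    (run(b; seconds = 1), run(b; seconds = 1))
finally
    GC.enable(true)    # always restore the GC, even if a run throws
end

# Compare the minimum times of the two runs with a 5% tolerance.
j = judge(minimum(trial_a), minimum(trial_b); time_tolerance = 0.05)
println(time(j))    # one of :invariant, :regression, :improvement
```

If this reports anything other than `:invariant` on repeated runs, the variance is reproducible without PkgBenchmark in the loop, which would narrow down where to look.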

Does anyone have any other ideas? Until the variance is resolved, these benchmarks don't seem very useful. Sometimes the swings are as large as 100% or 200%, so I'm not sure we'd get meaningful feedback on PRs.

Add Decimal Representations Comparison benchmark.

3e7da85

Adds a benchmark file that produces performance comparisons across various types and operations.

NHDaly mentioned this pull request Dec 6, 2018

Performance investigation #37

Open

3 tasks

NHDaly added 3 commits December 10, 2018 13:25

Move bench to benchmark/benchmarks.jl for PkgBenchmark.jl

bc4083f

Trying to switch to use PkgBenchmark.jl

75639c6

Add main script: benchmark/runbench.jl

54e4cae

Use custom branch of `PkgBenchmark.jl` to support post-processing, which we need.

NHDaly force-pushed the bench branch from d2b5ca8 to 4ca46f4 Compare December 20, 2018 17:47

Merge branch 'master' into bench

43df956

NHDaly force-pushed the bench branch from 4ca46f4 to 43df956 Compare December 20, 2018 17:47

Cleanup cruft; update PkgBenchmark.jl commit

0dbf53b

NHDaly force-pushed the bench branch from 29824d3 to 0dbf53b Compare December 20, 2018 18:06

Write results.md to benchmark/ directory.

f6a48f8

NHDaly changed the title ~~Add Decimal Representations Comparison benchmark.~~ Add FixedPointDecimal benchmark. Dec 20, 2018

NHDaly added 2 commits January 4, 2019 15:37

Fix / in postproccess to round() mem and allocs

90a86b5

Remove BigInt,BigFloat from benchmarks cause too slow

b078194

NHDaly mentioned this pull request Jan 16, 2019

Add postprocess::Function arg to benchmarkpkg() JuliaCI/PkgBenchmark.jl#75

Merged

NHDaly added 7 commits January 27, 2019 09:11

Switch to PkgBenchmark/master after merge

c3ac10f

Add judgebench() to judge perf b/w commits

a258785

Don't divide by N when judging two commits to prevent noisiness

8d6712f

put back postprocess; set N=1

6402fa2

Keep N=1000; still no div for judge

66a3172

Remove non-FD benchmarks; they'll never change so shouldn't be a part…

a58dbfd

… of this

Added option to change postprocess function in judgebench for testing…

ed2db17

… purposes

NHDaly force-pushed the bench branch from c5ac6a5 to ed2db17 Compare January 28, 2019 18:29
