Implement runs inside threads #67

danthe96 · 2018-06-20T18:30:35Z

The runs to be averaged are now executed inside the threads, which reduces the overhead of syncing threadFlag (among other things) on very small vectors. Relatedly, if no parameter is specified, the number of runs defaults to 100 / (size / 8 KiB), so the smallest size does 100 iterations, the second smallest 50, and so on. I'm still returning one time per iteration so that we can use error bars.

Cache Clearing is kind of irrelevant now, so we might remove it? Feedback welcome!
Edit: Cache clearing has just been removed, from code and bash script
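For readers following along, the default described above can be sketched roughly as follows. This is only an illustration, not the code in benchmark/main.cpp: the helper name `defaultIterations` and the constants `SMALLEST_SIZE` and `BASE_ITERATIONS` are made up here, and the smallest size is assumed to be 8 KiB as stated in the description.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical constants for illustration; the real values live in benchmark/main.cpp.
constexpr std::size_t KiB = 1024;
constexpr std::size_t SMALLEST_SIZE = 8 * KiB;  // assumed smallest benchmarked size
constexpr int BASE_ITERATIONS = 100;            // iterations for the smallest size

// Returns the user-supplied count if one was given (> 0).
// Otherwise returns 100 / (size / 8 KiB), clamped to at least 1.
int defaultIterations(std::size_t sizeBytes, int requested) {
    if (requested > 0) {
        return requested;
    }
    double scaled = BASE_ITERATIONS / (static_cast<double>(sizeBytes) / SMALLEST_SIZE);
    return std::max(1, static_cast<int>(scaled));
}
```

With these assumptions, 8 KiB yields 100 iterations, 16 KiB yields 50, and anything above 800 KiB falls back to a single iteration.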

jonaschn · 2018-06-20T18:52:02Z

benchmark/main.cpp

@@ -170,7 +189,7 @@ int main(int argc, char* argv[]) {
 Flags flags;
 flags.Var(colCount, 'c', "column-count", 1, "Number of columns to use");
 flags.Var(threadCount, 't', "thread-count", 1, "Number of threads");
- flags.Var(iterations, 'i', "iterations", 6, "Number of iterations");
+ flags.Var(iterations, 'i', "iterations", -1, "Number of iterations");


If you use 0 as the default, you don't need the iterations == -1 check below.

I'm still gonna need the check, but I can change it to 0 if you want.

You still need the check but it would look like:
int mIterations = iterations ? iterations : max(1, (int) (1 / ITERATIONS_FACTOR / size * 8 * KiB));

Don't really like that, it should be immediately clear iterations is a number... I changed it to 0 now.

jonaschn · 2018-06-20T18:52:57Z

benchmark/main.cpp

 for (auto size: DB_SIZES){
 cerr << "benchmarking " << (size / 1024.0f) << " KiB" << endl;
+
+ int mIterations = iterations == -1 ? max(1, (int) (1 / ITERATIONS_FACTOR / size * 8 * KiB)) : iterations;


#Insert useful comment here and intuition behind
int mIterations = iterations ? iterations : max(1, (int) (1 / ITERATIONS_FACTOR / size * 8 * KiB));

MasterCarl · 2018-06-20T19:56:37Z

benchmark/main.cpp

 for (auto size: DB_SIZES){
 cerr << "benchmarking " << (size / 1024.0f) << " KiB" << endl;
+
+ // For smallest size, iterations will be 1 / factor, e.g. 1/0.01 = 100. For double the size half of that etc.
+ int mIterations = iterations == 0 ? max(1, (int) (1 / ITERATIONS_FACTOR / size * DB_SIZES[0])) : iterations;


I still find this super confusing. Why are we using the reciprocal value and not a multiplicative factor? Alternatively, we could define a minimum amount of data to be benchmarked against.
Also, what does the m in mIterations stand for?

I agree that a reciprocal value is not really intuitive.
But the greater problem I see is that with the current setting we will run the benchmark only once for DB_SIZES > 1 MB. Therefore I would propose (6 iterations like before as a minimum):
iterations = iterations == 0 ? max(6, (int) (ITERATIONS_FACTOR / size * DB_SIZES[0])) : iterations; with ITERATIONS_FACTOR = 100.0f

If you'd rather not overwrite iterations, just rename it, maybe to dynamic or modified Iterations.

Always 6 at least? Ok sure, why not. I thought reciprocal value would be more intuitive, but I can change it.

The reason I'm not overwriting iterations is that otherwise iterations == 0 will fail in successive loops. Will rename it.
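To make the outcome of this thread concrete, here is a minimal sketch of the proposed variant: a multiplicative factor of 100, a minimum of 6 iterations, and a separate per-size variable so the `iterations == 0` sentinel survives across loop passes. The helper `iterationsPerSize`, the constant `MIN_ITERATIONS`, and the name `dynamicIterations` are assumptions for illustration; the merged code may differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

constexpr float ITERATIONS_FACTOR = 100.0f;  // value proposed in the review
constexpr int MIN_ITERATIONS = 6;            // minimum proposed in the review

// Hypothetical helper: computes one iteration count per size.
// `iterations` is the command-line value, where 0 means "not specified".
std::vector<int> iterationsPerSize(const std::vector<std::size_t>& dbSizes, int iterations) {
    std::vector<int> result;
    for (auto size : dbSizes) {
        // Kept in its own variable: overwriting `iterations` here would make
        // the `iterations == 0` check fail for every size after the first.
        int dynamicIterations = iterations == 0
            ? std::max(MIN_ITERATIONS, static_cast<int>(ITERATIONS_FACTOR / size * dbSizes[0]))
            : iterations;
        result.push_back(dynamicIterations);
    }
    return result;
}
```

For sizes of 8 KiB, 16 KiB and 1 MiB this gives 100, 50 and 6 iterations when nothing is specified, and the explicit value for every size otherwise.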

jonaschn · 2018-06-21T21:49:45Z

benchmark/main.cpp

 while (!threads.empty()) {
 delete threads.back();
 threads.pop_back();
 }
 threadFlag = false;
 }

- printResults<T>(times, colSize, threadCount, cache);
+ size_t startIndex = 0;


Lines 147-145 are exactly the same as 126-144.
I am quite confused about why, or whether, this is necessary.

I simply moved it from the cache iteration the way it was before. As far as I could tell, the cache was being cleared, followed by a dry-run of reading without using the time measured. The reason I'm not using a for loop with 2 iterations was that I thought it would make the code more obtuse than saving 10 lines was worth.

jonaschn · 2018-06-21T21:55:50Z

benchmark/main.cpp

- }
+ // Cache clearing, do one dry run after
+ if (!cache) {
+ clearCache();


Cache clearing only has an effect on the first iteration.
If we still want to use cache clearing, it has to be performed in threadFunc.
But that would make our benchmark extremely slow.

That was my thought as well. Can I remove it then?
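The point about the first iteration can be seen in a stripped-down sketch of runs inside a thread. This is not the project's threadFunc: `runInsideThread`, `RunResult` and the summing "scan" are stand-ins invented for illustration. Anything done before the thread starts can only influence the first timed run, while later runs find the data warm.

```cpp
#include <chrono>
#include <cstdint>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical result type: one timing per iteration plus a checksum
// so the scan has an observable result.
struct RunResult {
    std::vector<double> timesNs;
    std::uint64_t checksum = 0;
};

RunResult runInsideThread(const std::vector<std::uint32_t>& data, int iterations) {
    RunResult result;
    // A cache clear placed here, before the thread starts, would only
    // affect the first timed run below.
    std::thread worker([&data, &result, iterations] {
        for (int i = 0; i < iterations; ++i) {
            // To affect every run, cache clearing would have to happen here,
            // which adds its cost to each iteration of the benchmark.
            auto start = std::chrono::steady_clock::now();
            result.checksum += std::accumulate(data.begin(), data.end(), std::uint64_t{0});
            auto end = std::chrono::steady_clock::now();
            result.timesNs.push_back(
                std::chrono::duration<double, std::nano>(end - start).count());
        }
    });
    worker.join();
    return result;
}
```

Returning one time per iteration, rather than a single average, keeps the data needed for error bars.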

danthe96 · 2018-06-22T15:43:21Z

Ready to merge, please rereview

MasterCarl

Looks good to me, although the missing cache attribute might break the plots script

MasterCarl

Actually, we should keep the ability to clear the cache for the single-iteration benchmarks

The results for single-iteration benchmarks are probably not reliable (see current variance in plots with even 6 iterations). Therefore we don't need them.

jonaschn

Looks good to me.

Daniel Theveßen added 2 commits June 20, 2018 20:19

Implement runs inside threads

f4b9236

Fix iterations scaling

948afc1

danthe96 self-assigned this Jun 20, 2018

danthe96 requested review from flxw, MasterCarl, jonaschn and janetzki June 20, 2018 18:30

Remove unnecessary var

39f4055

jonaschn reviewed Jun 20, 2018

Add comment

0b13c7f

MasterCarl requested changes Jun 20, 2018

jonaschn reviewed Jun 21, 2018

MasterCarl and others added 4 commits June 22, 2018 17:23

Merge remote-tracking branch 'origin/master' into feature/iterations_…

bbec85f

…inside_thread # Conflicts: # benchmark/main.cpp

Merge branch master

44d8b6e

Remove cache clearing, rename iterations, add minimum iterations

56a3643

Remove cache clearing in bash script, fix factor

f35138d

MasterCarl approved these changes Jun 22, 2018

MasterCarl previously requested changes Jun 22, 2018

jonaschn approved these changes Jun 23, 2018

jonaschn merged commit f204ba9 into master Jun 23, 2018

jonaschn deleted the feature/iterations_inside_thread branch June 23, 2018 10:47

This was referenced Jun 23, 2018

Execute small column scans multiple times #63

Closed

Move iterations into threads #59

Closed
