
Optimize weights computation #27

Merged
merged 10 commits into passaH2O:master on May 7, 2021

Conversation

wrightky
Collaborator

@wrightky wrightky commented May 1, 2021

I have one more thing to add to this PR before committing (change routine plotting to re-use plots, also a speed improvement) but I'll go ahead and open it up now for feedback.

Recasts the algorithm used in make_weight to compute weights everywhere using a few big array operations, rather than many small array operations inside a for-loop. This takes advantage of NumPy's vectorized array operations, which use SIMD instructions (e.g. AVX) under the hood to perform many operations at once inside the CPU, and which I expect are faster than anything we could have written from scratch.
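As a toy illustration of the difference (hypothetical arrays, not the dorado code), here is the same computation done cell-by-cell in a Python loop versus as one whole-array NumPy operation:

```python
import numpy as np

# Toy example (not the dorado code): identical math, two styles.
rng = np.random.default_rng(0)
stage = rng.random((200, 200))
depth = rng.random((200, 200))

# Loop version: many small operations, each with Python overhead
out_loop = np.empty_like(stage)
for i in range(stage.shape[0]):
    for j in range(stage.shape[1]):
        out_loop[i, j] = stage[i, j] * depth[i, j] ** 2

# Vectorized version: one big array operation, SIMD-friendly
out_vec = stage * depth ** 2

assert np.allclose(out_loop, out_vec)
```

The vectorized form is what lets NumPy dispatch the work to optimized compiled loops instead of the Python interpreter.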

Had to introduce a few helper functions to get the process worked out and clean, so here's the breakdown of how those functions are used:

  • make_weight performs the same steps as before, except instead of being performed locally around some index (i,j), it performs those steps globally on a 3D ndarray in which information on all of the neighbors at each index has been stored along the third axis (exactly like the previous Particles.weights array)
  • big_sliding_window is a helper function to map all local neighbors into a 3D array. The outcome of this function matches what we would get from (a) looping over all indices (i,j), (b) grabbing the local D8 neighbors with raster[i-1:i+2,j-1:j+2], then (c) applying np.ravel() to that local array, which we store along the third axis of a new array. big_sliding_window just does that without looping, by mapping sub-arrays using the same order as ravel, i.e. [NW, N, NE, W, 0, E, SW, S, SE]
  • tile_local_array is a helper function that maps a single local D8 raster into a 3D array in which the elements are repeated L,W times. This is used to get arrays like ivec or distances, which are the same everywhere, into the same space as the weights array.
  • tile_domain_array is a helper function which works like tile_local_array, except it simply repeats a big 2D array 9 times along the vertical. When used on a raster like stage, the output of this function and the output of big_sliding_window directly match up the local values in the raster with the neighboring values.
  • Lastly, clear_borders is just a useful little function to set all the 2D boundaries stored in the 3D array to 0. This is to eliminate any values that show up in the weights array along the border during the math inside make_weight.
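A minimal sketch of what these helpers compute, based on the descriptions above (the names follow this PR, but the actual dorado implementations may differ in detail, e.g. in how the domain edges are padded):

```python
import numpy as np

def big_sliding_window(raster):
    """Stack each cell's 9 D8 neighbor values along a third axis,
    in raveled order [NW, N, NE, W, 0, E, SW, S, SE]."""
    L, W = raster.shape
    padded = np.pad(raster, 1, mode='edge')  # padding choice is an assumption
    return np.stack([padded[di:di + L, dj:dj + W]
                     for di in range(3) for dj in range(3)], axis=2)

def tile_local_array(local, L, W):
    """Repeat a raveled 3x3 local array (e.g. ivec, distances) at every cell."""
    return np.tile(np.asarray(local).ravel(), (L, W, 1))

def tile_domain_array(raster):
    """Repeat a 2D raster (e.g. stage) 9 times along the third axis."""
    return np.tile(raster[:, :, np.newaxis], (1, 1, 9))

def clear_borders(arr3d):
    """Zero out the 2D domain borders in all 9 layers, in place."""
    arr3d[0, :, :] = 0.
    arr3d[-1, :, :] = 0.
    arr3d[:, 0, :] = 0.
    arr3d[:, -1, :] = 0.
```

With these, tile_domain_array(stage) and big_sliding_window(stage) line up element-for-element, so a neighbor difference like the stage gradient becomes one subtraction of two (L, W, 9) arrays.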

In the test cases I've done so far, computing the weights never takes more than a few seconds. On the largest domain I could think to use (~2900 x 2000 elements), it took about 10 seconds.

For a test application, iterating 2000 particles for 10 hours in a domain (~1100 x 1100 elements) with no plotting, the previous algorithm took 4.6 minutes to finish, whereas this algorithm took almost exactly 3 minutes with similar-looking results. From what I remember of previous times the package was profiled, this 34% speedup is about how much time we were spending inside make_weight, which suggests to me that this function will be our problem no more.

@wrightky wrightky requested a review from elbeejay May 1, 2021 18:00
@wrightky
Collaborator Author

wrightky commented May 1, 2021

Will also work on adding and editing unit tests for these functions.

@wrightky
Collaborator Author

wrightky commented May 4, 2021

I have a couple of general questions about the four unit tests that were previously failing (the four newest weight-related tests added to test_lagrangian_walker, i.e. test_make_weight_shallow and those after it). The reason for these failures actually had nothing to do with the array-weight methodology, but instead with one throwaway line that, in retrospect, was a change to the method:

weight[:,:,4] = 0.

Which, if included, would enforce that particles sitting at some index cannot stand still after an iteration (i.e. the weight for the cell they are in, index 4, is zero). I was under the impression that this was already the case, but it appears that we not only allow it, but that these tests are designed to ensure it happens in certain circumstances. Is this the behavior we actually want?

In all of these test cases, the way this happens inside make_weight is identical. The stage and discharge arrays are zero, and depths are zero almost everywhere except the cell we're in, which means that the weight computation results in zero preference for being anywhere (no surface or velocity gradient). We then enter the failsafe condition (lines 211-215) that sets the weight to 1.0 anywhere nearby with the maximum depth -- which in all of these cases is the cell we are in, even if there are non-zero depths nearby.
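For concreteness, the failsafe described above does something like the following (a sketch with assumed array names and a simplified signature, not the exact dorado code at lines 211-215):

```python
import numpy as np

def depth_failsafe(weight, depth_stack):
    """weight and depth_stack are (L, W, 9) arrays, where layer k holds
    the value for the k-th D8 neighbor; layer 4 is the cell itself."""
    no_pref = weight.sum(axis=2) == 0              # no gradient anywhere
    deepest = depth_stack == depth_stack.max(axis=2, keepdims=True)
    # Where there is no preference, put weight 1.0 on the deepest neighbor(s)
    weight[no_pref] = deepest[no_pref].astype(float)
    return weight
```

When the cell the particle occupies holds the local maximum depth, it is the only entry in `deepest`, which is exactly the "standing still" outcome the tests exercise.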

I can see how this might be preferable in the case described in test_make_weight_shallow, where none of the surrounding cells have any water. But in test_make_weight_equal_opportunity, where the current and two neighboring indices have the same max depth, I feel like we should prefer traveling to the two neighboring cells over standing still. Likewise, in test_make_weight_unequal_opportunity, because the two neighboring cells have less than the local max depth, the chance of us traveling to those cells over standing still is zero (there is no surface or velocity gradient, and they do not have the local max depth), which I'm not sure is ideal. Lastly, all of these tests check that the weights along the boundary are zero, but they don't check if the weight of traveling to the boundary is zero, merely the weight of staying at the boundary once there (which I think we ruled out as a possibility elsewhere?).

Member

@elbeejay elbeejay left a comment


I think these changes look good overall; I'll try to answer some of the questions you brought up. I still have to make in-line comments on some of the code, but will do that later tonight.

... whereas this algorithm took almost exactly 3 minutes with similar-looking results.

At least for standard scenarios, we can confidently say that the results with this weighting method will be identical to the previous ones as the tests in test_examplecases.py did not break.

I have a couple general questions about the four unit tests that were previously failing (the four newest weight-related tests added to test_lagrangian_walker, i.e. test_make_weight_shallow and thereafter). The reason for these failings actually had nothing to do with the array-weight methodology, but instead one throw-away line that in retrospect was a change to the method:

weight[:,:,4] = 0.

Good to know the new weighting scheme was fine and the tests were just catching this "edge-case" 👍

Which, if included, would enforce that particles sitting at some index cannot stand still after an iteration (i.e. the weight for the cell they are in, index 4, is zero). I was under the impression that this was already the case, but it appears that we not only allow it, but these tests are designed to ensure that this happens in certain circumstances. Is this the behavior we actually want?

I think it is useful for cases where the flow field changes and a particle may be "stranded"; in cases when the flow field is very stagnant, the particle might not actually move while others in the cohort do. In the current implementation, we have one exception to this, which is when steepest_descent is turned on. In that case, we remove the initial location from consideration when routing the particles.

But in test_make_weight_equal_opportunity, where the current and two neighboring indices have the same max depth, I feel like we should prefer traveling to the two neighboring cells over standing still. Likewise, in test_make_weight_unequal_opportunity, because the two neighboring cells have less than the local max depth, the chance of us traveling to those cells over standing still is zero (there is no surface or velocity gradient, and they do not have the local max depth), which I'm not sure is ideal.

In test_make_weight_equal_opportunity(), because steepest_descent == True, we actually exclude the weight at the initial cell. So although it has a valid weight, the particle step will go to either of the 2 valid neighbors with a 50/50 chance for each. In test_make_weight_unequal_opportunity() I agree that the stepping is a bit strange. Since we remove the initial particle location from consideration, there would be an equal chance of the particle stepping in any direction, rather than just towards either of the cells with water. I actually think we can fix this quite easily by switching the order we do things in steep_descent(): we should remove the initial location from consideration first, and then identify the highest remaining weights, and I think your concern would be addressed.
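The reordering suggested here could look something like the following (an illustrative sketch of the idea, not the actual steep_descent() code):

```python
import numpy as np

def steepest_choice(local_weights, rng=None):
    """local_weights: length-9 raveled weights around a particle;
    index 4 is the cell the particle currently occupies."""
    rng = rng or np.random.default_rng()
    w = np.asarray(local_weights, dtype=float).copy()
    w[4] = 0.                           # drop the initial location first
    if w.max() <= 0.:
        return 4                        # nowhere valid to go: stay put
    candidates = np.flatnonzero(w == w.max())
    return int(rng.choice(candidates))  # break remaining ties at random
```

Because the initial cell is zeroed before the maximum is found, a neighbor with any positive weight always wins over standing still.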

Lastly, all of these tests check that the weights along the boundary are zero, but they don't check if the weight of traveling to the boundary is zero, merely the weight of staying at the boundary once there (which I think we ruled out as a possibility elsewhere?).

Yeah, maybe the way we handle weights near the edges isn't the most consistent. But we actually have a separate function, check_for_boundary(), which prevents particles from iterating to any cell with a value of -1 in the cell_type array. This adds a bit of flexibility, as one can define additional "edges" with a custom cell_type array. That behavior is covered in test_check_for_boundary(), but I agree: if we folded that check into the weighting scheme to make it impossible to route into cells where cell_type == -1, that might be better than trying to send a particle somewhere and rescinding the motion after the fact.
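Folding that check into the weights could look roughly like this (an illustrative sketch, not current dorado behavior; the function name and layout are assumptions):

```python
import numpy as np

def mask_boundary_weights(weight, cell_type):
    """weight: (L, W, 9) routing weights; cell_type: (L, W) array
    where -1 marks boundary cells. Zeroes any weight pointing into a
    boundary cell, so such steps are never drawn."""
    L, W = cell_type.shape
    padded = np.pad(cell_type, 1, mode='constant', constant_values=-1)
    # Stack each cell's 9 D8 neighbor cell types along a third axis,
    # in raveled order [NW, N, NE, W, 0, E, SW, S, SE]
    neighbors = np.stack([padded[di:di + L, dj:dj + W]
                          for di in range(3) for dj in range(3)], axis=2)
    weight[neighbors == -1] = 0.
    return weight
```

Steps off the domain edge are also zeroed, since the padding itself is marked -1.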

@wrightky
Collaborator Author

wrightky commented May 5, 2021

I actually think we can fix this quite easily by switching the order we do things in steep_descent() we should remove the initial location from consideration first, and then identify the highest remaining weights, and I think your concern would be addressed.

As a quick clarification, in test_make_weight_unequal_opportunity(), the two nearby cells with non-zero depth have a weight of zero, similar to the surrounding cells with no water. This is because there's no stage or velocity gradient leading there, and those cells don't have the local max depth (and therefore aren't caught by the failsafe). So if we do want to route to those cells in a circumstance like that, it needs to happen upstream of steep_descent() somewhere in make_weight, or else the weighting array shows no other viable cells for travel. However, cases where the stage and velocity gradients are zero may be exactly when we'd need the particle to look for the deepest cell, so perhaps the problem is a bit ill-posed.

@elbeejay
Member

elbeejay commented May 5, 2021

As a quick clarification, in test_make_weight_unequal_opportunity(), the two nearby cells with non-zero depth have a weight of zero, similar to the surrounding cells with no water. This is because there's no stage or velocity gradient leading there, and those cells don't have the local max depth (and therefore aren't caught by the failsafe). So if we do want to route to those cells in a circumstance like that, it needs to happen upstream of steep_descent() somewhere in make_weight, or else the weighting array shows no other viable cells for travel.

Ah yes good point. I think I agree with what you say below:

However, cases where the stage and velocity gradients are zero may be exactly when we'd need the particle to look for the deepest cell, so perhaps the problem is a bit ill-posed.

There is the argument to be made that in the absence of a gradient we'd want the particle to sit still since it is already in the "deepest" cell in the neighborhood.

Member

@elbeejay elbeejay left a comment


Just went through the code and left a few minor comments; it looks good to me. There are a fair number of small changes to the logic that are pretty clever, I think, and that save us time by cutting if-statements and moving to these array operations.

dorado/lagrangian_walker.py (review thread: outdated, resolved)
@wrightky wrightky marked this pull request as ready for review May 7, 2021 19:32
@wrightky wrightky merged commit d32a050 into passaH2O:master May 7, 2021