Replace polyprism Cython code with pure numpy #368

leouieda · 2016-12-19T00:04:50Z

This PR replaces the Cython gravmag.polyprism forward modeling code
with simpler pure Python + numpy versions (following #364).
With some optimizations and simplifications, the resulting code is as fast as or
faster than the Cython version.
The main optimization is combining logarithms, e.g. replacing
log(a/b) - log(c/d) with log((a*d)/(b*c)).
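
As a minimal sketch of that rewrite (not the actual kernel code; the function names and the random test values are made up for illustration), the two forms can be compared numerically. The combined form needs one log call instead of two and assumes all inputs are positive:

```python
import numpy as np


def log_difference(a, b, c, d):
    """Two log calls: log(a/b) - log(c/d)."""
    return np.log(a / b) - np.log(c / d)


def log_combined(a, b, c, d):
    """One log call: log((a*d)/(b*c)). Assumes all inputs are positive."""
    return np.log((a * d) / (b * c))


# Hypothetical positive test values, not taken from the polyprism code
rng = np.random.RandomState(0)
a, b, c, d = rng.uniform(1.0, 10.0, size=(4, 1000))
max_diff = np.abs(log_difference(a, b, c, d) - log_combined(a, b, c, d)).max()
print('Largest difference between the two forms: {}'.format(max_diff))
```

For well-scaled inputs like these, the two forms agree to within floating-point rounding.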

I used the following IPython script to benchmark against master:

from __future__ import division, print_function
import sys
import numpy as np
from fatiando import gridder, utils
from fatiando.mesher import PolygonalPrism
# Get processor information
tmp = !cat /proc/cpuinfo | grep "model name"
processor = tmp[0].split(':')[1].strip()
print(processor)
# Make a model for testing
vertices = np.transpose(gridder.circular_scatter([-300, 300, -300, 300], 50))
props = {'density': 1000, 'magnetization': utils.ang2vec(2, 25, -10)}
model = [PolygonalPrism(vertices, 0, 200, props),
         PolygonalPrism(vertices, 200, 300, props)]
inc, dec = -30, 50
x, y, z = gridder.regular((-500, 500, -500, 500), (70, 70), z=-1)
print('Model size: {}'.format(len(vertices)))
print('Grid size: {}'.format(x.size))
# Time the forward modeling of gravity, gradients and mag
from fatiando.gravmag import polyprism
print('Times:')
if len(sys.argv) == 1:
    fields = 'gz gxx gxy gxz gyy gyz gzz tf bz bx by'.split()
else:
    fields = sys.argv[1:]
for field in fields:
    print('  {}: '.format(field), end='')
    if field == 'tf':
        args = (x, y, z, model, inc, dec)
    else:
        args = (x, y, z, model)
    %timeit getattr(polyprism, field)(*args)
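
The script above relies on IPython magics (`!cat` and `%timeit`), so it won't run in plain Python. A rough stand-in for the "best of 3" timing using only the standard library could look like the sketch below. The `best_of` helper is hypothetical, and `sum` is only a placeholder workload for `getattr(polyprism, field)`:

```python
from __future__ import division, print_function
import timeit


def best_of(func, args=(), repeat=3, number=1):
    """Return the smallest time per call (in seconds) over `repeat` runs."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number


# Placeholder workload; the benchmark would pass the forward modeling
# function and its (x, y, z, model) arguments instead.
elapsed = best_of(sum, args=(range(100000),))
print('  sum: best of 3: {:.3f} ms per loop'.format(elapsed * 1e3))
```

Unlike `%timeit`, this sketch does not pick the number of loops automatically.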

These are the times for the master branch on my laptop:

Intel(R) Core(TM) i5-4200U CPU @ 1.60GHz
Model size: 50
Grid size: 4900
Times:
  gz: 1 loop, best of 3: 272 ms per loop
  gxx: 1 loop, best of 3: 215 ms per loop
  gxy: 1 loop, best of 3: 222 ms per loop
  gxz: 10 loops, best of 3: 133 ms per loop
  gyy: 1 loop, best of 3: 229 ms per loop
  gyz: 10 loops, best of 3: 152 ms per loop
  gzz: 10 loops, best of 3: 141 ms per loop
  tf: 1 loop, best of 3: 1.2 s per loop
  bz: 1 loop, best of 3: 442 ms per loop
  bx: 1 loop, best of 3: 658 ms per loop
  by: 1 loop, best of 3: 611 ms per loop

And after the conversion and optimization:

Intel(R) Core(TM) i5-4200U CPU @ 1.60GHz
Model size: 50
Grid size: 4900
Times:
  gz: 1 loop, best of 3: 220 ms per loop
  gxx: 10 loops, best of 3: 155 ms per loop
  gxy: 10 loops, best of 3: 156 ms per loop
  gxz: 10 loops, best of 3: 130 ms per loop
  gyy: 10 loops, best of 3: 157 ms per loop
  gyz: 10 loops, best of 3: 129 ms per loop
  gzz: 10 loops, best of 3: 128 ms per loop
  tf: 1 loop, best of 3: 835 ms per loop
  bz: 1 loop, best of 3: 372 ms per loop
  bx: 1 loop, best of 3: 452 ms per loop
  by: 1 loop, best of 3: 443 ms per loop
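
To make the comparison easier to read, the speedup per field can be computed from the two listings above. This is plain arithmetic on the reported times (in ms, with 1.2 s for tf written as 1200 ms):

```python
from __future__ import division, print_function

# Times in ms copied from the two benchmark listings above
master = {'gz': 272, 'gxx': 215, 'gxy': 222, 'gxz': 133, 'gyy': 229,
          'gyz': 152, 'gzz': 141, 'tf': 1200, 'bz': 442, 'bx': 658,
          'by': 611}
branch = {'gz': 220, 'gxx': 155, 'gxy': 156, 'gxz': 130, 'gyy': 157,
          'gyz': 129, 'gzz': 128, 'tf': 835, 'bz': 372, 'bx': 452,
          'by': 443}

# Speedup > 1 means the numpy branch is faster than master
speedup = {field: master[field] / branch[field] for field in master}
for field in sorted(speedup, key=speedup.get, reverse=True):
    print('  {}: {:.2f}x'.format(field, speedup[field]))
```

On these numbers every field is at least as fast as master, ranging from about 1.02x (gxz) to about 1.46x (gyy).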

Checklist:

Make tests for new code (at least 80% coverage)
Create/update docstrings
Include relevant equations and citations in docstrings
Docstrings follow the style conventions
Code follows PEP8 style conventions
Code and docs have been spellchecked
Include new dependencies in doc/install.rst, environment.yml, ci/requirements-conda.txt and ci/requirements-pip.txt.
Documentation builds properly (run `cd doc; make` locally)
Changelog entry (leave for last to avoid conflicts)

leouieda · 2017-05-03T21:20:04Z

Currently trying to optimize gz. Tried to replace two arctan2 calls with a single one using an identity. Doesn't work because the identity isn't always valid.
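
The comment doesn't say which identity was tried. As an illustration only, the sketch below uses the textbook angle-addition identity for arctan2 (an assumption, not necessarily the one used here). It matches the sum of two arctan2 calls only up to a multiple of 2*pi, so it breaks whenever the summed angle leaves the (-pi, pi] range:

```python
from __future__ import print_function
import numpy as np


def two_calls(y1, x1, y2, x2):
    """Sum of two angles computed with two arctan2 calls."""
    return np.arctan2(y1, x1) + np.arctan2(y2, x2)


def one_call(y1, x1, y2, x2):
    """Angle-addition identity: same angle only modulo 2*pi."""
    return np.arctan2(y1 * x2 + x1 * y2, x1 * x2 - y1 * y2)


# Small angles: both forms give pi/4
small = two_calls(1.0, 2.0, 1.0, 3.0) - one_call(1.0, 2.0, 1.0, 3.0)
# Two angles of 3*pi/4: the sum is 3*pi/2 but the identity returns -pi/2
large = two_calls(1.0, -1.0, 1.0, -1.0) - one_call(1.0, -1.0, 1.0, -1.0)
print('difference for small angles:', small)
print('difference for large angles:', large)
```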

Had success replacing a subtraction of two logs with a single log of a quotient for a moderate speedup. Here is the benchmark result compared to master:

master
gz: 1 loop, best of 3: 270 ms per loop
branch
gz: 1 loop, best of 3: 221 ms per loop

Simplify the tests of `polyprism` vs `prism` and add regression tests against saved results, as was done for `sphere` in #364. This allows us to test more complex polygonal prisms (an ellipse, for example) and catch regressions that might affect both `polyprism` and `prism` at the same time. Test coverage decreases because the alternative numpy implementation is no longer tested. It will become the main version in #368, so this is not a big deal right now.
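
A regression test against saved results could be sketched roughly as below. This is not the test code from #395: `forward` is a hypothetical stand-in for a forward modeling function such as `polyprism.gz`, and the file name, grid, and tolerance are assumptions made for the example.

```python
import os
import tempfile

import numpy as np
import numpy.testing as npt


def forward(x, y, z):
    """Hypothetical stand-in for a forward modeling function."""
    return 1.0 / np.sqrt(x**2 + y**2 + (z - 100.0)**2)


def check_regression(datafile, x, y, z):
    """Compare current results with saved ones (saved on the first run)."""
    if not os.path.exists(datafile):
        np.save(datafile, forward(x, y, z))
    saved = np.load(datafile)
    npt.assert_allclose(forward(x, y, z), saved, rtol=1e-10)


# Small regular grid of observation points at z = -1
grid = np.meshgrid(np.linspace(-500, 500, 10), np.linspace(-500, 500, 10))
x, y = [coord.ravel() for coord in grid]
z = -np.ones_like(x)

datafile = os.path.join(tempfile.mkdtemp(), 'forward-regression.npy')
check_regression(datafile, x, y, z)  # first call stores the reference
check_regression(datafile, x, y, z)  # later calls compare against it
```

In a real test suite the reference file would be generated once from a trusted version of the code and committed alongside the tests, rather than created on the fly.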

leouieda · 2017-05-05T16:09:25Z

gzz didn't yield any speedups but runs at the same speed as the Cython version.

leouieda · 2017-05-05T17:58:10Z

Combined 4 log calls in kernelxx into a single one. Benchmark results:

master
gxx: 1 loop, best of 3: 213 ms per loop
branch
gxx: 10 loops, best of 3: 153 ms per loop

leouieda · 2017-05-05T19:23:52Z

Used same optimization for gyy (collapse logs).

leouieda · 2017-05-05T20:03:50Z

Finished optimizations (see results of benchmark in PR description).

leouieda · 2017-05-08T15:55:58Z

@birocoles a bit of a heads-up on this PR. I've simplified some of your original code. I don't know if @leobvital is using the code from Fatiando or has implemented his own, but this is much simpler to read now. The tests are also improved and I've learned how to use pytest better.

birocoles · 2017-05-08T17:46:06Z

Very good, @leouieda! @leobvital is using the Fatiando code, but I think these changes will not affect his work.

leouieda added the enhancement label Dec 19, 2016

leouieda force-pushed the polyprism-numpy branch from 740b1e6 to e49814a Compare May 3, 2017 21:12

leouieda mentioned this pull request May 4, 2017

Implement regression tests for gravmag.polyprism #395

Merged

9 tasks

Test using numpy instead of Cython code

2fe6daf

Could get a decent speedup by replacing the log subtraction with a single log. Couldn't get the arctan identities to work for some reason.

leouieda added this to the 0.6 milestone May 5, 2017

Convert and optimize gz as much as I can

7054327

leouieda force-pushed the polyprism-numpy branch from e49814a to 7054327 Compare May 5, 2017 15:14

leouieda added 3 commits May 5, 2017 05:21

PEP8 fixes

e11bc0f

Convert gzz to numpy only and simplify code

d02b72f

No speedups for gzz but got the same speed as master on the benchmark.

Use fixture in prism comparison and test kernels

be8f7d3

leouieda added 3 commits May 5, 2017 06:27

Use the kernel function for gzz

bab0116

Convert gxx and its kernel but no optimizations

9f16319

Optimize kernelxx by combining logs

de97b9a

Combine the 4 log calls into one.

leouieda added 5 commits May 5, 2017 08:17

Convert kernelxy to numpy only

29f7dfa

Optimize gxy by collapsing log calls

dd33b24

Convert kernelxz and gxz

ea2e296

Couldn't optimize because collapsing the logs reduces precision (larger difference from the prism code). Not a problem because it's already pretty fast and matches the Cython speed.
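
The commit message doesn't say where the precision is lost. One generic way to check a rewrite like this before adopting it is to compute both forms at reduced precision and compare each against a float64 reference. The sketch below does that with made-up positive inputs; it is not the kernelxz code, and which form comes out ahead depends on the data:

```python
from __future__ import print_function
import numpy as np

# Hypothetical positive inputs, not taken from the polyprism kernels
rng = np.random.RandomState(42)
a, b, c, d = rng.uniform(1.0, 1e3, size=(4, 10000))

# Reference computed in float64
reference = np.log(a / b) - np.log(c / d)

# Both forms evaluated in float32 to exaggerate rounding errors
a32, b32, c32, d32 = [v.astype(np.float32) for v in (a, b, c, d)]
separate = np.log(a32 / b32) - np.log(c32 / d32)
collapsed = np.log((a32 * d32) / (b32 * c32))

err_separate = np.abs(separate - reference).max()
err_collapsed = np.abs(collapsed - reference).max()
print('max error, separate logs: ', err_separate)
print('max error, collapsed logs:', err_collapsed)
```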

Use kernel functions in all tensor components

1c4939e

Convert and optimize kernelyy

5de9ed9

Same optimization used for kernelxx (collapse logs)

leouieda added 3 commits May 5, 2017 09:30

Convert and simplify kernelyz

3a7a9e5

Didn't optimize the logs for the same reason as kernelxz

Convert mag functions for slight speedup

7206bff

They all use the kernel functions, so they got a bit of a speedup, especially tf.
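
As a rough sketch of how magnetic components can be built from the kernels: this assumes the kernels are the second derivatives of the 1/r volume integral and that tf is the projection of the field onto the inducing field direction. The function names, the dict layout, and the constants are written out here for illustration and are not copied from the polyprism module:

```python
import numpy as np

CM = 1e-7    # mu_0 / (4 pi) in SI units
T2NT = 1e9   # conversion from Tesla to nanotesla


def field_from_kernels(kernels, mag):
    """Combine the 6 kernel arrays with a magnetization vector (mx, my, mz)."""
    mx, my, mz = mag
    bx = kernels['xx'] * mx + kernels['xy'] * my + kernels['xz'] * mz
    by = kernels['xy'] * mx + kernels['yy'] * my + kernels['yz'] * mz
    bz = kernels['xz'] * mx + kernels['yz'] * my + kernels['zz'] * mz
    return CM * T2NT * np.array([bx, by, bz])


def total_field(kernels, mag, fdir):
    """Project the field onto the unit vector fdir of the inducing field."""
    return np.dot(fdir, field_from_kernels(kernels, mag))


# Toy kernel values at 5 observation points (only kxx is non-zero)
npoints = 5
kernels = {name: np.zeros(npoints)
           for name in ('xx', 'xy', 'xz', 'yy', 'yz', 'zz')}
kernels['xx'] += 1.0
tf = total_field(kernels, mag=(1.0, 0.0, 0.0), fdir=(1.0, 0.0, 0.0))
```

Because the six kernels are evaluated once and shared by bx, by, and bz, any speedup in the kernels carries over to all the magnetic functions.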

Remove polyprism Cython and C code

fe34d90

leouieda added 3 commits May 5, 2017 10:16

Update function docstrings

27c9baf

Need to take a better look at the kernel functions.

Better docstrings for kernel functions

ae94418

Add them to the API list for polyprism.

Changelog entry for polyprism refactor

f4741e2

leouieda merged commit b1abf5e into master May 8, 2017

leouieda deleted the polyprism-numpy branch May 8, 2017 15:57

leouieda added a commit that referenced this pull request May 20, 2017

Implement regression tests for gravmag prism code

810f5b4

Uses saved data from a simple one-prism model to check for regressions in the code. Same as was done for polyprism in #368 and sphere in #364.

leouieda mentioned this pull request May 20, 2017

Convert gravmag.prism Cython code to numpy #398

Closed

9 tasks
