Skip to content
This repository has been archived by the owner on Dec 22, 2021. It is now read-only.

Replace polyprism Cython code with pure numpy #368

Merged
merged 19 commits into from
May 8, 2017
Merged

Conversation

leouieda
Copy link
Member

@leouieda leouieda commented Dec 19, 2016

Replacing the Cython gravmag.polyprism forward modeling code
with simpler pure Python + numpy versions (following #364).
Using some optimizations and simplifications, the resulting code is just as fast or
even faster than the Cython version.
The main optimization is combining logarithms, e.g.
log(a/b) - log(c/d) with log((a*d)/(b*c)).

Using the following IPython script to benchmark against master:

from __future__ import division, print_function
import sys
import numpy as np
from fatiando import gridder, utils
from fatiando.mesher import PolygonalPrism
# Get processor information
tmp = !cat /proc/cpuinfo | grep "model name"
processor = tmp[0].split(':')[1].strip()
print(processor)
# Make a model for testing
vertices = np.transpose(gridder.circular_scatter([-300, 300, -300, 300], 50))
props = {'density': 1000, 'magnetization': utils.ang2vec(2, 25, -10)}
model = [PolygonalPrism(vertices, 0, 200, props),
         PolygonalPrism(vertices, 200, 300, props)]
inc, dec = -30, 50
x, y, z = gridder.regular((-500, 500, -500, 500), (70, 70), z=-1)
print('Model size: {}'.format(len(vertices)))
print('Grid size: {}'.format(x.size))
# Time the forward modeling of gravity, gradients and mag
from fatiando.gravmag import polyprism
print('Times:')
if len(sys.argv) == 1:
    fields = 'gz gxx gxy gxz gyy gyz gzz tf bz bx by'.split()
else:
    fields = sys.argv[1:]
for field in fields:
    print('  {}: '.format(field), end='')
    if field == 'tf':
        args = (x, y, z, model, inc, dec)
    else:
        args = (x, y, z, model)
    %timeit getattr(polyprism, field)(*args)

These are the times for the master branch on my laptop:

Intel(R) Core(TM) i5-4200U CPU @ 1.60GHz
Model size: 50
Grid size: 4900
Times:
  gz: 1 loop, best of 3: 272 ms per loop
  gxx: 1 loop, best of 3: 215 ms per loop
  gxy: 1 loop, best of 3: 222 ms per loop
  gxz: 10 loops, best of 3: 133 ms per loop
  gyy: 1 loop, best of 3: 229 ms per loop
  gyz: 10 loops, best of 3: 152 ms per loop
  gzz: 10 loops, best of 3: 141 ms per loop
  tf: 1 loop, best of 3: 1.2 s per loop
  bz: 1 loop, best of 3: 442 ms per loop
  bx: 1 loop, best of 3: 658 ms per loop
  by: 1 loop, best of 3: 611 ms per loop

And after the conversion and optimization:

Intel(R) Core(TM) i5-4200U CPU @ 1.60GHz
Model size: 50
Grid size: 4900
Times:
  gz: 1 loop, best of 3: 220 ms per loop
  gxx: 10 loops, best of 3: 155 ms per loop
  gxy: 10 loops, best of 3: 156 ms per loop
  gxz: 10 loops, best of 3: 130 ms per loop
  gyy: 10 loops, best of 3: 157 ms per loop
  gyz: 10 loops, best of 3: 129 ms per loop
  gzz: 10 loops, best of 3: 128 ms per loop
  tf: 1 loop, best of 3: 835 ms per loop
  bz: 1 loop, best of 3: 372 ms per loop
  bx: 1 loop, best of 3: 452 ms per loop
  by: 1 loop, best of 3: 443 ms per loop

Checklist:

  • Make tests for new code (at least 80% coverage)
  • Create/update docstrings
  • Include relevant equations and citations in docstrings
  • Docstrings follow the style conventions
  • Code follows PEP8 style conventions
  • Code and docs have been spellchecked
  • Include new dependencies in doc/install.rst, environment.yml, ci/requirements-conda.txt and ci/requirements-pip.txt.
  • Documentation builds properly (run cd doc; make locally)
  • Changelog entry (leave for last to avoid conflicts)

@leouieda
Copy link
Member Author

leouieda commented May 3, 2017

Currently trying to optimize gz. Tried to replace two arctan2 calls with a single one using an identity. Doesn't work because the identity isn't always valid.

Had success replacing a log subtraction with a division for moderate speed ups. Here is the benchmark result comparing to master:

master
gz: 1 loop, best of 3: 270 ms per loop
branch
gz: 1 loop, best of 3: 221 ms per loop

leouieda added a commit that referenced this pull request May 5, 2017
Simplify the tests of `polyprism` vs `prism` and add regression tests against saved
results, like was done for `sphere` in #364. 

This allows us to test more complex polygonal prisms (an ellipse for example) and 
catch regressions that might affect both `polyprism` and `prism` at the same time.

Test coverage decreases because the numpy alternative implementation is not
tested anymore. This will become the main version in #368 so it's not a big deal 
right now.
Could get a decent speed by replacing log subtraction with a single log.
Couldn't get the arctan identities to work for some reason.
@leouieda leouieda added this to the 0.6 milestone May 5, 2017
@leouieda
Copy link
Member Author

leouieda commented May 5, 2017

gzz didn't yield any speedups but got the same speed as Cython version.

@leouieda
Copy link
Member Author

leouieda commented May 5, 2017

Combined 4 log calls in kernelxx into a single one. Benchmark results:

master
gxx: 1 loop, best of 3: 213 ms per loop
branch
gxx: 10 loops, best of 3: 153 ms per loop

Couldn't optimize because collapsing the logs leads to a decreased
precision (larger difference with the prism code).
Not a problem because it's already pretty fast and the same as the
Cython speed.
Same optimization used for kernelxx (collapse logs)
@leouieda
Copy link
Member Author

leouieda commented May 5, 2017

Used same optimization for gyy (collapse logs).

Didn't optimize the logs for the same reason as kernelxz
They all use the kernel functions so got a bit of a speedup, specially
in tf because of that.
@leouieda
Copy link
Member Author

leouieda commented May 5, 2017

Finished optimizations (see results of benchmark in PR description).

Need to take a better look at the kernel functions.
Add them to the API list for polyprism.
@leouieda
Copy link
Member Author

leouieda commented May 8, 2017

@birocoles a bit of a heads-up on this PR. I've simplified some of your original code. Don't know if @leobvital is using the code for Fatiando or implemented his own, but this is much simpler to read now. The tests are also improved and I've learned how to use pytest better.

@leouieda leouieda merged commit b1abf5e into master May 8, 2017
@leouieda leouieda deleted the polyprism-numpy branch May 8, 2017 15:57
@birocoles
Copy link
Member

Very good @leouieda ! @leobvital is using the Fatiando code but I think that these changes will not affect his work.

leouieda added a commit that referenced this pull request May 20, 2017
Uses saved data from a simple 1 prism model to check for regressions in
the code. Same as done for polyprism in #368 and sphere in #364.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants