Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compiler: Implement graceful lowering of derivatives (aka "unexpansion") #2060

Merged
merged 61 commits into from
Feb 13, 2023
Merged
Show file tree
Hide file tree
Changes from 57 commits
Commits
Show all changes
61 commits
Select commit Hold shift + click to select a range
f799a72
compiler: Prototype unexpansion
FabioLuporini Nov 24, 2022
ffc8c21
compiler: Revamp code generation from _C_ctype
FabioLuporini Nov 25, 2022
91eee45
compiler: Support trivial unexpanded-derivatives examples
FabioLuporini Nov 26, 2022
0eddd5d
dsl: Patch cross_derivative evaluation
FabioLuporini Nov 26, 2022
63f41e8
dsl: Introduce Spacing subclass
FabioLuporini Nov 28, 2022
a94ab05
compiler: Patch StencilDimension reconstruction
FabioLuporini Nov 28, 2022
ce7c96c
compiler: Extend unexpansion machinery
FabioLuporini Nov 26, 2022
4d63f17
compiler: Support StencilDimension in estimate_cost
FabioLuporini Dec 3, 2022
380d262
compiler: Enhance fusion upon lower_index_derivative
FabioLuporini Dec 3, 2022
45b5fa4
compiler: Rework is_cross rule for Cluster fusion
FabioLuporini Dec 10, 2022
8805cd9
compiler: Implement maximal fusion for lowered IndexDerivatives
FabioLuporini Dec 12, 2022
573ba59
compiler: Patch profiling in presence of StencilDimensions
FabioLuporini Dec 12, 2022
fda18ef
compiler: Improve IndexDerivative lowering to catch duplicates
FabioLuporini Dec 12, 2022
be02291
compiler: Enhance pow_to_mul to work around SymPy misbehavior
FabioLuporini Dec 13, 2022
c069dfe
compiler: Rework globals generation for device backends
FabioLuporini Dec 13, 2022
573b4cb
compiler: Rework weights generation for device backends
FabioLuporini Dec 13, 2022
eb8616c
compiler: Patch index mode detection with StencilDimensions
FabioLuporini Dec 15, 2022
0044b3b
compiler: Add IndexDerivative.mapper
FabioLuporini Dec 22, 2022
fb6c226
compiler: Patch lower_index_derivative
FabioLuporini Dec 23, 2022
9a6679a
tests: Patch draft flaky unexpansion test
FabioLuporini Dec 23, 2022
958c9a0
compiler: Patch lower_index_derivatives
FabioLuporini Dec 23, 2022
3f73098
compiler: Patch globs codegen for deterministic output
FabioLuporini Dec 27, 2022
8d28be1
compiler: Relax Properties manipulation methods
FabioLuporini Dec 27, 2022
3eecf4d
compiler: Change IndexDerivative.mapper
FabioLuporini Dec 30, 2022
9a843ba
compiler: Add IterationSpace.translate
FabioLuporini Jan 3, 2023
52e076a
compiler: Move IndexSum.mapper to IndexDerivative.mapper
FabioLuporini Jan 3, 2023
a993d0d
compiler: Patch IndexDerivative.mapper
FabioLuporini Jan 5, 2023
417dac4
compiler: Relax WAR dependencies involving shared Array
FabioLuporini Jan 9, 2023
9c6ac52
compiler: Maximize likelihood of fusing clusters over shm
FabioLuporini Jan 9, 2023
42627b1
compiler: Improve data dependence analysis
FabioLuporini Jan 10, 2023
fead97c
compiler: Add Jump mixin class
FabioLuporini Jan 11, 2023
19f6af3
compiler: Patch collect_derivative pass
FabioLuporini Jan 13, 2023
17ab99b
compiler: Add shm-related heuristics to Cluster fusion
FabioLuporini Jan 20, 2023
c2044d5
compiler: Add Properties methods
FabioLuporini Jan 21, 2023
60e2e03
compiler: Make IndexDerivatives comparable; fix their CSE
FabioLuporini Jan 25, 2023
e5ea4d6
compiler: Draft Guards, akin to Properties
FabioLuporini Jan 31, 2023
9d579b1
compiler: Rework customization of clusters visitors
FabioLuporini Jan 31, 2023
fcc2e86
compiler: Fix Cluster properties normalization at init
FabioLuporini Jan 31, 2023
50025f5
compiler: Extend uxreplace to substitute types as well
FabioLuporini Feb 3, 2023
3496221
compiler: Fixup linearization with isolated routines
FabioLuporini Feb 3, 2023
fcfa448
misc: Fixup pep8 violations
FabioLuporini Feb 4, 2023
c562c30
compiler: Introduce AffineIndexAccessFunction
FabioLuporini Feb 6, 2023
43e700f
compiler: Improve IndexDerivative
FabioLuporini Feb 6, 2023
a588b6d
compiler: Enhance dtype retrieval
FabioLuporini Feb 6, 2023
18bc963
compiler: Tidy up Interval.expand()
FabioLuporini Feb 6, 2023
2640df3
compiler: Drop has_free for compatibility with older SymPy versions
FabioLuporini Feb 7, 2023
5113825
examples: Update expected notebook output
FabioLuporini Feb 7, 2023
0297597
compiler: Patch codegen upon pow_to_mul
FabioLuporini Feb 7, 2023
2c5eb4d
misc: Postpone codegen speed improvement
FabioLuporini Feb 7, 2023
403a194
examples: Update expected output
FabioLuporini Feb 8, 2023
62e0fc8
examples: Disable openmp where necessary due to issue 2061
FabioLuporini Feb 8, 2023
0c801bd
compiler: Exploit SubDim.local to support nasty deps in examples
FabioLuporini Feb 8, 2023
a97f6e6
ci: Drop support for gcc5, sympy1.7, sympy1.8
FabioLuporini Feb 8, 2023
77eae11
compiler: Add IndexDerivative.total_order
FabioLuporini Feb 8, 2023
9af12f4
arch: Enable openmp with nvc on CPU
FabioLuporini Feb 8, 2023
f33fcfc
ci: Add back forgotten gcc-11
FabioLuporini Feb 9, 2023
88fafce
compiler: Tweak lower_index_derivatives
FabioLuporini Feb 9, 2023
dde53cc
compiler: IndexDerivative.total_order -> depth
FabioLuporini Feb 10, 2023
2ce2797
misc: Tweak docstring
FabioLuporini Feb 10, 2023
95b482c
compiler: Lift overrides from AffineIndexAccessFunc into IndexAccessFunc
FabioLuporini Feb 10, 2023
0453503
arch: Add amdclang mapping
FabioLuporini Feb 10, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 9 additions & 9 deletions .github/workflows/pytest-core-nompi.yml
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ jobs:

matrix:
name: [
pytest-ubuntu-py37-gcc5-omp,
pytest-ubuntu-py39-gcc11-noomp,
pytest-ubuntu-py38-gcc12-omp,
pytest-ubuntu-py36-gcc7-omp,
pytest-ubuntu-py310-gcc10-noomp,
Expand All @@ -38,19 +38,19 @@ jobs:
]
set: [base, adjoint]
include:
- name: pytest-ubuntu-py37-gcc5-omp
python-version: '3.7'
os: ubuntu-18.04
arch: "gcc-5"
mloubout marked this conversation as resolved.
Show resolved Hide resolved
language: "openmp"
sympy: "1.7"
- name: pytest-ubuntu-py39-gcc11-noomp
python-version: '3.9'
os: ubuntu-22.04
arch: "gcc-11"
language: "C"
sympy: "1.11"

- name: pytest-ubuntu-py38-gcc12-omp
python-version: '3.8'
os: ubuntu-22.04
arch: "gcc-12"
language: "openmp"
sympy: "1.8"
sympy: "1.10"

- name: pytest-ubuntu-py36-gcc7-omp
python-version: '3.6'
Expand Down Expand Up @@ -78,7 +78,7 @@ jobs:
os: ubuntu-20.04
arch: "gcc-9"
language: "openmp"
sympy: "1.8"
sympy: "1.9"

- name: pytest-osx-py37-clang-omp
python-version: '3.7'
Expand Down
7 changes: 5 additions & 2 deletions devito/arch/compiler.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,8 @@
from codepy.jit import compile_from_string
from codepy.toolchain import GCCToolchain

from devito.arch import (AMDGPUX, NVIDIAX, M1, SKX, POWER8, POWER9, get_nvidia_cc,
check_cuda_runtime, get_m1_llvm_path)
from devito.arch import (AMDGPUX, Cpu64, M1, NVIDIAX, SKX, POWER8, POWER9,
get_nvidia_cc, check_cuda_runtime, get_m1_llvm_path)
from devito.exceptions import CompilationError
from devito.logger import debug, warning, error
from devito.parameters import configuration
Expand Down Expand Up @@ -535,6 +535,9 @@ def __init__(self, *args, **kwargs):
self.cflags.extend(['-mp', '-acc:gpu'])
elif language == 'openmp':
self.cflags.extend(['-mp=gpu'])
elif isinstance(platform, Cpu64):
if language == 'openmp':
self.cflags.append('-mp')

if not configuration['safe-math']:
self.cflags.append('-fast')
Expand Down
1 change: 1 addition & 0 deletions devito/core/cpu.py
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,7 @@ def _normalize_kwargs(cls, **kwargs):
o['par-nested'] = oo.pop('par-nested', cls.PAR_NESTED)

# Misc
o['expand'] = oo.pop('expand', cls.EXPAND)
o['optcomms'] = oo.pop('optcomms', True)
o['linearize'] = oo.pop('linearize', False)
o['mapify-reduce'] = oo.pop('mapify-reduce', cls.MAPIFY_REDUCE)
Expand Down
1 change: 1 addition & 0 deletions devito/core/gpu.py
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,7 @@ def _normalize_kwargs(cls, **kwargs):
o['gpu-fit'] = as_tuple(oo.pop('gpu-fit', cls._normalize_gpu_fit(**kwargs)))

# Misc
o['expand'] = oo.pop('expand', cls.EXPAND)
o['optcomms'] = oo.pop('optcomms', True)
o['linearize'] = oo.pop('linearize', False)
o['mapify-reduce'] = oo.pop('mapify-reduce', cls.MAPIFY_REDUCE)
Expand Down
7 changes: 7 additions & 0 deletions devito/core/operator.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
class BasicOperator(Operator):

# Default values for various optimization options

CSE_MIN_COST = 1
"""
Minimum computational cost of an operation to be eliminated as a
Expand Down Expand Up @@ -88,6 +89,12 @@ class BasicOperator(Operator):
which may be easier to parallelize for certain backends.
"""

EXPAND = True
"""
Unroll all loops with short, numeric trip count, such as loops created by
finite-difference derivatives.
"""

MPI_MODES = tuple(mpi_registry)
"""
The supported MPI modes.
Expand Down
9 changes: 6 additions & 3 deletions devito/finite_differences/derivative.py
Original file line number Diff line number Diff line change
Expand Up @@ -83,9 +83,11 @@ class Derivative(sympy.Derivative, Differentiable):
evaluation are `x0`, `fd_order` and `side`.
"""

_state = ('expr', 'dims', 'side', 'fd_order', 'transpose', '_ppsubs', 'x0')
_fd_priority = 3

__rargs__ = ('expr', 'dims')
__rkwargs__ = ('side', 'deriv_order', 'fd_order', 'transpose', '_ppsubs', 'x0')

def __new__(cls, expr, *dims, **kwargs):
if type(expr) == sympy.Derivative:
raise ValueError("Cannot nest sympy.Derivative with devito.Derivative")
Expand All @@ -104,7 +106,8 @@ def __new__(cls, expr, *dims, **kwargs):
obj._deriv_order = orders if skip else DimensionTuple(*orders, getters=obj._dims)
obj._side = kwargs.get("side")
obj._transpose = kwargs.get("transpose", direct)
obj._ppsubs = as_tuple(frozendict(i) for i in kwargs.get("subs", []))
obj._ppsubs = as_tuple(frozendict(i) for i in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if there is both?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That'd be nonsensical?

And even then, it would be an issue in master as well since we never attempt to read from the ppsubs, no?

As far as I understand, ppsubs is just an internal attribute that stashes user-provided subs via .subs(...). The only reason I added it is because now Derivative reconstruction exploits (finally!) the Reconstructable infrastructure based on __rargs__ and __rkwargs__, and ppsubs is one such __rkwargs__

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I would ditch subs kwarg, it's only used in one place (xreplace) so would replace it there by _ppsubs=subs and only have one

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I tried that but something backfired, not sure I remember what now. I'll give it another try tomorrow morning

kwargs.get("subs", kwargs.get("_ppsubs", [])))
obj._x0 = frozendict(kwargs.get('x0', {}))
return obj

Expand Down Expand Up @@ -237,7 +240,7 @@ def _xreplace(self, subs):

@cached_property
def _metadata(self):
state = list(self._state)
state = list(self.__rargs__ + self.__rkwargs__)
state.remove('expr')
ret = [getattr(self, i) for i in state]
ret.append(self.expr.staggered or (None,))
Expand Down
103 changes: 84 additions & 19 deletions devito/finite_differences/differentiable.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,20 +2,22 @@
from itertools import product
from functools import singledispatch

from cached_property import cached_property
import numpy as np
import sympy
from sympy.core.add import _addsort
from sympy.core.mul import _keep_coeff, _mulsort
from sympy.core.decorators import call_highest_priority
from sympy.core.evalf import evalf_table

from cached_property import cached_property
from devito.finite_differences.tools import make_shift_x0
from devito.logger import warning
from devito.tools import as_tuple, filter_ordered, flatten, is_integer, split
from devito.types import Array, DimensionTuple, Evaluable, StencilDimension
from devito.tools import (as_tuple, filter_ordered, flatten, frozendict,
infer_dtype, is_integer, split)
from devito.types import (Array, DimensionTuple, Evaluable, Indexed, Spacing,
StencilDimension)

__all__ = ['Differentiable', 'IndexDerivative', 'EvalDerivative']
__all__ = ['Differentiable', 'IndexDerivative', 'EvalDerivative', 'Weights']


class Differentiable(sympy.Expr, Evaluable):
Expand All @@ -29,7 +31,7 @@ class Differentiable(sympy.Expr, Evaluable):
# operators to be used
_op_priority = sympy.Expr._op_priority + 1.

_state = ('space_order', 'time_order', 'indices')
__rkwargs__ = ('space_order', 'time_order', 'indices')

@cached_property
def _functions(self):
Expand Down Expand Up @@ -63,6 +65,11 @@ def grid(self):
except KeyError:
return None

@cached_property
def dtype(self):
dtypes = {f.dtype for f in self.find(Indexed)} - {None}
return infer_dtype(dtypes)

@cached_property
def indices(self):
return tuple(filter_ordered(flatten(getattr(i, 'indices', ())
Expand Down Expand Up @@ -222,7 +229,8 @@ def __neg__(self):

def __eq__(self, other):
return super(Differentiable, self).__eq__(other) and\
all(getattr(self, i, None) == getattr(other, i, None) for i in self._state)
all(getattr(self, i, None) == getattr(other, i, None)
for i in self.__rkwargs__)

@property
def name(self):
Expand Down Expand Up @@ -503,6 +511,8 @@ class IndexSum(DifferentiableOp):
Dimensions, of an indexed expression.
"""

__rargs__ = ('expr', 'dimensions')

is_commutative = True

def __new__(cls, expr, dimensions, **kwargs):
Expand All @@ -517,16 +527,21 @@ def __new__(cls, expr, dimensions, **kwargs):
pass
raise ValueError("Expected Dimension with numeric size, "
"got `%s` instead" % d)
if not expr.has_free(*dimensions):

# TODO: `has_free` only available with SymPy v>=1.10
# We should start using `not expr.has_free(*dimensions)` once we drop
# support for SymPy 1.8<=v<1.0
if not all(d in expr.free_symbols for d in dimensions):
raise ValueError("All Dimensions `%s` must appear in `expr` "
"as free variables" % str(dimensions))

for i in expr.find(IndexSum):
for d in dimensions:
if d in i.dimensions:
raise ValueError("Dimension `%s` already appears in a "
"nested tensor contraction" % d)

obj = sympy.Expr.__new__(cls, expr, *dimensions)
obj = sympy.Expr.__new__(cls, expr)
mloubout marked this conversation as resolved.
Show resolved Hide resolved
obj._expr = expr
obj._dimensions = dimensions

Expand All @@ -538,6 +553,9 @@ def __repr__(self):

__str__ = __repr__

def _hashable_content(self):
return super()._hashable_content() + (self.dimensions,)

@property
def expr(self):
return self._expr
Expand All @@ -563,6 +581,8 @@ def _evaluate(self, **kwargs):
def free_symbols(self):
return super().free_symbols - set(self.dimensions)

func = DifferentiableOp._rebuild


class Weights(Array):

Expand All @@ -579,47 +599,92 @@ def __init_finalize__(self, *args, **kwargs):
assert isinstance(d, StencilDimension) and d.symbolic_size == len(weights)
assert isinstance(weights, (list, tuple, np.ndarray))

kwargs['scope'] = 'static'
try:
self._spacings = set().union(*[i.find(Spacing) for i in weights])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just set(i.find(Spacing) for i in weights) ?

except AttributeError:
self._spacing = set()

kwargs['scope'] = 'constant'

super().__init_finalize__(*args, **kwargs)

def __eq__(self, other):
return (isinstance(other, Weights) and
self.dimension is other.dimension and
self.name == other.name and
self.indices == other.indices and
self.weights == other.weights)

__hash__ = sympy.Basic.__hash__

def _hashable_content(self):
return super()._hashable_content() + (self.name,) + tuple(self.weights)

@property
def dimension(self):
return self.dimensions[0]

@property
def spacings(self):
return self._spacings

weights = Array.initvalue


class IndexDerivative(IndexSum):

def __new__(cls, expr, dimensions, **kwargs):
if not (expr.is_Mul and len(expr.args) == 2):
raise ValueError("Expect expr*weights, got `%s` instead" % str(expr))
_, weights = expr.args
if not isinstance(weights, Weights):
# All of the SymPy versions we support end up placing the Weights
# array here, so if something changes we'll get an alarm through
# this exception
raise ValueError("Couldn't find weights array")
__rargs__ = ('expr', 'mapper')

def __new__(cls, expr, mapper, **kwargs):
dimensions = as_tuple(mapper.values())

# Detect the Weights among the arguments
weightss = []
for a in expr.args:
try:
f = a.function
except AttributeError:
continue
if isinstance(f, Weights):
weightss.append(a)

# Sanity check
if not (expr.is_Mul and len(weightss) == 1):
raise ValueError("Expect `expr*weights`, got `%s` instead" % str(expr))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expected (?)

weights = weightss.pop()

obj = super().__new__(cls, expr, dimensions)
obj._weights = weights
obj._mapper = frozendict(mapper)

return obj

def _hashable_content(self):
return super()._hashable_content() + (self.mapper,)

@property
def weights(self):
return self._weights

@property
def mapper(self):
return self._mapper

@property
def total_order(self):
iderivs = self.expr.find(IndexDerivative)
return 1 + max([i.total_order for i in iderivs], default=0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why 1+?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self is an IndexDerivative :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IndexDerivative don't have an order? If an IndexDerivative is always 1 it's gonna lead to inconsistency with the actual order of the derivative.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have an IndexDerivative (say self), the total_order will definitely be 1 + X. 1 is for self; X depends on the arguments. This still looks correct to me , it's also used in two tests (admittedly in a very simple way)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 is for self;

But why is it one it could be any order derivative it's the representation of the stencil expression that corresponds to a derivative with deriv_order order not just 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, I see what you mean now, sorry!

I don't even have the deriv_order information at this point anymore. Perhaps I should attach it when IndexDerivative is created inside finite_differences.py. There it's definitely available!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just realized that what I need for the heuristics I have in place in PRO is the concept of depth, not the total_order (I'm not that sophisticated yet...). So I'm renaming it as "depth", which is what the current implementation represents.


def _evaluate(self, **kwargs):
expr = super()._evaluate(**kwargs)

if not kwargs.get('expand', True):
return expr

w = self.weights
f = w.function
d = w.dimension
mapper = {w.subs(d, i): w.weights[n] for n, i in enumerate(d.range)}
mapper = {w.subs(d, i): f.weights[n] for n, i in enumerate(d.range)}
expr = expr.xreplace(mapper)

return expr
Expand Down
9 changes: 5 additions & 4 deletions devito/finite_differences/finite_difference.py
Original file line number Diff line number Diff line change
Expand Up @@ -94,7 +94,7 @@ def first_derivative(expr, dim, fd_order=None, side=centered, matvec=direct, x0=

@check_input
@check_symbolic
def cross_derivative(expr, dims, fd_order, deriv_order, **kwargs):
def cross_derivative(expr, dims, fd_order, deriv_order, x0=None, **kwargs):
"""
Arbitrary-order cross derivative of a given expression.

Expand Down Expand Up @@ -155,9 +155,10 @@ def cross_derivative(expr, dims, fd_order, deriv_order, **kwargs):
(-1/h_y)*(-f(1, 2)*g(1, 2)/h_x + f(h_x + 1, 2)*g(h_x + 1, 2)/h_x) + (-f(1, h_y + 2)*\
g(1, h_y + 2)/h_x + f(h_x + 1, h_y + 2)*g(h_x + 1, h_y + 2)/h_x)/h_y
"""
x0 = kwargs.get('x0', {})
x0 = x0 or {}
for d, fd, dim in zip(deriv_order, fd_order, dims):
expr = generic_derivative(expr, dim=dim, fd_order=fd, deriv_order=d, x0=x0)
expr = generic_derivative(expr, dim=dim, fd_order=fd, deriv_order=d, x0=x0,
**kwargs)

return expr

Expand Down Expand Up @@ -240,7 +241,7 @@ def make_derivative(expr, dim, fd_order, deriv_order, side, matvec, x0, symbolic
# Pure number
pass

deriv = IndexDerivative(expr*weights, weights.dimension)
deriv = IndexDerivative(expr*weights, {dim: indices.free_dim})
else:
terms = []
for i, c in zip(indices, weights):
Expand Down
Loading