compiler: Implement graceful lowering of derivatives (aka "unexpansion") #2060

Merged
merged 61 commits into from
Feb 13, 2023
f799a72
compiler: Prototype unexpansion
FabioLuporini Nov 24, 2022
ffc8c21
compiler: Revamp code generation from _C_ctype
FabioLuporini Nov 25, 2022
91eee45
compiler: Support trivial unexpanded-derivatives examples
FabioLuporini Nov 26, 2022
0eddd5d
dsl: Patch cross_derivative evaluation
FabioLuporini Nov 26, 2022
63f41e8
dsl: Introduce Spacing subclass
FabioLuporini Nov 28, 2022
a94ab05
compiler: Patch StencilDimension reconstruction
FabioLuporini Nov 28, 2022
ce7c96c
compiler: Extend unexpansion machinery
FabioLuporini Nov 26, 2022
4d63f17
compiler: Support StencilDimension in estimate_cost
FabioLuporini Dec 3, 2022
380d262
compiler: Enhance fusion upon lower_index_derivative
FabioLuporini Dec 3, 2022
45b5fa4
compiler: Rework is_cross rule for Cluster fusion
FabioLuporini Dec 10, 2022
8805cd9
compiler: Implement maximal fusion for lowered IndexDerivatives
FabioLuporini Dec 12, 2022
573ba59
compiler: Patch profiling in presence of StencilDimensions
FabioLuporini Dec 12, 2022
fda18ef
compiler: Improve IndexDerivative lowering to catch duplicates
FabioLuporini Dec 12, 2022
be02291
compiler: Enhance pow_to_mul to work around SymPy misbehavior
FabioLuporini Dec 13, 2022
c069dfe
compiler: Rework globals generation for device backends
FabioLuporini Dec 13, 2022
573b4cb
compiler: Rework weights generation for device backends
FabioLuporini Dec 13, 2022
eb8616c
compiler: Patch index mode detection with StencilDimensions
FabioLuporini Dec 15, 2022
0044b3b
compiler: Add IndexDerivative.mapper
FabioLuporini Dec 22, 2022
fb6c226
compiler: Patch lower_index_derivative
FabioLuporini Dec 23, 2022
9a6679a
tests: Patch draft flaky unexpansion test
FabioLuporini Dec 23, 2022
958c9a0
compiler: Patch lower_index_derivatives
FabioLuporini Dec 23, 2022
3f73098
compiler: Patch globs codegen for deterministic output
FabioLuporini Dec 27, 2022
8d28be1
compiler: Relax Properties manipulation methods
FabioLuporini Dec 27, 2022
3eecf4d
compiler: Change IndexDerivative.mapper
FabioLuporini Dec 30, 2022
9a843ba
compiler: Add IterationSpace.translate
FabioLuporini Jan 3, 2023
52e076a
compiler: Move IndexSum.mapper to IndexDerivative.mapper
FabioLuporini Jan 3, 2023
a993d0d
compiler: Patch IndexDerivative.mapper
FabioLuporini Jan 5, 2023
417dac4
compiler: Relax WAR dependencies involving shared Array
FabioLuporini Jan 9, 2023
9c6ac52
compiler: Maximize likelihood of fusing clusters over shm
FabioLuporini Jan 9, 2023
42627b1
compiler: Improve data dependence analysis
FabioLuporini Jan 10, 2023
fead97c
compiler: Add Jump mixin class
FabioLuporini Jan 11, 2023
19f6af3
compiler: Patch collect_derivative pass
FabioLuporini Jan 13, 2023
17ab99b
compiler: Add shm-related heuristics to Cluster fusion
FabioLuporini Jan 20, 2023
c2044d5
compiler: Add Properties methods
FabioLuporini Jan 21, 2023
60e2e03
compiler: Make IndexDerivatives comparable; fix their CSE
FabioLuporini Jan 25, 2023
e5ea4d6
compiler: Draft Guards, akin to Properties
FabioLuporini Jan 31, 2023
9d579b1
compiler: Rework customization of clusters visitors
FabioLuporini Jan 31, 2023
fcc2e86
compiler: Fix Cluster properties normalization at init
FabioLuporini Jan 31, 2023
50025f5
compiler: Extend uxreplace to substitute types as well
FabioLuporini Feb 3, 2023
3496221
compiler: Fixup linearization with isolated routines
FabioLuporini Feb 3, 2023
fcfa448
misc: Fixup pep8 violations
FabioLuporini Feb 4, 2023
c562c30
compiler: Introduce AffineIndexAccessFunction
FabioLuporini Feb 6, 2023
43e700f
compiler: Improve IndexDerivative
FabioLuporini Feb 6, 2023
a588b6d
compiler: Enhance dtype retrieval
FabioLuporini Feb 6, 2023
18bc963
compiler: Tidy up Interval.expand()
FabioLuporini Feb 6, 2023
2640df3
compiler: Drop has_free for compatibility with older SymPy versions
FabioLuporini Feb 7, 2023
5113825
examples: Update expected notebook output
FabioLuporini Feb 7, 2023
0297597
compiler: Patch codegen upon pow_to_mul
FabioLuporini Feb 7, 2023
2c5eb4d
misc: Postpone codegen speed improvement
FabioLuporini Feb 7, 2023
403a194
examples: Update expected output
FabioLuporini Feb 8, 2023
62e0fc8
examples: Disable openmp where necessary due to issue 2061
FabioLuporini Feb 8, 2023
0c801bd
compiler: Exploit SubDim.local to support nasty deps in examples
FabioLuporini Feb 8, 2023
a97f6e6
ci: Drop support for gcc5, sympy1.7, sympy1.8
FabioLuporini Feb 8, 2023
77eae11
compiler: Add IndexDerivative.total_order
FabioLuporini Feb 8, 2023
9af12f4
arch: Enable openmp with nvc on CPU
FabioLuporini Feb 8, 2023
f33fcfc
ci: Add back forgotten gcc-11
FabioLuporini Feb 9, 2023
88fafce
compiler: Tweak lower_index_derivatives
FabioLuporini Feb 9, 2023
dde53cc
compiler: IndexDerivative.total_order -> depth
FabioLuporini Feb 10, 2023
2ce2797
misc: Tweak docstring
FabioLuporini Feb 10, 2023
95b482c
compiler: Lift overrides from AffineIndexAccessFunc into IndexAccessFunc
FabioLuporini Feb 10, 2023
0453503
arch: Add amdclang mapping
FabioLuporini Feb 10, 2023
compiler: Enhance fusion upon lower_index_derivative
FabioLuporini committed Feb 7, 2023
commit 380d26266426dcd2d624cf8aee06c87d85ed2429
5 changes: 4 additions & 1 deletion devito/finite_differences/differentiable.py
@@ -579,7 +579,10 @@ def __init_finalize__(self, *args, **kwargs):
         assert isinstance(d, StencilDimension) and d.symbolic_size == len(weights)
         assert isinstance(weights, (list, tuple, np.ndarray))

-        self._spacings = set().union(*[i.find(Spacing) for i in weights])
+        try:
+            self._spacings = set().union(*[i.find(Spacing) for i in weights])

Contributor:
why not just set(i.find(Spacing) for i in weights) ?

+        except AttributeError:
+            self._spacings = set()

         kwargs['scope'] = 'constant'

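On the review question above: a minimal sketch of why the two spellings differ, assuming `find` returns a Python `set` (as SymPy's `Basic.find` does). The string values below are hypothetical stand-ins for `Spacing` symbols, not Devito objects.

```python
# Stand-ins for the per-weight results of `find`; plain sets of strings are
# used purely for illustration (hypothetical values).
found = [{"h_x"}, {"h_x", "h_y"}, set()]

# Flattening: one set holding every element found in any weight.
flat = set().union(*found)

# Building a set *of* sets fails, because sets are unhashable.
try:
    nested = set(s for s in found)
    error = None
except TypeError as exc:
    nested = None
    error = str(exc)

print(sorted(flat))
print(error)
```

So `set().union(*...)` yields a flat collection of spacings, whereas `set(<generator of sets>)` raises before any result is produced.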
9 changes: 7 additions & 2 deletions devito/ir/clusters/cluster.py
@@ -7,7 +7,8 @@
 from devito.ir.support import (PARALLEL, PARALLEL_IF_PVT, BaseGuardBoundNext,
                                Forward, Interval, IntervalGroup, IterationSpace,
                                DataSpace, Properties, Scope, detect_accesses,
-                               detect_io, normalize_properties, normalize_syncs)
+                               detect_io, normalize_properties, normalize_syncs,
+                               sdims_min, sdims_max)
 from devito.symbolics import estimate_cost
 from devito.tools import as_tuple, flatten, frozendict

@@ -257,7 +258,11 @@ def dspace(self):
             if f is None:
                 continue

-            intervals = [Interval(d, min(offs), max(offs)) for d, offs in v.items()]
+            #TODO OOOOOOOo
+            intervals = [Interval(d,
+                                  min([sdims_min(i) for i in offs]),
+                                  max([sdims_max(i) for i in offs]))
+                         for d, offs in v.items()]
             intervals = IntervalGroup(intervals)

             # Factor in the IterationSpace -- if the min/max points aren't zero,

7 changes: 6 additions & 1 deletion devito/ir/iet/visitors.py
@@ -304,11 +304,16 @@ def _blankline_logic(self, children):
             g = list(group)

             if k in (ExpressionBundle, Section) and len(g) >= 2:
-                # Separate consecutive Sections/ExpressionBundles with BlankLine
+                # Separate consecutive Sections/ExpressionBundles with
+                # BlankLine
                 for i in g[:-1]:
                     rebuilt.append(i)
                     rebuilt.append(BlankLine)
                 rebuilt.append(g[-1])
+            elif (k is Iteration and
+                  prev is ExpressionBundle and
+                  all(i.dim.is_Stencil for i in g)):
+                rebuilt.extend(g)
             elif prev in candidates and k in candidates:
                 rebuilt.append(BlankLine)
                 rebuilt.extend(g)

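The new `elif` branch keeps stencil loops attached to the expressions that precede them instead of separating them with a blank line. A toy sketch of that grouping logic follows; the tuple-based node model, the `BLANK` marker, and the final `else` branch are assumptions made for illustration and are not the IET classes used by the real visitor.

```python
from itertools import groupby

BLANK = "<blank>"  # hypothetical stand-in for a BlankLine node


def blankline_logic(children):
    """children: list of (kind, label, is_stencil) tuples (toy node model)."""
    candidates = ("Expr", "Loop")
    rebuilt, prev = [], None
    for kind, group in groupby(children, key=lambda c: c[0]):
        g = list(group)
        if kind == "Expr" and len(g) >= 2:
            # Separate consecutive expression groups with a blank line
            for item in g[:-1]:
                rebuilt.extend([item, BLANK])
            rebuilt.append(g[-1])
        elif kind == "Loop" and prev == "Expr" and all(c[2] for c in g):
            # Stencil loops stay glued to the preceding expressions
            rebuilt.extend(g)
        elif prev in candidates and kind in candidates:
            rebuilt.append(BLANK)
            rebuilt.extend(g)
        else:
            rebuilt.extend(g)
        prev = kind
    return rebuilt


children = [("Expr", "r0 = 0", False), ("Loop", "for i0", True),
            ("Expr", "u = r0", False), ("Loop", "for x", False)]
result = blankline_logic(children)
```

In this toy run the `for i0` loop follows `r0 = 0` directly, while the non-stencil `for x` loop is preceded by a blank marker.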
10 changes: 8 additions & 2 deletions devito/ir/support/utils.py
@@ -286,7 +286,10 @@ def sdims_min(expr):
     """
     Replace all StencilDimensions in `expr` with their minimum point.
     """
-    sdims = expr.find(StencilDimension)
+    try:
+        sdims = expr.find(StencilDimension)
+    except AttributeError:
+        return expr
     mapper = {e: e._min for e in sdims}
     return expr.subs(mapper)

@@ -295,7 +298,10 @@ def sdims_max(expr):
     """
     Replace all StencilDimensions in `expr` with their maximum point.
     """
-    sdims = expr.find(StencilDimension)
+    try:
+        sdims = expr.find(StencilDimension)
+    except AttributeError:
+        return expr
     mapper = {e: e._max for e in sdims}
     return expr.subs(mapper)

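The `try`/`except AttributeError` fallback lets `sdims_min` and `sdims_max` accept plain numbers as well as symbolic offsets, which is what the `dspace` change in `cluster.py` relies on when it takes `min`/`max` over mixed offsets. A self-contained sketch of the pattern, using hypothetical `SDim` and `Shifted` classes in place of Devito's `StencilDimension` and SymPy expressions:

```python
class SDim:
    """Hypothetical stand-in for a StencilDimension spanning [_min, _max]."""
    def __init__(self, name, lo, hi):
        self.name, self._min, self._max = name, lo, hi


class Shifted:
    """Toy symbolic offset `base + sdim`, exposing `find` and `subs`."""
    def __init__(self, base, sdim):
        self.base, self.sdim = base, sdim

    def find(self, cls):
        return {self.sdim} if isinstance(self.sdim, cls) else set()

    def subs(self, mapper):
        return self.base + mapper.get(self.sdim, 0)


def sdims_min(expr):
    try:
        sdims = expr.find(SDim)
    except AttributeError:
        return expr  # plain numbers have no `find`
    return expr.subs({d: d._min for d in sdims})


def sdims_max(expr):
    try:
        sdims = expr.find(SDim)
    except AttributeError:
        return expr
    return expr.subs({d: d._max for d in sdims})


i0 = SDim("i0", -2, 2)
offs = [0, 1, Shifted(0, i0)]          # mixed numeric and symbolic offsets
lower = min(sdims_min(o) for o in offs)
upper = max(sdims_max(o) for o in offs)
```

With these toy values the interval bounds come out as `lower == -2` and `upper == 2`, i.e. the stencil index is collapsed to its extreme points before comparison.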
21 changes: 15 additions & 6 deletions devito/passes/clusters/derivatives.py
@@ -1,21 +1,30 @@
 from devito.finite_differences import IndexDerivative
 from devito.ir import Cluster, Interval, IntervalGroup, IterationSpace
-from devito.symbolics import retrieve_dimensions, q_leaf, uxreplace
+from devito.passes.clusters.misc import fuse
+from devito.symbolics import (retrieve_dimensions, reuse_if_untouched, q_leaf,
+                              uxreplace)
 from devito.tools import as_tuple, filter_ordered, timed_pass
 from devito.types import Eq, Inc, Spacing, StencilDimension, Symbol

 __all__ = ['lower_index_derivatives']


 @timed_pass()
-def lower_index_derivatives(clusters, sregistry=None, **kwargs):
+def lower_index_derivatives(clusters, mode=None, **kwargs):
+    clusters = _lower_index_derivatives(clusters, **kwargs)
+    if mode != 'noop':
+        clusters = fuse(clusters, toposort=True)
+
+    return clusters
+
+def _lower_index_derivatives(clusters, sregistry=None, **kwargs):
     processed = []
     weights = {}
     for c in clusters:

         exprs = []
         for e in c.exprs:
-            expr, v = _lower_index_derivatives(e, c, weights, sregistry)
+            expr, v = _lower_index_derivatives_core(e, c, weights, sregistry)
             exprs.append(expr)
             processed.extend(v)

Contributor:
move in if ?

Contributor Author:
no, that would be wrong


@@ -24,7 +33,7 @@ def lower_index_derivatives(clusters, sregistry=None, **kwargs):
     return processed


-def _lower_index_derivatives(expr, c, weights, sregistry):
+def _lower_index_derivatives_core(expr, c, weights, sregistry):
     """
     Recursively carry out the core of `lower_index_derivatives`.
     """
@@ -34,11 +43,11 @@ def _lower_index_derivatives(expr, c, weights, sregistry):
     args = []
     processed = []
     for a in expr.args:
-        e, clusters = _lower_index_derivatives(a, c, weights, sregistry)
+        e, clusters = _lower_index_derivatives_core(a, c, weights, sregistry)
         args.append(e)
         processed.extend(clusters)

-    expr = expr.func(*args)
+    expr = reuse_if_untouched(expr, args)

     if not isinstance(expr, IndexDerivative):
         return expr, processed

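The last hunk swaps an unconditional rebuild (`expr.func(*args)`) for `reuse_if_untouched(expr, args)`. The body of that helper is not part of this diff; the sketch below assumes it returns the original object when none of the arguments were replaced and rebuilds otherwise. `Node` and `reuse_if_untouched_sketch` are hypothetical names used only for illustration.

```python
class Node:
    """Minimal expression node (hypothetical, for illustration only)."""
    def __init__(self, *args):
        self.args = args

    @property
    def func(self):
        return type(self)


def reuse_if_untouched_sketch(expr, args):
    """Return `expr` itself when no argument was replaced, else rebuild."""
    untouched = (len(args) == len(expr.args) and
                 all(a is b for a, b in zip(expr.args, args)))
    if untouched:
        return expr
    return expr.func(*args)


leaf_a, leaf_b = Node(), Node()
tree = Node(leaf_a, leaf_b)

same = reuse_if_untouched_sketch(tree, [leaf_a, leaf_b])   # nothing changed
rebuilt = reuse_if_untouched_sketch(tree, [leaf_a, Node()])  # one arg replaced
```

Under that assumption, a recursive traversal avoids reconstructing subtrees that contain no `IndexDerivative`, which keeps object identity stable for untouched expressions.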
2 changes: 1 addition & 1 deletion tests/test_derivatives.py
@@ -674,7 +674,7 @@ def test_index_derivative_like(self):
         u = Function(name="u", grid=grid, space_order=2)

         ui = u.subs(x, x + i*x.spacing)
-        w = Weights(name='w0', dimensions=i, weights=[-0.5, 0, 0.5])
+        w = Weights(name='w0', dimensions=i, initvalue=[-0.5, 0, 0.5])

         idxder = IndexDerivative(ui*w, w.dimension)

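For readers unfamiliar with the construct in this test: the expression `ui*w` summed over the stencil index `i` is a weighted sum of shifted function values. A plain-Python sketch of that sum with the same weights `[-0.5, 0, 0.5]` follows. The function name `index_derivative` and the explicit division by the spacing `h` are assumptions added here so that the result is a recognisable derivative; the test itself uses the raw weights.

```python
import math


def index_derivative(f, x, h, weights):
    """Weighted sum over stencil offsets: sum_i w[i] * f(x + i*h), over h."""
    radius = len(weights) // 2
    total = 0.0
    for k, w in enumerate(weights):
        i = k - radius  # stencil index, here -1, 0, 1
        total += w * f(x + i * h)
    return total / h


# Centred difference of f(x) = x**2 at x = 3, whose exact derivative is 6
approx = index_derivative(lambda x: x * x, 3.0, 0.1, [-0.5, 0, 0.5])
is_close = math.isclose(approx, 6.0, rel_tol=1e-9)
```

The loop over `i` plays the role of the `StencilDimension`; keeping it as a loop rather than writing out the three terms is what the PR calls an unexpanded derivative.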
86 changes: 60 additions & 26 deletions tests/test_dse.py
@@ -1488,6 +1488,30 @@ def test_space_invariant_v3(self):
         self.check_array(arrays[2], ((0, 0), (0, 0)), (xs, ys))

     def test_space_invariant_v4(self):
+        """
+        Similar to test_space_invariant, stems from viscoacoustic -- a portion
+        of a space derivative that would be redundantly computed in two separated
+        loop nests is recognised to be a time invariant and factored into a common
+        temporary.
+        """
+        grid = Grid(shape=(10, 10, 10))
+
+        f = Function(name='f', grid=grid)
+        u = TimeFunction(name='u', grid=grid)
+        v = TimeFunction(name='v', grid=grid)
+
+        eqns = [Eq(u.forward, (u*cos(f)).dx + v),
+                Eq(v.forward, (v*cos(f)).dy + u.forward.dx)]
+
+        op = Operator(eqns)
+
+        xs, ys, zs = self.get_params(op, 'x_size', 'y_size', 'z_size')
+        arrays = self.get_arrays(op)
+        assert len(arrays) == 1
+        self.check_array(arrays[0], ((1, 0), (1, 0), (0, 0)), (xs+1, ys+1, zs))
+        assert op._profiler._sections['section1'].sops == 15
+
+    def test_unexpanded_v0(self):
         """
         Without prematurely expanding derivatives.
         """
@@ -1514,33 +1538,9 @@ def test_space_invariant_v4(self):

         assert np.allclose(u.data, u1.data, rtol=10e-6)

-    def test_space_invariant_v5(self):
-        """
-        Similar to test_space_invariant, stems from viscoacoustic -- a portion
-        of a space derivative that would be redundantly computed in two separated
-        loop nests is recognised to be a time invariant and factored into a common
-        temporary.
+    def test_unexpanded_v1(self):
         """
-        grid = Grid(shape=(10, 10, 10))
-
-        f = Function(name='f', grid=grid, space_order=4)
-        u = TimeFunction(name='u', grid=grid, space_order=4)
-        v = TimeFunction(name='v', grid=grid, space_order=4)
-
-        eqns = [Eq(u.forward, (u*cos(f)).dx + v),
-                Eq(v.forward, (v*cos(f)).dy + u.forward.dx)]
-
-        op = Operator(eqns)
-
-        xs, ys, zs = self.get_params(op, 'x_size', 'y_size', 'z_size')
-        arrays = self.get_arrays(op)
-        assert len(arrays) == 1
-        self.check_array(arrays[0], ((2, 2), (2, 2), (0, 0)), (xs+4, ys+4, zs))
-        assert op._profiler._sections['section1'].sops == 34
-
-    def test_space_invariant_v6(self):
-        """
-        Like test_space_invariant_v5, but now try expanded vs unexpanded
+        Inspired by test_space_invariant_v5, but now try with unexpanded
         derivatives.
         """
         grid = Grid(shape=(10, 10, 10))
@@ -1572,6 +1572,26 @@ def test_space_invariant_v6(self):
         assert np.allclose(u.data, u1.data, rtol=10e-5)
         assert np.allclose(v.data, v1.data, rtol=10e-5)

+    def test_unexpanded_v2(self):
+        grid = Grid(shape=(10, 10, 10))
+
+        u = TimeFunction(name='u', grid=grid, space_order=4)
+        v = TimeFunction(name='v', grid=grid, space_order=4)
+        u1 = TimeFunction(name='u', grid=grid, space_order=4)
+        v1 = TimeFunction(name='v', grid=grid, space_order=4)
+
+        eqns = [Eq(u.forward, (u.dx.dy + v*u.dx + 1.)),
+                Eq(v.forward, (v.dy.dx + u.dx.dz + 1.))]
+
+        op0 = Operator(eqns)
+        op1 = Operator(eqns, opt=('advanced', {'expand': False}))
+
+        op0.apply(time_M=5)
+        op1.apply(time_M=5, u=u1, v=v1)
+
+        assert np.allclose(u.data, u1.data, rtol=10e-3)
+        assert np.allclose(v.data, v1.data, rtol=10e-3)
+
     def test_catch_duplicate_from_different_clusters(self):
         """
         Check that the compiler is able to detect redundant aliases when these
@@ -2693,6 +2713,20 @@ def test_premature_evalderiv_lowering(self):
         assert len([i for i in FindSymbols().visit(op) if i.is_Array]) == 1
         assert op._profiler._sections['section0'].sops == 16

+    def test_fusion_after_unexpansion(self):
+        grid = Grid(shape=(10, 10, 10))
+
+        u = TimeFunction(name='u', grid=grid, space_order=4)
+
+        eqn = Eq(u.forward, u.dx + u.dy)
+
+        op = Operator(eqn, opt=('advanced', {'expand': False}))
+        print(op)
+
+        #TODO -- FIX THE OPERATION COUNT !!!!!!!!!!!!!!!!!
+        #assert op._profiler._sections['section0'].sops == 34
+        assert_structure(op, ['t,x,y,z', 't,x,y,z,i0'], 't,x,y,z,i0')
+

 class TestIsoAcoustic(object):

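`test_fusion_after_unexpansion` expects `u.dx + u.dy` to end up in a single loop over the stencil index `i0` rather than in two separate ones. The sketch below shows, on a small 2D list, that accumulating both derivatives in one `i0` loop gives the same value as the fully written-out stencil. It uses the standard fourth-order centred first-derivative coefficients with unit spacing, which is an assumption about what `space_order=4` produces here; it does not reproduce Devito's generated code.

```python
import math

# Fourth-order centred first-derivative weights for offsets -2..2
# (standard finite-difference coefficients; the spacing h is taken as 1).
W = [1/12, -2/3, 0.0, 2/3, -1/12]


def expanded(u, x, y):
    """u.dx + u.dy with every stencil term written out."""
    dx = (u[x-2][y] - 8*u[x-1][y] + 8*u[x+1][y] - u[x+2][y]) / 12
    dy = (u[x][y-2] - 8*u[x][y-1] + 8*u[x][y+1] - u[x][y+2]) / 12
    return dx + dy


def unexpanded(u, x, y):
    """Same quantity, accumulated in one fused loop over the index i0."""
    acc = 0.0
    for k, w in enumerate(W):
        i0 = k - 2
        acc += w * (u[x+i0][y] + u[x][y+i0])
    return acc


n = 8
u = [[float(i*i + 3*j) for j in range(n)] for i in range(n)]  # u = x^2 + 3y

agree = math.isclose(expanded(u, 4, 4), unexpanded(u, 4, 4))
```

For this polynomial field the stencil is exact, so both functions return the analytic value `2*x + 3 = 11` at `(4, 4)` up to rounding.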