compiler: Implement graceful lowering of derivatives (aka "unexpansion") #2060

FabioLuporini · 2023-02-06T11:15:59Z

With the option

Operator(..., {'expand': False})

derivatives are now synthesized as loops over FD coefficients rather than through unrolling. This required non-negligible additions and changes to the compiler, notably the enhancement of the so-called StencilDimensions, which until now were used in tests only, and the generalization of several compiler passes (Cluster fusion among them).
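For readers less familiar with the idea, here is a minimal sketch in plain Python (not Devito code) of the difference between an unrolled derivative and one synthesized as a loop over FD coefficients. The names `WEIGHTS`, `dx_expanded` and `dx_unexpanded` are made up for illustration; the weights are the standard second-order central ones for a first derivative on a unit-spaced grid.

```python
# Illustrative only: second-order central weights for d/dx, unit grid spacing.
# Keys are stencil offsets, values are the FD coefficients.
WEIGHTS = {-1: -0.5, 0: 0.0, 1: 0.5}


def dx_expanded(f, i):
    # Unrolled form: one explicit term per stencil point.
    return -0.5 * f[i - 1] + 0.0 * f[i] + 0.5 * f[i + 1]


def dx_unexpanded(f, i, weights=WEIGHTS):
    # Loop form: iterate over the stencil offsets and accumulate.
    acc = 0.0
    for offset, w in weights.items():
        acc += w * f[i + offset]
    return acc


f = [float(x * x) for x in range(6)]  # samples of x**2 at x = 0..5
assert dx_expanded(f, 2) == dx_unexpanded(f, 2)
```

Both forms compute the same value. In the loop form the size of the expression does not grow with the stencil width, which is what makes it attractive for high-order discretizations.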

The PR is equipped with lots of new tests.

However, the machinery introduced here is mainly exploited in PRO, where >100 tests stressing the new compilation engine have been added.

~~There still are 3 TODOs left here, which I count to address between this week and next, but they shouldn't change the PR content meaningfully~~ EDIT: Done now

This stuff is mandatory for Rice; hence we should aim to merge it ASAP. Apologies for the little pressure. I did my best to file it ASAP but, despite working on it full-time since December, it is only now that it has reached a decent stage.

This PR should also achieve more neatly what #2014 (@georgebisbas) tried to do.

FabioLuporini · 2023-02-06T11:17:30Z

devito/ir/clusters/cluster.py

@@ -42,12 +43,18 @@ def __init__(self, exprs, ispace=None, guards=None, properties=None, syncs=None)

 self._exprs = tuple(ClusterizedEq(e, ispace=ispace) for e in as_tuple(exprs))
 self._ispace = ispace
- self._guards = frozendict(guards or {})
+ self._guards = Guards(guards or {})


note for reviewers: to be exploited inside the compiler incrementally, so far only in a few places (and mostly in PRO)

codecov · 2023-02-06T11:25:22Z

Codecov Report

Merging #2060 (0453503) into master (17e6e6e) will increase coverage by 0.08%.
The diff coverage is 92.63%.

@@            Coverage Diff             @@
##           master    #2060      +/-   ##
==========================================
+ Coverage   87.69%   87.78%   +0.08%     
==========================================
  Files         223      224       +1     
  Lines       37909    38653     +744     
  Branches     5707     5816     +109     
==========================================
+ Hits        33246    33930     +684     
- Misses       4126     4173      +47     
- Partials      537      550      +13

| Impacted Files | Coverage Δ | |
|---|---|---|
| devito/passes/iet/languages/C.py | `100.00% <ø> (ø)` | |
| devito/passes/iet/languages/openacc.py | `65.54% <0.00%> (ø)` | |
| devito/arch/compiler.py | `42.57% <25.00%> (-0.32%)` | ⬇️ |
| devito/ir/support/guards.py | `55.85% <32.14%> (-8.43%)` | ⬇️ |
| devito/ir/support/properties.py | `74.25% <33.33%> (-4.16%)` | ⬇️ |
| devito/symbolics/extended_sympy.py | `95.89% <66.66%> (-0.26%)` | ⬇️ |
| devito/ir/support/space.py | `89.96% <76.92%> (-0.37%)` | ⬇️ |
| devito/tools/dtypes_lowering.py | `82.07% <81.81%> (-1.96%)` | ⬇️ |
| devito/passes/iet/definitions.py | `88.33% <84.37%> (+0.32%)` | ⬆️ |
| devito/tools/data_structures.py | `72.49% <86.66%> (+4.46%)` | ⬆️ |

... and 57 more

georgebisbas · 2023-02-06T12:06:24Z

devito/ir/iet/nodes.py

@@ -680,8 +681,9 @@ def __init__(self, name, body, retval, parameters=None, prefix=None):
 self.parameters = as_tuple(parameters)

 def __repr__(self):
+ param_types = [ctypes_to_cstr(i._C_ctype) for i in self.parameters]


will look, thanks

FabioLuporini · 2023-02-07T15:43:40Z

An important issue with this PR that I forgot to mention: code generation times do increase, by between 15% and 40% I'd say, though this is only noticeable with complex physics and high-order discretizations. The cost increase is due to more sophisticated, and in particular more accurate, data dependence analysis.

@ggorman and I discussed this at length, and in particular what to do about it. We propose to park it until after Rice, or at least until the stuff for Rice is ready, then sprint on it all together to improve the situation.

This time we want to use Numba

mloubout · 2023-02-07T15:45:12Z

Numba might be a good idea yeah, and for other parts of the compiler as well.

review-notebook-app · 2023-02-07T17:18:33Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

mloubout

First pass, some misc comments and questions; will have a proper look into FD.

mloubout · 2023-02-06T17:10:00Z

devito/finite_differences/derivative.py

@@ -104,7 +106,8 @@ def __new__(cls, expr, *dims, **kwargs):
 obj._deriv_order = orders if skip else DimensionTuple(*orders, getters=obj._dims)
 obj._side = kwargs.get("side")
 obj._transpose = kwargs.get("transpose", direct)
- obj._ppsubs = as_tuple(frozendict(i) for i in kwargs.get("subs", []))
+ obj._ppsubs = as_tuple(frozendict(i) for i in


What if there is both?

That'd be nonsensical?

And even then, it would be an issue in master as well since we never attempt to read from the ppsubs, no?

As far as I understand, ppsubs is just an internal attribute that stashes user-provided subs via .subs(...). The only reason I added it is because now Derivative reconstruction exploits (finally!) the Reconstructable infrastructure based on __rargs__ and __rkwargs__, and ppsubs is one such __rkwargs__

Then I would ditch subs kwarg, it's only used in one place (xreplace) so would replace it there by _ppsubs=subs and only have one

I think I tried that but something backfired, not sure I remember what now. I'll give it another try tomorrow morning
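To make the `__rargs__` / `__rkwargs__` point in this thread concrete, here is a rough sketch of how a reconstruction mechanism of that kind can work. It is not Devito's implementation: `MiniReconstructable`, `MiniDerivative` and the attribute lookup by plain name are assumptions made for illustration only.

```python
class MiniReconstructable:
    """Hypothetical stand-in for a Reconstructable-like base class."""

    __rargs__ = ()    # names of positional arguments needed to rebuild
    __rkwargs__ = ()  # names of keyword arguments needed to rebuild

    def _rebuild(self, *args, **kwargs):
        # Reuse the stored positional arguments unless new ones are supplied.
        if not args:
            args = tuple(getattr(self, name) for name in self.__rargs__)
        # Fill in every keyword argument the caller did not override.
        for name in self.__rkwargs__:
            kwargs.setdefault(name, getattr(self, name))
        return type(self)(*args, **kwargs)


class MiniDerivative(MiniReconstructable):
    __rargs__ = ('expr',)
    __rkwargs__ = ('side', 'subs')

    def __init__(self, expr, side=None, subs=()):
        self.expr = expr
        self.side = side
        self.subs = tuple(subs)


d0 = MiniDerivative('f(x)', side='left', subs=({'h': 1},))
d1 = d0._rebuild(side='right')  # `expr` and `subs` are carried over
```

With a scheme like this, any attribute that must survive reconstruction has to be listed among the keyword names, which is why stashed substitutions end up there.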

devito/finite_differences/differentiable.py

devito/operator/operator.py

devito/operator/profiling.py

.github/workflows/pytest-core-nompi.yml

mloubout · 2023-02-08T19:43:23Z

devito/passes/clusters/derivatives.py

+
+ # Transform e.g. `w[i0] -> w[i0 + 2]` for alignment with the
+ # StencilDimensions starting points
+ subs = {expr.weights: expr.weights.subs(d, d - d._min) for d in dims}


Wouldn't it be better to have the weights directly indexed w.r.t. the min?

I'm not sure I follow, can you elaborate?
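For context on the `w[i0] -> w[i0 + 2]` shift mentioned in the diff comment, here is a small plain-Python sketch. It assumes a stencil index `i0` running from -2 to 2 and a weights array stored with 0-based indexing; the names `I0_MIN`, `I0_MAX`, `weight` and `dx` are hypothetical, and the coefficients are the standard fourth-order central ones for a first derivative on a unit-spaced grid.

```python
I0_MIN, I0_MAX = -2, 2

# Weights stored 0-based: w[0] belongs to offset -2, w[4] to offset +2.
w = [1 / 12, -2 / 3, 0.0, 2 / 3, -1 / 12]


def weight(i0):
    # Shift by the stencil minimum so that i0 == I0_MIN maps to w[0],
    # i.e. `w[i0]` becomes `w[i0 + 2]` when I0_MIN == -2.
    return w[i0 - I0_MIN]


def dx(f, x):
    # Weighted sum over the stencil points around x.
    return sum(weight(i0) * f[x + i0] for i0 in range(I0_MIN, I0_MAX + 1))


f = [float(x ** 3) for x in range(7)]  # samples of x**3 at x = 0..6
approx = dx(f, 3)  # exact derivative of x**3 at x = 3 is 27
```

The alternative raised in the question would be to index the weights by the stencil offset directly, so that no shift is needed at the point of use.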

devito/symbolics/inspection.py

devito/tools/data_structures.py

devito/types/dimension.py

mloubout · 2023-02-08T20:19:30Z

devito/types/dimension.py

+ else:
+ return None
+
+ @call_highest_priority('__radd__')


Move these to IndexAccessFunction

can't, because they're returning AffineIndexAccessFunctions, which would be a subclass...

Yeah, but IndexAccessFunction now doesn't really make sense since it's in here. Using something like self.func and moving it to IndexAccessFunction would make more sense to me.

I get you now. Changing as per your suggestion

devito/passes/iet/definitions.py

devito/tools/dtypes_lowering.py

georgebisbas · 2023-02-09T13:22:49Z

devito/types/array.py

+ C/C++ sense. 'constant' and 'shared' mean that the Array represents an
+ object allocated in so called constant and shared memory, respectively,
+ which are typical of device architectures. If 'shared' is specified but
+ the underlying architecture does not something akin to shared memory, the


does not do? something akin

does not have? as below?

georgebisbas · 2023-02-09T13:49:35Z

tests/test_cinterface.py

@@ -27,11 +27,11 @@ def test_basic():
 assert 'include "%s.h"' % name in ccode

 # The public `struct dataobj` only appears in the header file
- assert str(f._C_typedecl) not in ccode
- assert str(f._C_typedecl) in hcode
+ assert 'struct dataobj\n{' not in ccode


so we cannot access this as field now?

well, simply not as a Function property, which was ugly

georgebisbas · 2023-02-09T13:49:49Z

tests/test_cinterface.py


 # Same with `struct profiler`
 timers = op.parameters[-1]
 assert isinstance(timers, Timer)
- assert str(timers._C_typedecl) not in ccode
- assert str(timers._C_typedecl) in hcode
+ assert 'struct profiler\n{' not in ccode


georgebisbas · 2023-02-09T13:58:12Z

tests/test_dse.py

@@ -2665,7 +2798,7 @@ def test_fullopt(self):
 bns, _ = assert_blocking(op1, {'x0_blk0'}) # due to loop blocking

 assert summary0[('section0', None)].ops == 50
- assert summary0[('section1', None)].ops == 140
+ assert summary0[('section1', None)].ops == 148


Expected?

yes, it was miscounted before IIRC

georgebisbas · 2023-02-10T09:51:58Z

devito/finite_differences/differentiable.py

+
+ # Sanity check
+ if not (expr.is_Mul and len(weightss) == 1):
+ raise ValueError("Expect `expr*weights`, got `%s` instead" % str(expr))


Expected (?)

devito/passes/iet/definitions.py

georgebisbas · 2023-02-10T10:29:33Z

devito/tools/dtypes_lowering.py

+ elif len(dtypes) == 1:
+ return dtypes.pop()
+ else:
+ # E.g., mixed integer arithmetic


lift/rephrase this case in func docstring?

mloubout

Some minor comments but looks good to me

mloubout · 2023-02-10T16:37:43Z

devito/finite_differences/differentiable.py

@@ -579,47 +599,92 @@ def __init_finalize__(self, *args, **kwargs):
 assert isinstance(d, StencilDimension) and d.symbolic_size == len(weights)
 assert isinstance(weights, (list, tuple, np.ndarray))

- kwargs['scope'] = 'static'
+ try:
+ self._spacings = set().union(*[i.find(Spacing) for i in weights])


why not just set(i.find(Spacing) for i in weights) ?
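One possible reason, sketched below in plain Python: assuming each `i.find(Spacing)` call returns a set (as SymPy's `Basic.find` does by default), passing those sets to `set(...)` would try to build a set of sets, which fails because sets are unhashable. `find_spacings` and the tuple-based `weights` are hypothetical stand-ins for the real objects.

```python
def find_spacings(weight):
    # Hypothetical stand-in for `weight.find(Spacing)`: returns a set.
    return {s for s in weight if s.startswith('h_')}


weights = [('h_x', 'a'), ('h_x', 'h_y'), ('b',)]

# Flattens the per-weight sets into a single set of spacings.
flat = set().union(*[find_spacings(w) for w in weights])

# Building a set whose elements are sets raises TypeError.
try:
    nested = set(find_spacings(w) for w in weights)
    error = None
except TypeError as exc:
    error = str(exc)
```

The `set().union(*...)` form also copes with an empty list of weights, in which case it simply returns an empty set.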

mloubout · 2023-02-10T16:46:43Z

devito/symbolics/inspection.py

+
+@_estimate_cost.register(Derivative)
+def _(expr, estimate):
+ return _estimate_cost(expr._evaluate(expand=False), estimate)


Might be good to have the Derivative know its own cost without evaluation, which can be expensive for large expressions (and which will be re-evaluated). In theory it should always be expr.fd_order * 2 * _estimate_cost(expr.expr), and it would be correct with and without expand for free.

Not entirely sure about that? Doesn't it also depend on deriv_order, on whether it's left/right/center, or perhaps even shifted, etc.?

Anyway, in practice you never run estimate_cost on unlowered expressions, hence you never actually hit this handler...

mloubout · 2023-02-10T16:59:06Z

tests/test_derivatives.py

+ f = TimeFunction(name='f', grid=grid, space_order=4)
+
+ term1 = f.dxdy._evaluate(expand=False)
+ assert len(term1.find(IndexDerivative)) == 2


assert depth==1?

you mean ==2?

mloubout · 2023-02-10T17:00:26Z

tests/test_dimension.py

+ assert expr.sd is sd
+ assert expr.ofs == 1 + s
+
+ def test_sub(self):


yes, instead of classic add, a sub

FabioLuporini added the compiler label Feb 6, 2023

FabioLuporini requested review from ggorman, mloubout and georgebisbas February 6, 2023 11:16

FabioLuporini commented Feb 6, 2023

georgebisbas reviewed Feb 6, 2023

mloubout changed the title ~~Implement graceful lowering of derivatives (aka "unexpansion")~~ compiler: Implement graceful lowering of derivatives (aka "unexpansion") Feb 6, 2023

FabioLuporini added 20 commits February 7, 2023 15:48

compiler: Prototype unexpansion

f799a72

compiler: Revamp code generation from _C_ctype

ffc8c21

compiler: Support trivial unexpanded-derivatives examples

91eee45

dsl: Patch cross_derivative evaluation

0eddd5d

dsl: Introduce Spacing subclass

63f41e8

compiler: Patch StencilDimension reconstruction

a94ab05

compiler: Extend unexpansion machinery

ce7c96c

compiler: Support StencilDimension in estimate_cost

4d63f17

compiler: Enhance fusion upon lower_index_derivative

380d262

compiler: Rework is_cross rule for Cluster fusion

45b5fa4

compiler: Implement maximal fusion for lowered IndexDerivatives

8805cd9

compiler: Patch profiling in presence of StencilDimensions

573ba59

compiler: Improve IndexDerivative lowering to catch duplicates

fda18ef

compiler: Enhance pow_to_mul to work around SymPy misbehavior

be02291

compiler: Rework globals generation for device backends

c069dfe

compiler: Rework weights generation for device backends

573b4cb

compiler: Patch index mode detection with StencilDimensions

eb8616c

compiler: Add IndexDerivative.mapper

0044b3b

compiler: Patch lower_index_derivative

fb6c226

tests: Patch draft flaky unexpansion test

9a6679a

FabioLuporini added 8 commits February 7, 2023 15:48

compiler: Introduce AffineIndexAccessFunction

c562c30

compiler: Improve IndexDerivative

43e700f

compiler: Enhance dtype retrieval

a588b6d

compiler: Tidy up Interval.expand()

18bc963

compiler: Drop has_free for compatibility with older SymPy versions

2640df3

examples: Update expected notebook output

5113825

compiler: Patch codegen upon pow_to_mul

0297597

misc: Postpone codegen speed improvement

2c5eb4d

FabioLuporini force-pushed the unexpansion-final branch from b528c17 to 2c5eb4d Compare February 7, 2023 17:18

FabioLuporini added 6 commits February 8, 2023 09:44

examples: Update expected output

403a194

examples: Disable openmp where necessary due to issue 2061

62e0fc8

compiler: Exploit SubDim.local to support nasty deps in examples

0c801bd

ci: Drop support for gcc5, sympy1.7, sympy1.8

a97f6e6

compiler: Add IndexDerivative.total_order

77eae11

arch: Enable openmp with nvc on CPU

9af12f4

mloubout reviewed Feb 8, 2023

FabioLuporini added 2 commits February 9, 2023 10:15

ci: Add back forgotten gcc-11

f33fcfc

compiler: Tweak lower_index_derivatives

88fafce

georgebisbas reviewed Feb 9, 2023

FabioLuporini added 4 commits February 10, 2023 08:53

compiler: IndexDerivative.total_order -> depth

dde53cc

misc: Tweak docstring

2ce2797

compiler: Lift overrides from AffineIndexAccessFunc into IndexAccessFunc

95b482c

arch: Add amdclang mapping

0453503

georgebisbas reviewed Feb 10, 2023

mloubout approved these changes Feb 10, 2023

FabioLuporini merged commit 96e618a into master Feb 13, 2023

FabioLuporini deleted the unexpansion-final branch February 13, 2023 08:13

FabioLuporini mentioned this pull request Feb 13, 2023

compiler: restrict qualifier as property #2014

Closed
