Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compiler: Implement graceful lowering of derivatives (aka "unexpansion") #2060

Merged
merged 61 commits into from
Feb 13, 2023

Conversation

FabioLuporini
Copy link
Contributor

@FabioLuporini FabioLuporini commented Feb 6, 2023

With the option

Operation(..., {'expand': False})

derivatives are now synthesized as loops over FD coefficients, rather than through unrolling. This required non-negligible additions and changes to the compiler, among which the enhancement of so called StencilDimensions, which were until now utilized in tests only and the generalization of several compiler passes (among which Cluster fusion).

The PR is equipped with lots of new tests.

However, the machinery introduced here is mainly exploited in PRO, where >100 tests stressing the new compilation engine have been added.

There still are 3 TODOs left here, which I count to address between this week and next, but they shouldn't change the PR content meaningfully EDIT: Done now

This stuff is mandatory for Rice; hence we should aim to merge it ASAP. Apologies for the little pressure, I did my best to file it ASAP, but despite working on it full-time since December, it is only now that it's reached a decent stage)

This PR should also more neatly achieve what #2014 (@georgebisbas ) has tried to do

@@ -42,12 +43,18 @@ def __init__(self, exprs, ispace=None, guards=None, properties=None, syncs=None)

self._exprs = tuple(ClusterizedEq(e, ispace=ispace) for e in as_tuple(exprs))
self._ispace = ispace
self._guards = frozendict(guards or {})
self._guards = Guards(guards or {})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note for reviewers: to be exploited inside the compiler incrementally, so far only in a few places (and mostly in PRO)

@codecov
Copy link

codecov bot commented Feb 6, 2023

Codecov Report

Merging #2060 (0453503) into master (17e6e6e) will increase coverage by 0.08%.
The diff coverage is 92.63%.

@@            Coverage Diff             @@
##           master    #2060      +/-   ##
==========================================
+ Coverage   87.69%   87.78%   +0.08%     
==========================================
  Files         223      224       +1     
  Lines       37909    38653     +744     
  Branches     5707     5816     +109     
==========================================
+ Hits        33246    33930     +684     
- Misses       4126     4173      +47     
- Partials      537      550      +13     
Impacted Files Coverage Δ
devito/passes/iet/languages/C.py 100.00% <ø> (ø)
devito/passes/iet/languages/openacc.py 65.54% <0.00%> (ø)
devito/arch/compiler.py 42.57% <25.00%> (-0.32%) ⬇️
devito/ir/support/guards.py 55.85% <32.14%> (-8.43%) ⬇️
devito/ir/support/properties.py 74.25% <33.33%> (-4.16%) ⬇️
devito/symbolics/extended_sympy.py 95.89% <66.66%> (-0.26%) ⬇️
devito/ir/support/space.py 89.96% <76.92%> (-0.37%) ⬇️
devito/tools/dtypes_lowering.py 82.07% <81.81%> (-1.96%) ⬇️
devito/passes/iet/definitions.py 88.33% <84.37%> (+0.32%) ⬆️
devito/tools/data_structures.py 72.49% <86.66%> (+4.46%) ⬆️
... and 57 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@@ -680,8 +681,9 @@ def __init__(self, name, body, retval, parameters=None, prefix=None):
self.parameters = as_tuple(parameters)

def __repr__(self):
param_types = [ctypes_to_cstr(i._C_ctype) for i in self.parameters]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will look, thanks

@mloubout mloubout changed the title Implement graceful lowering of derivatives (aka "unexpansion") compiler: Implement graceful lowering of derivatives (aka "unexpansion") Feb 6, 2023
@FabioLuporini
Copy link
Contributor Author

An important issue of this PR I forgot to mention. Code generation times do increase. Between 15% and 40% I'd say, but only noticeable with complex physics and high order discretizations. Cost increase is due to more sophisticated and in particular more accurate data dependence analysis.

@ggorman and I discussed at length about this and in particular what to do. We propose to park it until after Rice, or at least until the stuff for Rice is ready, then sprint on it altogether to improve the situation.

This time we want to use Numba

@mloubout
Copy link
Contributor

mloubout commented Feb 7, 2023

Numba might be a good idea yeah, and for other part of the compiler as well.

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link
Contributor

@mloubout mloubout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass some misc comments and questions ,will have proper look into FD

@@ -104,7 +106,8 @@ def __new__(cls, expr, *dims, **kwargs):
obj._deriv_order = orders if skip else DimensionTuple(*orders, getters=obj._dims)
obj._side = kwargs.get("side")
obj._transpose = kwargs.get("transpose", direct)
obj._ppsubs = as_tuple(frozendict(i) for i in kwargs.get("subs", []))
obj._ppsubs = as_tuple(frozendict(i) for i in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if there is both?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That'd be nonsensical?

And even then, it would be an issue in master as well since we never attempt to read from the ppsubs, no?

As far as I understand, ppsubs is just an internal attribute that stashes user-provided subs via .subs(...). The only reason I added it is because now Derivative reconstruction exploits (finally!) the Reconstructable infrastructure based on __rargs__ and __rkwargs__, and ppsubs is one such __rkwargs__

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I would ditch subs kwarg, it's only used in one place (xreplace) so would replace it there by _ppsubs=subs and only have one

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I tried that but something backfired, not sure I remember what now. I'll give it another try tomorrow morning

devito/finite_differences/differentiable.py Show resolved Hide resolved
devito/operator/operator.py Show resolved Hide resolved
devito/operator/profiling.py Show resolved Hide resolved
.github/workflows/pytest-core-nompi.yml Show resolved Hide resolved

# Transform e.g. `w[i0] -> w[i0 + 2]` for alignment with the
# StencilDimensions starting points
subs = {expr.weights: expr.weights.subs(d, d - d._min) for d in dims}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to have the weights directly indexed w.r.t min

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow, can you elaborate?

devito/symbolics/inspection.py Show resolved Hide resolved
devito/tools/data_structures.py Show resolved Hide resolved
devito/types/dimension.py Show resolved Hide resolved
else:
return None

@call_highest_priority('__radd__')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move these to IndexAccessFunction

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't, because they're returning AffineIndexAccessFunctions, which would be a subclass...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah but IndexAccessFunction now doesn't really make sense since it's in here, using something like self.func and moving it to IndexAccessFunction would make more sense to me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get you now. Changing as per your suggestion

devito/passes/iet/definitions.py Show resolved Hide resolved
devito/tools/dtypes_lowering.py Show resolved Hide resolved
C/C++ sense. 'constant' and 'shared' mean that the Array represents an
object allocated in so called constant and shared memory, respectively,
which are typical of device architectures. If 'shared' is specified but
the underlying architecture does not something akin to shared memory, the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does not do? something akin

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does not have? as below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixing

@@ -27,11 +27,11 @@ def test_basic():
assert 'include "%s.h"' % name in ccode

# The public `struct dataobj` only appears in the header file
assert str(f._C_typedecl) not in ccode
assert str(f._C_typedecl) in hcode
assert 'struct dataobj\n{' not in ccode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we cannot access this as field now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, simply not as a Function property, which was ugly


# Same with `struct profiler`
timers = op.parameters[-1]
assert isinstance(timers, Timer)
assert str(timers._C_typedecl) not in ccode
assert str(timers._C_typedecl) in hcode
assert 'struct profiler\n{' not in ccode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

@@ -2665,7 +2798,7 @@ def test_fullopt(self):
bns, _ = assert_blocking(op1, {'x0_blk0'}) # due to loop blocking

assert summary0[('section0', None)].ops == 50
assert summary0[('section1', None)].ops == 140
assert summary0[('section1', None)].ops == 148
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

! Εxpected ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it was miscounted before IIRC


# Sanity check
if not (expr.is_Mul and len(weightss) == 1):
raise ValueError("Expect `expr*weights`, got `%s` instead" % str(expr))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expected (?)

devito/passes/iet/definitions.py Show resolved Hide resolved
elif len(dtypes) == 1:
return dtypes.pop()
else:
# E.g., mixed integer arithmetic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lift/rephrase this case in func docstring?

Copy link
Contributor

@mloubout mloubout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments but looks good to me

@@ -579,47 +599,92 @@ def __init_finalize__(self, *args, **kwargs):
assert isinstance(d, StencilDimension) and d.symbolic_size == len(weights)
assert isinstance(weights, (list, tuple, np.ndarray))

kwargs['scope'] = 'static'
try:
self._spacings = set().union(*[i.find(Spacing) for i in weights])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just set(i.find(Spacing) for i in weights) ?


@_estimate_cost.register(Derivative)
def _(expr, estimate):
return _estimate_cost(expr._evaluate(expand=False), estimate)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to have the Derivative know its own cost without evaluation that can be expensive for large expression (and that will be re-eavluated). In theory it should always be expr.fd_order * 2 * _estimate_cost(expr.expr) and would be correct with and without expand for free

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not entirely sure about that? doesn't it also depend on deriv_order, whether it's left/right/center , or perhaps even shifted, etc etc

Anyway, in practice you never ever run estimate_cost on unlowered expressions, hence, you never actually hit this handle...

f = TimeFunction(name='f', grid=grid, space_order=4)

term1 = f.dxdy._evaluate(expand=False)
assert len(term1.find(IndexDerivative)) == 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert depth==1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean ==2?

assert expr.sd is sd
assert expr.ofs == 1 + s

def test_sub(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why "sub"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, instead of classic add, a sub

@FabioLuporini FabioLuporini merged commit 96e618a into master Feb 13, 2023
@FabioLuporini FabioLuporini deleted the unexpansion-final branch February 13, 2023 08:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants