-
Notifications
You must be signed in to change notification settings - Fork 221
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
compiler: Implement graceful lowering of derivatives (aka "unexpansion") #2060
Conversation
@@ -42,12 +43,18 @@ def __init__(self, exprs, ispace=None, guards=None, properties=None, syncs=None) | |||
|
|||
self._exprs = tuple(ClusterizedEq(e, ispace=ispace) for e in as_tuple(exprs)) | |||
self._ispace = ispace | |||
self._guards = frozendict(guards or {}) | |||
self._guards = Guards(guards or {}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note for reviewers: to be exploited inside the compiler incrementally, so far only in a few places (and mostly in PRO)
Codecov Report
@@ Coverage Diff @@
## master #2060 +/- ##
==========================================
+ Coverage 87.69% 87.78% +0.08%
==========================================
Files 223 224 +1
Lines 37909 38653 +744
Branches 5707 5816 +109
==========================================
+ Hits 33246 33930 +684
- Misses 4126 4173 +47
- Partials 537 550 +13
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
@@ -680,8 +681,9 @@ def __init__(self, name, body, retval, parameters=None, prefix=None): | |||
self.parameters = as_tuple(parameters) | |||
|
|||
def __repr__(self): | |||
param_types = [ctypes_to_cstr(i._C_ctype) for i in self.parameters] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will look, thanks
An important issue of this PR I forgot to mention. Code generation times do increase. Between 15% and 40% I'd say, but only noticeable with complex physics and high order discretizations. Cost increase is due to more sophisticated and in particular more accurate data dependence analysis. @ggorman and I discussed at length about this and in particular what to do. We propose to park it until after Rice, or at least until the stuff for Rice is ready, then sprint on it altogether to improve the situation. This time we want to use Numba |
Numba might be a good idea yeah, and for other part of the compiler as well. |
b528c17
to
2c5eb4d
Compare
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass some misc comments and questions ,will have proper look into FD
@@ -104,7 +106,8 @@ def __new__(cls, expr, *dims, **kwargs): | |||
obj._deriv_order = orders if skip else DimensionTuple(*orders, getters=obj._dims) | |||
obj._side = kwargs.get("side") | |||
obj._transpose = kwargs.get("transpose", direct) | |||
obj._ppsubs = as_tuple(frozendict(i) for i in kwargs.get("subs", [])) | |||
obj._ppsubs = as_tuple(frozendict(i) for i in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if there is both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That'd be nonsensical?
And even then, it would be an issue in master as well since we never attempt to read from the ppsubs, no?
As far as I understand, ppsubs is just an internal attribute that stashes user-provided subs via .subs(...)
. The only reason I added it is because now Derivative reconstruction exploits (finally!) the Reconstructable
infrastructure based on __rargs__
and __rkwargs__
, and ppsubs is one such __rkwargs__
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I would ditch subs
kwarg, it's only used in one place (xreplace
) so would replace it there by _ppsubs=subs
and only have one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I tried that but something backfired, not sure I remember what now. I'll give it another try tomorrow morning
|
||
# Transform e.g. `w[i0] -> w[i0 + 2]` for alignment with the | ||
# StencilDimensions starting points | ||
subs = {expr.weights: expr.weights.subs(d, d - d._min) for d in dims} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be better to have the weights
directly indexed w.r.t min
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow, can you elaborate?
devito/types/dimension.py
Outdated
else: | ||
return None | ||
|
||
@call_highest_priority('__radd__') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move these to IndexAccessFunction
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't, because they're returning AffineIndexAccessFunctions, which would be a subclass...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah but IndexAccessFunction
now doesn't really make sense since it's in here, using something like self.func
and moving it to IndexAccessFunction
would make more sense to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get you now. Changing as per your suggestion
devito/types/array.py
Outdated
C/C++ sense. 'constant' and 'shared' mean that the Array represents an | ||
object allocated in so called constant and shared memory, respectively, | ||
which are typical of device architectures. If 'shared' is specified but | ||
the underlying architecture does not something akin to shared memory, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does not do? something akin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does not have? as below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixing
@@ -27,11 +27,11 @@ def test_basic(): | |||
assert 'include "%s.h"' % name in ccode | |||
|
|||
# The public `struct dataobj` only appears in the header file | |||
assert str(f._C_typedecl) not in ccode | |||
assert str(f._C_typedecl) in hcode | |||
assert 'struct dataobj\n{' not in ccode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we cannot access this as field now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, simply not as a Function property, which was ugly
|
||
# Same with `struct profiler` | ||
timers = op.parameters[-1] | ||
assert isinstance(timers, Timer) | ||
assert str(timers._C_typedecl) not in ccode | ||
assert str(timers._C_typedecl) in hcode | ||
assert 'struct profiler\n{' not in ccode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
@@ -2665,7 +2798,7 @@ def test_fullopt(self): | |||
bns, _ = assert_blocking(op1, {'x0_blk0'}) # due to loop blocking | |||
|
|||
assert summary0[('section0', None)].ops == 50 | |||
assert summary0[('section1', None)].ops == 140 | |||
assert summary0[('section1', None)].ops == 148 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
! Εxpected ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it was miscounted before IIRC
|
||
# Sanity check | ||
if not (expr.is_Mul and len(weightss) == 1): | ||
raise ValueError("Expect `expr*weights`, got `%s` instead" % str(expr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expected (?)
elif len(dtypes) == 1: | ||
return dtypes.pop() | ||
else: | ||
# E.g., mixed integer arithmetic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lift/rephrase this case in func docstring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor comments but looks good to me
@@ -579,47 +599,92 @@ def __init_finalize__(self, *args, **kwargs): | |||
assert isinstance(d, StencilDimension) and d.symbolic_size == len(weights) | |||
assert isinstance(weights, (list, tuple, np.ndarray)) | |||
|
|||
kwargs['scope'] = 'static' | |||
try: | |||
self._spacings = set().union(*[i.find(Spacing) for i in weights]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just set(i.find(Spacing) for i in weights)
?
|
||
@_estimate_cost.register(Derivative) | ||
def _(expr, estimate): | ||
return _estimate_cost(expr._evaluate(expand=False), estimate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to have the Derivative know its own cost without evaluation that can be expensive for large expression (and that will be re-eavluated). In theory it should always be expr.fd_order * 2 * _estimate_cost(expr.expr)
and would be correct with and without expand for free
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not entirely sure about that? doesn't it also depend on deriv_order, whether it's left/right/center , or perhaps even shifted, etc etc
Anyway, in practice you never ever run estimate_cost on unlowered expressions, hence, you never actually hit this handle...
f = TimeFunction(name='f', grid=grid, space_order=4) | ||
|
||
term1 = f.dxdy._evaluate(expand=False) | ||
assert len(term1.find(IndexDerivative)) == 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert depth==1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean ==2?
assert expr.sd is sd | ||
assert expr.ofs == 1 + s | ||
|
||
def test_sub(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why "sub"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, instead of classic add, a sub
With the option
derivatives are now synthesized as loops over FD coefficients, rather than through unrolling. This required non-negligible additions and changes to the compiler, among which the enhancement of so called
StencilDimension
s, which were until now utilized in tests only and the generalization of several compiler passes (among which Cluster fusion).The PR is equipped with lots of new tests.
However, the machinery introduced here is mainly exploited in PRO, where >100 tests stressing the new compilation engine have been added.
There still are 3 TODOs left here, which I count to address between this week and next, but they shouldn't change the PR content meaningfullyEDIT: Done nowThis stuff is mandatory for Rice; hence we should aim to merge it ASAP. Apologies for the little pressure, I did my best to file it ASAP, but despite working on it full-time since December, it is only now that it's reached a decent stage)
This PR should also more neatly achieve what #2014 (@georgebisbas ) has tried to do