Ltac2 local2ptree #712

MSoegtropIMC · 2023-09-01T16:08:05Z

This PR replaces local2ptree with a version written in Ltac2 which is much faster (in measurements about 100x for a case with about 60 local variables).

The difference in make test is anyway quite small (1..2%) - the time of local2ptree is more visible in developments with many local variables.

Here are the make test times for the old and new version.

make -j 8 test  209.18s user 8.87s system 393% cpu 55.419 total
make -j 8 test  206.72s user 8.51s system 397% cpu 54.148 total

MSoegtropIMC · 2023-09-01T16:10:39Z

One note: I am using lazy instead of cbv here because it can be faster at Qed times - at tactic time it should not make much of a difference.

andrew-appel · 2023-09-01T17:43:49Z

I suggest: Add a brief comment just before the Std.lazy line (if that's the right place?), saying (* use lazy instead of cbv because Qed can be faster *)

andrew-appel · 2023-09-01T17:45:14Z

I suggest, at line 71 augment your comment with, "Normally this makes a significant but small improvement in performance; but when there are many local variables (e.g. 60) it makes a huge difference."

JasonGross · 2023-09-08T17:34:36Z

One note: I am using lazy instead of cbv here because it can be faster at Qed times - at tactic time it should not make much of a difference.

I don't understand this comment. The difference between lazy and cbv is that lazy uses the same algorithm for reduction that Qed / the kernel uses, while cbv uses a different algorithm. But both tactics leave exactly the same trace in the proof term (a cast node with, I think, the post-reduction term (though it may be the pre-reduction term, I don't recall)), and regardless of which one you do, Qed / the kernel will always use the lazy algorithm. So any difference you see between lazy and cbv at Qed-time should just be noise.

JasonGross · 2023-09-08T17:39:01Z

floyd/local2ptree_denote.v

+ Message.Format.kfprintf (fun x => Control.throw (Invalid_argument (Some x))) fmt.
+Ltac2 Notation "throw_invalid_argument" fmt(format) := throw_invalid_argument fmt.
+
+(* If one works with computed symbol lists, one can't use the tactic notation for evaluation functions (which construct such records in a more concise way) *)


Couldn't you do something like Ltac2 adjust_redflags_delta (syms : Std.reference list) (redflags : Std.red_flags) : Std.red_flags := { redflags with Std.rDelta := false; Std.rConst := syms } and then make a notation so you can write adjust_redflags_delta [id] syms or adjust_redflags_delta (beta iota) syms?

Or { redflags with Std.rConst := List.app redflags.rConst syms } and then you could do reduction_strategy:([<list of things>]) below and pass that around instead of starting with a list of references. Might be slightly more concise

I have to think about it. I tried a few things with notations, but none of this did work well in case one has mixed references, say local and global, manually specified and computed lists. In the end it was more effort and harder to understand than doing it low level. But meanwhile I am a bit smarter and can reconsider it.

JasonGross · 2023-09-08T17:39:33Z

floyd/local2ptree_denote.v

+ Std.rConst := syms
+}.
+
+Ltac2 reference_of_constr(ref : constr) : Std.reference :=


This seems worth upstreaming into the Coq standard library, I've needed something like this a couple of times too

MSoegtropIMC · 2023-09-13T08:15:54Z

Regarding lazy vs. cbv: my (not very elaborate) understanding of this is that using lazy vs cbv doesn't make a difference if both deliver the same results, but it can make a substantial difference if the results are different, because then the result of lazy should be closer to the result of the kernel and require less unification effort. For partial evaluation with a restricted symbol list as done here, I believe it can well be that lazy and cbv give different results. I have difficulties providing examples for this, though, since lazy is less lazy that I would expect.

Anyway at some point I started to prefer lazy, but I can't exclude that this was caused by doing computations directly in hypothesis, which (afaik) doesn't leave an annotation.

JasonGross · 2023-09-14T03:12:25Z

For partial evaluation with a restricted symbol list as done here, I believe it can well be that lazy and cbv give different results

If such an example exists, it's a bug in Coq. The difference between lazy and cbv is evaluation ordering and sharing of reductions, but so long as Coq reduction is confluent (which it is widely believed to be, I think), they cannot give different results.

MSoegtropIMC · 2023-09-14T05:58:24Z

so long as Coq reduction is confluent (which it is widely believed to be, I think), they cannot give different results.

Isn't it so that the argument of confluence only applies to full evaluation? We are doing partial evaluation here (with restricted delta lists). I mean hnf and cbv also give different results without impeding confluence. I didn't look into lazy into too much detail, and as I said in tests it is less lazy then I would expect, but my understanding of lazy vs cbv is that lazy does not evaluate arguments of functions unless these arguments are actually used in the expansion of a function. If the evaluation of a function a is partially blocked say because it contains a function b which is not in the delta list, and if an argument of a is passed to b, then cbv should expand the argument, while lazy should not. So lazy should result in a term where the argument of a passed to b is not expanded, while cbv should deliver the same term with the argument to b expanded. How such differences shall impede confluence I don't see.
My assumption is that such differences lead to issues in unification later, since one of the most difficult questions in unification is to decide which symbols to unfold.

I tried to make some trivial examples, but failed spending just a few minutes. If you believe my understanding of this is substantially flawed, I will spend some more time trying to understand this in more detail.

JasonGross · 2023-09-14T14:14:59Z

Isn't it so that the argument of confluence only applies to full evaluation? We are doing partial evaluation here (with restricted delta lists).

Imagine wrapping the entire environment in a lambda abstracting over the constants to be left behind. If full reduction is confluent even under lambdas it should be confluent even with restricted delta lists. Less sure about leaving out the other reductions, though I think most of them can be modeled by replacing certain constructs with opaque applications.

I mean hnf and cbv also give different results without impeding confluence

hnf is not strong, it does not recurse under binders when the head is normal. cbv and lazy do. Moreover hnf uses the simpl engine under the hood, performing refolding, so it's really not analogous. My claim applies specifically to lazy and cbv. Even VM and native can give different results, though only in two places: alpha renaming of lambda variables and in not normalizing the parameters of inductive constructors.

If the evaluation of a function a is partially blocked say because it contains a function b which is not in the delta list, and if an argument of a is passed to b, then cbv should expand the argument, while lazy should not. So lazy should result in a term where the argument of a passed to b is not expanded, while cbv should deliver the same term with the argument to b expanded.

You might be missing the fact that once there are no more redexes, cbv recurses under binders and lazy recurses under binders and into arguments to blocked symbols, so even if the argument passed to b is unexpended when it reaches b, the reduction will go back and expand it once its done reducing everything else. The terms should be the same at the end.

JasonGross · 2023-09-14T14:17:01Z

(If you're interested in reading more, there's coq/coq#17503 (comment), but most of that is not relevant here)

JasonGross · 2023-09-14T16:19:42Z

How such differences shall impede confluence I don't see.
My assumption is that such differences lead to issues in unification later, since one of the most difficult questions in unification is to decide which symbols to unfold.

Also not sure if this is the difference of understanding, but in case it is:

note that reduction does not invoke unification at all; unification is only used for filling evars (deciding which symbols to unfold when also matters for conversion--which is basically unification of evar-free terms---but it only matters for performance)
reduction that does not recurse under binders is not confluent, for the reason you identify: if cbv and lazy did not recurse under binders, then blocked symbols would be applied to differently normalized arguments at the end

JasonGross · 2023-09-14T16:22:30Z

but my understanding of lazy vs cbv is that lazy does not evaluate arguments of functions unless these arguments are actually used in the expansion of a function

And I guess a more accurate characterization here is that lazy does not expand function arguments until either they occur in the head position, as the discriminee of a redex in the head position, or until all such redexes have been evaluated and lazy is recursing into the rest of the term

MSoegtropIMC · 2023-10-11T09:43:10Z

Regarding the lazy vs. cbv question: I today came across an example where it makes a substantial speed difference - as @JasonGross explained the difference is not at Qed time but during evaluation itself. In some context where a compspecs exists (reptype takes one as implicit argument), I get:

  Time Eval cbv in reptype tvoid.
  Time Eval lazy in reptype tvoid.
=>
	 = unit
     : Type
Finished transaction in 0.284 secs (0.283u,0.s) (successful)
	 = unit
     : Type
Finished transaction in 0. secs (0.u,0.s) (successful)

I guess this is so because cbv would compute in the complete compspecs, while with lazy evaluation the compsepcs is not evaluated since it is not required to get the representation type of void.

So the comment is wrong, but I there are still good reasons in to prefer lazy over cbv in VST.

andrew-appel · 2023-10-11T13:04:21Z

Just out of curiosity, I examined this question: are the compspecs fully evaluated already? That is, would doing one extra "eval compute" in the make_compspecs tactic change anything? And the answer is, basically no. In a typical case, "progress compute" will fail to progress.
So what is taking the time? Perhaps it is this: within the compspecs there are some (potentially) big proofs, already in normal form. "cbv" walks through these proofs, "lazy" does not need to. Could that be it?

MSoegtropIMC · 2023-10-11T13:15:52Z

@andrew-appel : yes, this matches what I believe is happening. lazy looks only at what it "needs" to compute the final result, while cbv goes at least once through everything. So especially in structures which are a mix of data and proof terms, lazy should be of advantage.

MSoegtropIMC added 2 commits September 1, 2023 18:02

Allow Coq 8.17.1

e9e6049

Replace prove_local2ptree with a much faster Ltac2 version

a0e6b63

JasonGross reviewed Sep 8, 2023

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ltac2 local2ptree #712

Ltac2 local2ptree #712

MSoegtropIMC commented Sep 1, 2023

MSoegtropIMC commented Sep 1, 2023

andrew-appel commented Sep 1, 2023

andrew-appel commented Sep 1, 2023

JasonGross commented Sep 8, 2023

JasonGross Sep 8, 2023

JasonGross Sep 8, 2023

MSoegtropIMC Sep 11, 2023

JasonGross Sep 8, 2023

MSoegtropIMC commented Sep 13, 2023

JasonGross commented Sep 14, 2023

MSoegtropIMC commented Sep 14, 2023

JasonGross commented Sep 14, 2023

JasonGross commented Sep 14, 2023

JasonGross commented Sep 14, 2023

JasonGross commented Sep 14, 2023

MSoegtropIMC commented Oct 11, 2023

andrew-appel commented Oct 11, 2023

MSoegtropIMC commented Oct 11, 2023

Ltac2 local2ptree #712

Are you sure you want to change the base?

Ltac2 local2ptree #712

Conversation

MSoegtropIMC commented Sep 1, 2023

MSoegtropIMC commented Sep 1, 2023

andrew-appel commented Sep 1, 2023

andrew-appel commented Sep 1, 2023

JasonGross commented Sep 8, 2023

JasonGross Sep 8, 2023

Choose a reason for hiding this comment

JasonGross Sep 8, 2023

Choose a reason for hiding this comment

MSoegtropIMC Sep 11, 2023

Choose a reason for hiding this comment

JasonGross Sep 8, 2023

Choose a reason for hiding this comment

MSoegtropIMC commented Sep 13, 2023

JasonGross commented Sep 14, 2023

MSoegtropIMC commented Sep 14, 2023

JasonGross commented Sep 14, 2023

JasonGross commented Sep 14, 2023

JasonGross commented Sep 14, 2023

JasonGross commented Sep 14, 2023

MSoegtropIMC commented Oct 11, 2023

andrew-appel commented Oct 11, 2023

MSoegtropIMC commented Oct 11, 2023