Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ltac2 local2ptree #712

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

MSoegtropIMC
Copy link
Collaborator

This PR replaces local2ptree with a version written in Ltac2 which is much faster (in measurements about 100x for a case with about 60 local variables).

The difference in make test is anyway quite small (1..2%) - the time of local2ptree is more visible in developments with many local variables.

Here are the make test times for the old and new version.

make -j 8 test  209.18s user 8.87s system 393% cpu 55.419 total
make -j 8 test  206.72s user 8.51s system 397% cpu 54.148 total

@MSoegtropIMC
Copy link
Collaborator Author

One note: I am using lazy instead of cbv here because it can be faster at Qed times - at tactic time it should not make much of a difference.

@andrew-appel
Copy link
Collaborator

I suggest: Add a brief comment just before the Std.lazy line (if that's the right place?), saying (* use lazy instead of cbv because Qed can be faster *)

@andrew-appel
Copy link
Collaborator

I suggest, at line 71 augment your comment with, "Normally this makes a significant but small improvement in performance; but when there are many local variables (e.g. 60) it makes a huge difference."

@JasonGross
Copy link
Contributor

One note: I am using lazy instead of cbv here because it can be faster at Qed times - at tactic time it should not make much of a difference.

I don't understand this comment. The difference between lazy and cbv is that lazy uses the same algorithm for reduction that Qed / the kernel uses, while cbv uses a different algorithm. But both tactics leave exactly the same trace in the proof term (a cast node with, I think, the post-reduction term (though it may be the pre-reduction term, I don't recall)), and regardless of which one you do, Qed / the kernel will always use the lazy algorithm. So any difference you see between lazy and cbv at Qed-time should just be noise.

Message.Format.kfprintf (fun x => Control.throw (Invalid_argument (Some x))) fmt.
Ltac2 Notation "throw_invalid_argument" fmt(format) := throw_invalid_argument fmt.

(* If one works with computed symbol lists, one can't use the tactic notation for evaluation functions (which construct such records in a more concise way) *)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't you do something like Ltac2 adjust_redflags_delta (syms : Std.reference list) (redflags : Std.red_flags) : Std.red_flags := { redflags with Std.rDelta := false; Std.rConst := syms } and then make a notation so you can write adjust_redflags_delta [id] syms or adjust_redflags_delta (beta iota) syms?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or { redflags with Std.rConst := List.app redflags.rConst syms } and then you could do reduction_strategy:([<list of things>]) below and pass that around instead of starting with a list of references. Might be slightly more concise

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to think about it. I tried a few things with notations, but none of this did work well in case one has mixed references, say local and global, manually specified and computed lists. In the end it was more effort and harder to understand than doing it low level. But meanwhile I am a bit smarter and can reconsider it.

Std.rConst := syms
}.

Ltac2 reference_of_constr(ref : constr) : Std.reference :=
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems worth upstreaming into the Coq standard library, I've needed something like this a couple of times too

@MSoegtropIMC
Copy link
Collaborator Author

Regarding lazy vs. cbv: my (not very elaborate) understanding of this is that using lazy vs cbv doesn't make a difference if both deliver the same results, but it can make a substantial difference if the results are different, because then the result of lazy should be closer to the result of the kernel and require less unification effort. For partial evaluation with a restricted symbol list as done here, I believe it can well be that lazy and cbv give different results. I have difficulties providing examples for this, though, since lazy is less lazy that I would expect.

Anyway at some point I started to prefer lazy, but I can't exclude that this was caused by doing computations directly in hypothesis, which (afaik) doesn't leave an annotation.

@JasonGross
Copy link
Contributor

For partial evaluation with a restricted symbol list as done here, I believe it can well be that lazy and cbv give different results

If such an example exists, it's a bug in Coq. The difference between lazy and cbv is evaluation ordering and sharing of reductions, but so long as Coq reduction is confluent (which it is widely believed to be, I think), they cannot give different results.

@MSoegtropIMC
Copy link
Collaborator Author

so long as Coq reduction is confluent (which it is widely believed to be, I think), they cannot give different results.

Isn't it so that the argument of confluence only applies to full evaluation? We are doing partial evaluation here (with restricted delta lists). I mean hnf and cbv also give different results without impeding confluence. I didn't look into lazy into too much detail, and as I said in tests it is less lazy then I would expect, but my understanding of lazy vs cbv is that lazy does not evaluate arguments of functions unless these arguments are actually used in the expansion of a function. If the evaluation of a function a is partially blocked say because it contains a function b which is not in the delta list, and if an argument of a is passed to b, then cbv should expand the argument, while lazy should not. So lazy should result in a term where the argument of a passed to b is not expanded, while cbv should deliver the same term with the argument to b expanded. How such differences shall impede confluence I don't see.
My assumption is that such differences lead to issues in unification later, since one of the most difficult questions in unification is to decide which symbols to unfold.

I tried to make some trivial examples, but failed spending just a few minutes. If you believe my understanding of this is substantially flawed, I will spend some more time trying to understand this in more detail.

@JasonGross
Copy link
Contributor

Isn't it so that the argument of confluence only applies to full evaluation? We are doing partial evaluation here (with restricted delta lists).

Imagine wrapping the entire environment in a lambda abstracting over the constants to be left behind. If full reduction is confluent even under lambdas it should be confluent even with restricted delta lists. Less sure about leaving out the other reductions, though I think most of them can be modeled by replacing certain constructs with opaque applications.

I mean hnf and cbv also give different results without impeding confluence

hnf is not strong, it does not recurse under binders when the head is normal. cbv and lazy do. Moreover hnf uses the simpl engine under the hood, performing refolding, so it's really not analogous. My claim applies specifically to lazy and cbv. Even VM and native can give different results, though only in two places: alpha renaming of lambda variables and in not normalizing the parameters of inductive constructors.

If the evaluation of a function a is partially blocked say because it contains a function b which is not in the delta list, and if an argument of a is passed to b, then cbv should expand the argument, while lazy should not. So lazy should result in a term where the argument of a passed to b is not expanded, while cbv should deliver the same term with the argument to b expanded.

You might be missing the fact that once there are no more redexes, cbv recurses under binders and lazy recurses under binders and into arguments to blocked symbols, so even if the argument passed to b is unexpended when it reaches b, the reduction will go back and expand it once its done reducing everything else. The terms should be the same at the end.

@JasonGross
Copy link
Contributor

(If you're interested in reading more, there's coq/coq#17503 (comment), but most of that is not relevant here)

@JasonGross
Copy link
Contributor

How such differences shall impede confluence I don't see.
My assumption is that such differences lead to issues in unification later, since one of the most difficult questions in unification is to decide which symbols to unfold.

Also not sure if this is the difference of understanding, but in case it is:

  • note that reduction does not invoke unification at all; unification is only used for filling evars (deciding which symbols to unfold when also matters for conversion--which is basically unification of evar-free terms---but it only matters for performance)
  • reduction that does not recurse under binders is not confluent, for the reason you identify: if cbv and lazy did not recurse under binders, then blocked symbols would be applied to differently normalized arguments at the end

@JasonGross
Copy link
Contributor

but my understanding of lazy vs cbv is that lazy does not evaluate arguments of functions unless these arguments are actually used in the expansion of a function

And I guess a more accurate characterization here is that lazy does not expand function arguments until either they occur in the head position, as the discriminee of a redex in the head position, or until all such redexes have been evaluated and lazy is recursing into the rest of the term

@MSoegtropIMC
Copy link
Collaborator Author

Regarding the lazy vs. cbv question: I today came across an example where it makes a substantial speed difference - as @JasonGross explained the difference is not at Qed time but during evaluation itself. In some context where a compspecs exists (reptype takes one as implicit argument), I get:

  Time Eval cbv in reptype tvoid.
  Time Eval lazy in reptype tvoid.
=>
	 = unit
     : Type
Finished transaction in 0.284 secs (0.283u,0.s) (successful)
	 = unit
     : Type
Finished transaction in 0. secs (0.u,0.s) (successful)

I guess this is so because cbv would compute in the complete compspecs, while with lazy evaluation the compsepcs is not evaluated since it is not required to get the representation type of void.

So the comment is wrong, but I there are still good reasons in to prefer lazy over cbv in VST.

@andrew-appel
Copy link
Collaborator

Just out of curiosity, I examined this question: are the compspecs fully evaluated already? That is, would doing one extra "eval compute" in the make_compspecs tactic change anything? And the answer is, basically no. In a typical case, "progress compute" will fail to progress.
So what is taking the time? Perhaps it is this: within the compspecs there are some (potentially) big proofs, already in normal form. "cbv" walks through these proofs, "lazy" does not need to. Could that be it?

@MSoegtropIMC
Copy link
Collaborator Author

@andrew-appel : yes, this matches what I believe is happening. lazy looks only at what it "needs" to compute the final result, while cbv goes at least once through everything. So especially in structures which are a mix of data and proof terms, lazy should be of advantage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants