Hello,
I have a (rather complex) Ltac1 tactic that automatically solves certain goals of interest (a kind of program refinement). I managed to get the tactic to run in an acceptable time, but the kernel takes significantly longer to check the generated proof term (on some instances the tactic runs in 1.2s and the kernel takes 5s).
I do not know if I can hope to improve this; I don't know how to retrieve useful information about what the kernel is doing.
Currently I only have access to:
- The proof term generated by the tactic (with `Show Proof` or `Print`). I use `Set Printing All` to see the whole term, but some sub-terms are replaced with ellipses (`...`); is there some way to prevent this?
- The time taken by `coqc` on the whole file; I deduce the time used by the kernel by subtracting the time used by the tactic (measured with the `time` Ltac tactic).
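For concreteness, the inspection commands mentioned above can be used like this (a minimal sketch; `my_lemma` is a made-up placeholder):

```coq
Lemma my_lemma : forall n : nat, n + 0 = n.
Proof.
  intro n. induction n as [|n IH].
  - reflexivity.
  - simpl. rewrite IH. reflexivity.
  (* Inside a proof, Show Proof. displays the partial proof term. *)
Qed.

(* Display terms without notations and with all implicit arguments: *)
Set Printing All.
Print my_lemma.
```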
Would there be other ways to figure out what the kernel is doing? For instance, I know that the proof term is quadratic in the size of the program I consider (but I think that the tactic still does linear work, because the quadratic part comes from duplicated terms which are seen as evars by the tactic). I expect this part to be easily type-checkable, but I do not know whether that is really the case or whether the long verification time comes from this part.
I thought that having some information, like the number of reductions performed during conversions, could be useful.
I know that guard checking can be quite long, but only a few simple `fix` operators are visible in the proof term (but again there are ellipses).
I found this thread that suggests a way to debug the kernel; could something like this be useful here?
Also I had a few questions about the kernel and reduction:
- Is there any kind of sharing in the kernel? For instance, if an evar occurs several times, is it (the term it is unified to) type checked multiple times?
- Except for the conversions and guards, is the kernel checking linear in the size of the proof term?
- I noticed that conversions performed using `cbv` are sometimes long to check in the kernel, but my tactic uses only `cbn` and `hnf`. Is `cbn` a good approximation of the reduction strategy used by the kernel?
I know that `cbn` can be influenced by the `Arguments` command (with `/`, `!` and `simpl never`) and by the transparency/strategy of the definitions. It also seems to affect the verification time of the kernel (although I didn't find this explicitly stated in the manual for `Arguments`). Are there other ways to influence the reduction? For instance, I have some reductions that take a significant time with `cbn` (100ms) while `cbv` solves them instantly (but reduces some parts too much). Can I force a call-by-value strategy for some parts (for `cbn` and the kernel)? Marking the arguments with `!` is not enough (but already helps), maybe because it only requires a constructor at the head but not a completely reduced term. I'm considering writing the reduced definitions in Continuation Passing Style, but this might require significant refactoring of the proofs and I'm not sure how this would interact with reduction under binders.
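To illustrate the `!` marking mentioned above, here is a minimal sketch (with a made-up definition `mysub`): `cbn` unfolds a definition whose arguments are marked with `!` only when those arguments start with a constructor.

```coq
Definition mysub (n m : nat) : nat := n - m.
(* Ask cbn/simpl to unfold mysub only when both arguments are constructor-headed: *)
Arguments mysub !n !m.

Goal forall n, mysub (S n) (S n) = n - n.
Proof.
  intro n.
  cbn.          (* S n is constructor-headed, so mysub unfolds and reduces here *)
  reflexivity.
Qed.
```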
Finally, I noticed that `cbn` generates a cast `?proof : original_goal` in the proof term; not generating this cast at some points (using `eval` and `change`) improved the kernel checking time (by almost 20% on some examples). So if there are other simple tricks like this I would be happy to try them :)
I'll try to answer some of the easier questions:
- Use `Set Printing Depth <very large number>` to get the full term. It might take a very long time to print, though. Be careful about doing this in emacs since large proof terms will make it entirely unresponsive for minutes or even longer.
- `cbn` is basically a well-behaved variant of `simpl` in that it respects the `simpl never` annotation. It also isn't quite as shy as `simpl` when it comes to unfolding simple aliases even when that doesn't unlock more reduction. To me it is a good "normalize but don't destroy" reduction tactic and I use it almost everywhere. But it is in no way directly related to the kernel's internal reduction used for typechecking (which is mostly weak head reduction). The closest approximation to that is `hnf`, but `hnf` does too much. There is a PR that will hopefully be merged at some point which implements new reduction variants: https://github.com/coq/coq/pull/17503. I think `head lazy` in that PR corresponds to the kernel's weak head reduction but I am not 100% sure.
- `cbv` (without any flags) is full normalization. It is rarely the right thing to use in tactics since one ends up generating more work for the kernel, which now has to typecheck much larger terms. `cbv` is best used in a very targeted way to reduce complicated terms to smaller results when it is known that this reduces term complexity in some way. Often this means selecting a whitelist of definitions to unfold (using `cbv [term1 term2]`, possibly even with specific reduction flags).
- `Arguments` by itself shouldn't influence `Qed` time as far as I know. But it is entirely possible that some part of a larger tactic will take a different code path and end up with different evar instantiations, less sharing, or some other difference because of the influence of `Arguments` on `cbn` (and `simpl`).
- There is a lengthy thread on the topic with various good hints which might also be helpful:
https://coq.zulipchat.com/#narrow/stream/237977-Coq-users/topic/How.20to.20Debug.20slow.20QED.3F
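As a small illustration of the targeted `cbv [term1 term2]` style described above (with a made-up definition `double`):

```coq
Definition double (n : nat) : nat := n + n.

Goal double 2 = 4.
Proof.
  (* Restricted delta: unfold only double, leaving Nat.add untouched. *)
  cbv [double].
  (* The goal is now 2 + 2 = 4. *)
  reflexivity.
Qed.
```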
Janno said:
You can use `Set Printing Depth <very large number>` to get the full term.
This does not seem to work in Coqtail but it works when calling `coqc` :+1:.
`Arguments` by itself shouldn't influence `Qed` time as far as I know. But it is entirely possible that some part of a larger tactic will take a different code path and end up with different evar instantiations, less sharing, or some other difference because of the influence of `Arguments` on `cbn` (and `simpl`).
I rechecked my examples and it seems that it indeed only influences the time of `cbn`.
Is there any kind of sharing in the kernel? For instance, if an evar occurs several times, is it (the term it is unified to) type checked multiple times?
Yes it gets checked multiple times
Except for the conversions and guards, is the kernel checking linear in the size of the proof term?
I think so (meaning size without considering sharing)
You can use `Time` around a command (e.g. `Time Qed.`) or `coqc -time` to get more precise timings, although `Time Qed` can be missing some of the time (the time spent preparing the data to be sent to the kernel, which involves a few term traversals).
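A minimal example of the two timing levels mentioned in this thread, tactic-level `time` versus `Time Qed` (the lemma is a made-up placeholder):

```coq
Lemma demo : 1000 * 1000 = 1000000.
Proof.
  time reflexivity.  (* prints the time spent by the tactic *)
Time Qed.            (* prints the time spent checking the proof term *)
```

On the command line, `coqc -time file.v` prints per-command timings for the whole file.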
This does not seem to work in Coqtail
I haven't looked into it much but I expect it's the same issue as https://github.com/coq/coq/issues/13971
@Wolf Honore, do you render `Pp.t -> string` in Coqtail? If not, that's a separate issue from the one in CoqIDE.
So there are likely 2 issues, depending on where the rendering of Coq's pretty printing type happens: in `coqidetop`, before the string is sent to the clients. In this case, multiple problems with Coq could prevent the rendering from taking place. https://github.com/coq/coq/pull/17840 should fix this problem for VsCoq 1 and Coqtail if my interpretation is correct.
You may be right @Emilio Jesús Gallego Arias, I'll try to look more closely when I have time
Gaëtan Gilbert said:
you can use `Time` around a command (e.g. `Time Qed.`) or `coqc -time` to get more precise timings, although `Time Qed` can be missing some of the time (it is the time spent preparing the data to be sent to the kernel, which involves a few term traversals)
I tried using `Time Qed` on my example. It gives me a smaller time than what I was calculating using `time coqc`: I obtain 1.2s for the tactic (with the `time` tactic or `coqc -time`), 3.5s for Qed (using `Time` or `coqc -time`) and a total of 6.2s for `coqc` (using `time coqc`). It seems that `Time Qed` is missing a significant amount of time (1.5s). So I guess that means it is a problem with the size of the proof term?
probably
you already said the size is quadratic so that makes it especially likely to be the problem
Could you use `abstract` to lift the duplicated terms into separate constants?
`abstract` requires an evar-free goal, but there might be ways around that.
Paolo Giarrusso said:
Could you use abstract to lift the duplicated terms into separate constants?
To be concrete, the goal I'm solving is something like `prog_refines p0 ?p1`, where `p0` is the input program, `?p1` is a program generated by the tactic and `prog_refines` is a soundness property. The tactic solves this goal inductively on `p0` and generates a corresponding `?p1`. The problem arises with combinators. For instance, a sequence operator would be handled by applying a lemma:
```coq
Lemma refines_Seq p0a p0b p1a p1b
  (Ha : prog_refines p0a p1a)
  (Hb : prog_refines p0b p1b):
  prog_refines (Seq p0a p0b) (Seq p1a p1b).
```
Now I think that the problem is that the sub-programs `p0a`, `p0b`, `p1a` and `p1b` appear as parameters of the lemma, which makes the proof term quadratic in the size of the program.
The generated program `p1` is larger (by a constant factor) than `p0`, so reducing the size of the corresponding terms (by some kind of factorisation maybe?) could be enough. However, those terms are not directly generated but rather inductively instantiated, so I don't know if `abstract` would work here.
I'm considering instead solving a goal `{p1 | prog_refines p0 p1}` and then reducing the `proj1_sig` of the proof to obtain the generated program. But it requires a significant refactoring and I'm not sure how well this reduction would be handled by the tactics and the kernel, since the term will contain the full proof.
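A minimal sketch of that `{p1 | prog_refines p0 p1}` formulation on a toy relation (all definitions here are hypothetical stand-ins; the real `prog_refines` is the soundness property from the development):

```coq
(* Toy stand-in for the refinement relation. *)
Definition prog := nat.
Definition prog_refines (p0 p1 : prog) : Prop := p0 = p1.

(* Solve the sigma goal; end with Defined so the witness stays transparent. *)
Definition refined (p0 : prog) : {p1 | prog_refines p0 p1}.
Proof. exists p0. reflexivity. Defined.

(* Recover the generated program by reducing proj1_sig of the proof. *)
Definition p1 := Eval cbv in proj1_sig (refined 0).
```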
I was able to reduce the kernel verification time enough by using `transparent_abstract` to factorise the subterms of the generated program into separate lemmas.
I'm using something like:
```coq
Inductive MyGoal (n : nat) := MyGoalI.

Lemma change_MyGoal_arg (n0 n1 : nat)
  (H : MyGoal n0)
  (E : n0 = n1):
  MyGoal n1.
Proof.
  case E; exact H.
Qed.

Ltac abstract_refl :=
  lazymatch goal with |- ?x0 = _ =>
    (* See https://github.com/coq/coq/issues/16534#issuecomment-1256930507 *)
    let x0 := eval cbv beta in x0 in
    is_ground x0;
    unshelve instantiate (1 := _);
    [ transparent_abstract (exact x0) |]
  end;
  lazymatch goal with |- @eq ?T _ ?x1 =>
    exact (@eq_refl T x1)
  end.

Goal exists (n : nat), MyGoal n /\ n = 0.
Proof.
  do 2 esplit. {
    (* at each step *)
    refine (change_MyGoal_arg _ _ _ _).
    - (* solves the goal, unify the argument *) exact (MyGoalI O).
    - (* abstracts the argument in a transparent lemma *) abstract_refl.
  }
  (* ..._subterm = 0 *)
  cbv.
  (* 0 = 0 *)
  reflexivity.
Qed.
```
(where in my case `MyGoal` is a program refinement property and the argument `n` is the generated program).
This modification slightly increases the time taken by my tactic but greatly reduces the verification time of the kernel:
| | Original | Modified |
|---|---|---|
| Tactic | 1.229s | 1.499s |
| Kernel (`Time Qed`) | 2.355s | 0.873s |
| Total (`coqc`) | 4.60s | 2.62s |
As commented, there were indeed some issues with evars, but there was already a known workaround.
Last updated: Jun 23 2024 at 04:03 UTC