I was triaging the "infinite loop"/slow simpl
from https://github.com/ejgallego/coq-lsp/issues/487 / https://coq.zulipchat.com/#narrow/stream/329642-coq-lsp/topic/Interrupting.20coq-lsp. cc @Alex Sanchez-Stern
I still wonder if it's exponential backtracking, but, it soon stops printing any debug output even with Set Debug "all"
— does that _disprove_ backtracking.
I reproduced it on __coq-platform.2022.09.0~8.16.0~2022.09~beta1 via the code below; interrupting it (in vscoq) after it stops printing debug output gives exceptions like https://gist.github.com/Blaisorblade/702765e116aa36e7ca376119e3f79165.
(* https://github.com/ejgallego/coq-lsp/issues/487 *)
From compcert Require Import Coqlib.
From compcert Require Import Integers.
From compcert Require Import Values Memory.
From compcert Require Import Cminor CminorSel.
From compcert Require Import SelectOp.
Section CMCONSTR.
Variable ge: genv.
Variable sp: val.
Variable e: env.
Variable m: mem.
Definition unary_constructor_sound (cstr: expr -> expr) (sem: val -> val) : Prop :=
forall le a x,
eval_expr ge sp e m le a x ->
exists v, eval_expr ge sp e m le (cstr a) v /\ Val.lessdef (sem x) v.
Lemma eval_mulimm_base:
forall n, unary_constructor_sound (mulimm_base n) (fun x => Val.mul x (Vint n)).
induction n.
econstructor.
(* Set Debug "RAKAM,unification,tactic-unification". *)
Fail progress simpl.
Time simpl.
unfold mulimm_base.
Fail progress simpl.
Time simpl.
unfold Int.one_bits.
split.
Set Debug "RAKAM,unification,tactic-unification".
Test Debug.
Set Debug "all".
Test Debug.
simpl. (* loops, and stops printing anything. *)
Actually, nevermind. The time and _output size_ of simpl
on a subterm appears to blow up, get slow around 11, and is called on 32:
(* https://github.com/ejgallego/coq-lsp/issues/487 *)
From compcert Require Import Coqlib.
From compcert Require Import Integers.
From compcert Require Import Values Memory.
From compcert Require Import Cminor CminorSel.
From compcert Require Import SelectOp.
Section CMCONSTR.
Variable intval: Z.
Time Eval simpl in Zbits.Z_one_bits 0 intval 0.
Time Eval simpl in Zbits.Z_one_bits 1 intval 0.
Time Eval simpl in Zbits.Z_one_bits 2 intval 0.
Time Eval simpl in Zbits.Z_one_bits 3 intval 0.
Time Eval simpl in Zbits.Z_one_bits 4 intval 0.
Time Eval simpl in Zbits.Z_one_bits 5 intval 0.
Time Eval simpl in Zbits.Z_one_bits 6 intval 0.
Time Eval simpl in Zbits.Z_one_bits 7 intval 0.
Time Eval simpl in Zbits.Z_one_bits 8 intval 0. (* 0.1s *)
Time Eval simpl in Zbits.Z_one_bits 9 intval 0. (* 0.4s *)
Time Eval simpl in Zbits.Z_one_bits 10 intval 0. (* 1.64s *)
Paolo Giarrusso has marked this topic as resolved.
I found this after using debugging on cbn
, and seeing that cbn
was counting _very_ slowly 0 + 1
, 0 + 1 + 1
, 0 + 1 + 1 + 1
— then I started poking and got here.
Set Debug "RAKAM,unification,tactic-unification".
cbn.
Paolo Giarrusso has marked this topic as unresolved.
(unresolving in case Alex has follow-up questions)
Ah okay, so it is just really slow. Good to know, thanks!
I'll kill my 5.5 hour simpl
run :)
yeah different category of blow-up — explosion in open terms. Most fun is that Zbits.Z_one_bits 32 (Z.pow 2 32 - 1) 0.
evaluates immediately, the problem is just because here you have an open term
Great that you generated a backtrace @Paolo Giarrusso , that should give us a hint on where to insert a polling point.
Note that this is a Coq upstream bug so it'd be quite useful if you folks could file a bug. Choosing the right polling point is tricky as not to impact performance a lot.
The good thing is that fixing this will likely help with some other places where Coq gets stuck in a loop that requires OS-level interrupt support
[Note that as of today, OS-level interrupt support is not working on Win and JS, so relying on it essentially is still a bit of a problem, that will be solved but it is a different kind of effort than just adding the right polling point in Coq)
That backtrace is from interruption in vscoq1 on coq8.16, I didn't use a debugger
feel free to use anything for a bug report, but I'm not sure what the bug should say (and gotta get back to other things)
Hmm I'm not sure I understand the internals well enough to file a bug. If the bug is just "simpl
is way slower than it should be for open terms", I can copy and paste some stuff from the discussion here into the bug report. But for the polling, I can't make heads or tails of that backtrace.
if there's a bug it's about polling, and I'm skeptical — interruption worked in vscoq1
and no, if anything simpl
is faster than it should be (okay, not literally)
vm_compute
is the lower-bound, but its output size and time blows up the same way — just a bit faster
Hmm, logically why should simplifying open terms be significantly slower than simplifying closed terms?
Re polling, I think Emilio was saying there were multiple interrupt mechanisms, a OS level interrupt, and a gentler polling based interrupt. The former isn't compatible with all environments, but the latter needs polling inserted in more places. VSCoq1 uses the former, so presumably that wouldn't have worked on Windows or in JS-land.
since the computation is stuck on intval
, the output is just "all the computation left after the first point that's stuck, inlined". Here are the first few outputs:
Time Eval simpl in Zbits.Z_one_bits 0 intval 0.
Time Eval simpl in Zbits.Z_one_bits 1 intval 0.
Time Eval simpl in Zbits.Z_one_bits 2 intval 0.
(*
Outputs
= nil
: list Z
Finished transaction in 0. secs (0.u,0.s) (successful)
= if Z.odd intval then 0 :: nil else nil
: list Z
Finished transaction in 0. secs (0.u,0.s) (successful)
= if Z.odd intval
then 0 :: (if Z.odd (Z.div2 intval) then 1 :: nil else nil)
else if Z.odd (Z.div2 intval) then 1 :: nil else nil
: list Z
Finished transaction in 0. secs (0.u,0.s) (successful)
*)
Ah I see, so it's actually unfolding exponentially more definitions, because it has to unfold both branches of the match instead of being able to compute and only unfold one. Makes sense.
yep! This is even more common with vm_compute
on open terms.
Yes, the problem is with polling for an interrupt request not working for this example.
It is good we have polling for interrupt working for now, hopefully we can remove it but will need some support from OCaml runtime
Folks as noted in the issue please submit all bad interruption examples you got so I can test the new solution.
I think I only had the one, since in my main branch tactics get interrupted when they're taking too long. But I'll keep an eye out for other examples.
Last updated: Oct 13 2024 at 01:02 UTC