Stream: Coq users

Topic: Super-slow/diverging `simpl` without debug output


view this post on Zulip Paolo Giarrusso (Apr 18 2023 at 20:39):

I was triaging the "infinite loop"/slow simpl from https://github.com/ejgallego/coq-lsp/issues/487 / https://coq.zulipchat.com/#narrow/stream/329642-coq-lsp/topic/Interrupting.20coq-lsp. cc @Alex Sanchez-Stern

I still wonder if it's exponential backtracking, but, it soon stops printing any debug output even with Set Debug "all" — does that _disprove_ backtracking.

I reproduced it on __coq-platform.2022.09.0~8.16.0~2022.09~beta1 via the code below; interrupting it (in vscoq) after it stops printing debug output gives exceptions like https://gist.github.com/Blaisorblade/702765e116aa36e7ca376119e3f79165.

(* https://github.com/ejgallego/coq-lsp/issues/487 *)
From compcert Require Import Coqlib.
From compcert Require Import Integers.
From compcert Require Import Values Memory.
From compcert Require Import Cminor CminorSel.
From compcert Require Import SelectOp.

Section CMCONSTR.
Variable ge: genv.
Variable sp: val.
Variable e: env.
Variable m: mem.
Definition unary_constructor_sound (cstr: expr -> expr) (sem: val -> val) : Prop :=
  forall le a x,
  eval_expr ge sp e m le a x ->
  exists v, eval_expr ge sp e m le (cstr a) v /\ Val.lessdef (sem x) v.
Lemma eval_mulimm_base:
  forall n, unary_constructor_sound (mulimm_base n) (fun x => Val.mul x (Vint n)).
induction n.
econstructor.
(* Set Debug "RAKAM,unification,tactic-unification". *)
Fail progress simpl.
Time simpl.
unfold mulimm_base.
Fail progress simpl.
Time simpl.
unfold Int.one_bits.
split.
Set Debug "RAKAM,unification,tactic-unification".
Test Debug.
Set Debug "all".
Test Debug.
simpl. (* loops, and stops printing anything. *)

view this post on Zulip Paolo Giarrusso (Apr 18 2023 at 20:59):

Actually, nevermind. The time and _output size_ of simpl on a subterm appears to blow up, get slow around 11, and is called on 32:

(* https://github.com/ejgallego/coq-lsp/issues/487 *)
From compcert Require Import Coqlib.
From compcert Require Import Integers.
From compcert Require Import Values Memory.
From compcert Require Import Cminor CminorSel.
From compcert Require Import SelectOp.


Section CMCONSTR.
Variable intval: Z.

Time Eval simpl in Zbits.Z_one_bits 0 intval 0.
Time Eval simpl in Zbits.Z_one_bits 1 intval 0.
Time Eval simpl in Zbits.Z_one_bits 2 intval 0.
Time Eval simpl in Zbits.Z_one_bits 3 intval 0.
Time Eval simpl in Zbits.Z_one_bits 4 intval 0.
Time Eval simpl in Zbits.Z_one_bits 5 intval 0.
Time Eval simpl in Zbits.Z_one_bits 6 intval 0.
Time Eval simpl in Zbits.Z_one_bits 7 intval 0.
Time Eval simpl in Zbits.Z_one_bits 8 intval 0. (* 0.1s *)
Time Eval simpl in Zbits.Z_one_bits 9 intval 0. (* 0.4s *)
Time Eval simpl in Zbits.Z_one_bits 10 intval 0. (* 1.64s *)

view this post on Zulip Notification Bot (Apr 18 2023 at 20:59):

Paolo Giarrusso has marked this topic as resolved.

view this post on Zulip Paolo Giarrusso (Apr 18 2023 at 21:00):

I found this after using debugging on cbn, and seeing that cbn was counting _very_ slowly 0 + 1, 0 + 1 + 1, 0 + 1 + 1 + 1 — then I started poking and got here.

Set Debug "RAKAM,unification,tactic-unification".
cbn.

view this post on Zulip Notification Bot (Apr 18 2023 at 21:01):

Paolo Giarrusso has marked this topic as unresolved.

view this post on Zulip Paolo Giarrusso (Apr 18 2023 at 21:01):

(unresolving in case Alex has follow-up questions)

view this post on Zulip Alex Sanchez-Stern (Apr 18 2023 at 21:10):

Ah okay, so it is just really slow. Good to know, thanks!

view this post on Zulip Alex Sanchez-Stern (Apr 18 2023 at 21:10):

I'll kill my 5.5 hour simpl run :)

view this post on Zulip Paolo Giarrusso (Apr 18 2023 at 21:13):

yeah different category of blow-up — explosion in open terms. Most fun is that Zbits.Z_one_bits 32 (Z.pow 2 32 - 1) 0. evaluates immediately, the problem is just because here you have an open term

view this post on Zulip Emilio Jesús Gallego Arias (Apr 18 2023 at 22:59):

Great that you generated a backtrace @Paolo Giarrusso , that should give us a hint on where to insert a polling point.

Note that this is a Coq upstream bug so it'd be quite useful if you folks could file a bug. Choosing the right polling point is tricky as not to impact performance a lot.

The good thing is that fixing this will likely help with some other places where Coq gets stuck in a loop that requires OS-level interrupt support

[Note that as of today, OS-level interrupt support is not working on Win and JS, so relying on it essentially is still a bit of a problem, that will be solved but it is a different kind of effort than just adding the right polling point in Coq)

view this post on Zulip Paolo Giarrusso (Apr 19 2023 at 01:42):

That backtrace is from interruption in vscoq1 on coq8.16, I didn't use a debugger

view this post on Zulip Paolo Giarrusso (Apr 19 2023 at 01:47):

feel free to use anything for a bug report, but I'm not sure what the bug should say (and gotta get back to other things)

view this post on Zulip Alex Sanchez-Stern (Apr 19 2023 at 14:51):

Hmm I'm not sure I understand the internals well enough to file a bug. If the bug is just "simpl is way slower than it should be for open terms", I can copy and paste some stuff from the discussion here into the bug report. But for the polling, I can't make heads or tails of that backtrace.

view this post on Zulip Paolo Giarrusso (Apr 19 2023 at 20:16):

if there's a bug it's about polling, and I'm skeptical — interruption worked in vscoq1

view this post on Zulip Paolo Giarrusso (Apr 19 2023 at 20:17):

and no, if anything simpl is faster than it should be (okay, not literally)

view this post on Zulip Paolo Giarrusso (Apr 19 2023 at 20:18):

vm_compute is the lower-bound, but its output size and time blows up the same way — just a bit faster

view this post on Zulip Alex Sanchez-Stern (Apr 19 2023 at 20:19):

Hmm, logically why should simplifying open terms be significantly slower than simplifying closed terms?

view this post on Zulip Alex Sanchez-Stern (Apr 19 2023 at 20:21):

Re polling, I think Emilio was saying there were multiple interrupt mechanisms, a OS level interrupt, and a gentler polling based interrupt. The former isn't compatible with all environments, but the latter needs polling inserted in more places. VSCoq1 uses the former, so presumably that wouldn't have worked on Windows or in JS-land.

view this post on Zulip Paolo Giarrusso (Apr 19 2023 at 20:24):

since the computation is stuck on intval, the output is just "all the computation left after the first point that's stuck, inlined". Here are the first few outputs:

Time Eval simpl in Zbits.Z_one_bits 0 intval 0.
Time Eval simpl in Zbits.Z_one_bits 1 intval 0.
Time Eval simpl in Zbits.Z_one_bits 2 intval 0.
(*
Outputs
     = nil
     : list Z
Finished transaction in 0. secs (0.u,0.s) (successful)
     = if Z.odd intval then 0 :: nil else nil
     : list Z
Finished transaction in 0. secs (0.u,0.s) (successful)
     = if Z.odd intval
       then 0 :: (if Z.odd (Z.div2 intval) then 1 :: nil else nil)
       else if Z.odd (Z.div2 intval) then 1 :: nil else nil
     : list Z
Finished transaction in 0. secs (0.u,0.s) (successful)
*)

view this post on Zulip Alex Sanchez-Stern (Apr 19 2023 at 20:29):

Ah I see, so it's actually unfolding exponentially more definitions, because it has to unfold both branches of the match instead of being able to compute and only unfold one. Makes sense.

view this post on Zulip Paolo Giarrusso (Apr 19 2023 at 20:37):

yep! This is even more common with vm_compute on open terms.

view this post on Zulip Emilio Jesús Gallego Arias (Apr 19 2023 at 21:07):

Yes, the problem is with polling for an interrupt request not working for this example.

view this post on Zulip Emilio Jesús Gallego Arias (Apr 19 2023 at 21:07):

It is good we have polling for interrupt working for now, hopefully we can remove it but will need some support from OCaml runtime

view this post on Zulip Emilio Jesús Gallego Arias (Jun 20 2023 at 18:39):

Folks as noted in the issue please submit all bad interruption examples you got so I can test the new solution.

view this post on Zulip Alex Sanchez-Stern (Jun 21 2023 at 13:49):

I think I only had the one, since in my main branch tactics get interrupted when they're taking too long. But I'll keep an eye out for other examples.


Last updated: Oct 13 2024 at 01:02 UTC