Stream: Coq devs & plugin devs

Topic: native_compute on the bench


view this post on Zulip Jason Gross (Feb 02 2021 at 22:22):

(cross-posting from https://github.com/coq/coq/issues/13807) How should a Coq bench package go about declaring a dependency on native_compute? (Currently it's unconditionally disabled and bench packages that require native_compute to function just fail / take forever)

view this post on Zulip Emilio Jesús Gallego Arias (Feb 02 2021 at 23:18):

See the discussion in the old bug tracker, I think that'd require significant work in the bench infra if we mean to have numbers that do matter in many cases, as you need a way to filter the time spent in the OCaml compiler when measuring .vo file compilation

view this post on Zulip Jason Gross (Feb 03 2021 at 00:51):

Can't we have support for ondemand native compilation? I have a bench job that stack overflows in vm_compute but not native_compute, and fiat-crypto has a couple of tests that are much, much, much faster in native_compute (ondemand) than with vm_compute.

view this post on Zulip Guillaume Melquiond (Feb 03 2021 at 06:56):

Jason Gross said:

I have a bench job that stack overflows in vm_compute but not native_compute

That is a bit surprising, because the VM machinery itself does not use the stack (there is no function call, just a massive loop). The compilation of VM terms uses the stack, but nowhere near what the OCaml compiler uses. So, the issue would be when decompiling terms, but that could be the case only if the executed terms were not reducible in the first place.

view this post on Zulip Pierre-Marie Pédrot (Feb 03 2021 at 06:57):

Reification is not tailrec, so it can (and will) overflow on big terms.

view this post on Zulip Guillaume Melquiond (Feb 03 2021 at 07:00):

Yes, but that is the same whether you use vm_compute or native_compute

view this post on Zulip Guillaume Melquiond (Feb 03 2021 at 07:04):

I cannot think of a situation where vm_compute would use more stack than native_compute.

view this post on Zulip Pierre-Marie Pédrot (Feb 03 2021 at 07:12):

The reification code is not exactly the same though, it might be enough to barely overflow or not the stack?

view this post on Zulip Guillaume Melquiond (Feb 03 2021 at 07:26):

I suppose so. nf_whd is taking one more argument than nf_atom, so it could make a small difference.

view this post on Zulip Jason Gross (Feb 03 2021 at 12:45):

Indeed, native_compute will stack overflow on just slightly larger terms than vm_compute (vm_compute stack overflows when my input size parameter for this particular bench is larger than about 110, native_compute when the input size parameter is larger than about 130 (see https://github.com/coq-community/coq-performance-tests/commit/ae8385b9471409387d0f47f01e38b866ba70bda1), and the size of both the input and the output is quadratic in the input size parameter))

view this post on Zulip Jason Gross (Feb 03 2021 at 13:03):

Is it possible the stack overflow occurs in handling the number of binders? Here's the small test-case for native_compute handling more than vm_compute:

Require Import Coq.ZArith.ZArith Coq.Lists.List.
Axiom Let_InAx : forall {A P} (x : A) (f : forall a : A, P a), P x.
Axiom ZAddAx : Z -> Z -> Z.

Definition map_double_cps {T} (ls : list Z) (k : list Z -> T) :=
  list_rect
    (fun _ => (list Z -> T) -> _)
    (fun k => k nil)
    (fun x xs rec k
     => Let_InAx (ZAddAx x x)
                 (fun y =>
                    rec (fun ys => k (y :: ys))))
    ls
    k.

Definition make_cps {T} (n : nat) (m : nat) (v : Z) (k : list Z -> T)
  := nat_rect
       _
       (fun k => k (List.repeat v n))
       (fun _ rec k => rec (fun ls => map_double_cps ls k))
       m
              k.
Notation goal_cps n m := (forall v, make_cps n m v (fun x => x) = nil) (only parsing).
Notation goal_red_cps n m := (ltac:(let v := eval cbv [make_cps map_double_cps] in (goal_cps n m) in exact v))
                               (only parsing).
Notation hide := (_ = _).
(* native stack overflows with 186, succeeds with 185 *)
(* vm     stack overflows with 168, succeeds with 167 *)
Goal goal_cps 168 168. cbv [make_cps map_double_cps]; intros. Time vm_compute.

view this post on Zulip Jason Gross (Feb 03 2021 at 13:07):

Interestingly, running Time Definition v := Eval native_compute in goal_red_cps x y. has different behavior, stack overflowing for 161 in both native_compute and vm_compute, and succeeding fine for 160 for both reduction strategies.


Last updated: Dec 05 2023 at 11:01 UTC