@Pierre-Marie Pédrot Is asking about why induction is so slow (https://github.com/coq/coq/pull/14174#discussion_r627670736) a good topic to add to the Coq call this week?

Oh, sorry I missed that ping. I can have a look quickly though.

Thanks!

It's not super clear that there is a single source of slowdown. It looks spread around, although mainly inside unification.

Do you happen to have a slightly more extreme example?

I tried nesting more records but I didn't get slower.

It seems that the associativity of the records matter.

@Pierre-Marie Pédrot Here's a slightly more extreme example (should work on Coq master too)

```
Set Implicit Arguments.
Definition ex_proj1 :=
fun {A : Prop} {P : A -> Prop} (x : exists y, P y) => match x with ex_intro _ a _ => a end.
Definition ex_proj2 :=
fun {A : Prop} {P : A -> Prop} (x : exists y, P y) => match x return P (ex_proj1 x) with ex_intro _ _ a => a end.
Axiom eq_sigT
: forall {A : Type} {P : A -> Type} (u v : {a : A & P a}) (p : projT1 u = projT1 v),
eq_rect (projT1 u) (fun a : A => P a) (projT2 u) (projT1 v) p = projT2 v -> u = v.
Axiom eq_sigT_rect_uncurried :
forall {A : Type} {P : A -> Type} {u v : {a : A & P a}} (Q : u = v -> Type),
(forall pq : exists p : projT1 u = projT1 v, eq_rect (projT1 u) (fun a : A => P a) (projT2 u) (projT1 v) p = projT2 v,
Q (eq_sigT u v (ex_proj1 pq) (ex_proj2 pq))) -> forall p : u = v, Q p.
Axiom ex_rect :
forall [A : Prop] [P : A -> Prop] (P0 : (exists y, P y) -> Type),
(forall (x : A) (p : P x), P0 (ex_intro P x p)) -> forall e : exists y, P y, P0 e.
Section inversion_sigma.
Unset Implicit Arguments.
Context A (B B' : A -> Prop) (C C' : forall a, B a -> Prop)
(D : forall a b, C a b -> Prop) (E : forall a b c, D a b c -> Prop)
(F : forall a b c d, E a b c d -> Prop) (G : forall a b c d e, F a b c d e -> Prop)
(H : forall a b c d e f, G a b c d e f -> Prop) (I : forall a b c d e f g, H a b c d e f g -> Prop).
Goal forall (x y : { a : A & { b : { b : B a & C a b } & { d : D a (projT1 b) (projT2 b) & { e : E _ _ _ d &
{ f : F _ _ _ _ e & { g : G _ _ _ _ _ f & { h : H _ _ _ _ _ _ g & I _ _ _ _ _ _ _ h } } } } } } })
(p : x = y), p = p.
Proof.
intros x y p; repeat match goal with H : sigT _ |- _ => destruct H end; cbn [projT1 projT2] in *.
Time induction p as [p] using (@eq_sigT_rect_uncurried _ _ _ _);
cbn beta iota
delta [proj1_sig proj2_sig proj3_sig projT1 projT2 projT3 ex_proj1 ex_proj2 sig_of_sig2 sigT_of_sigT2
sig_of_sigT sigT_of_sig sig2_of_sigT2 sigT2_of_sig2] in p.
Time induction p using ex_rect. (* Finished transaction in 2.99 secs (2.986u,0.003s) (successful) *)
```

Indeed associativity seems to matter, but you should be able to keep adding sigmas on the right here.

Strangely enough, if I do `cbn beta match delta [projT1 projT2]`

before the second `induction p`

, then the second `induction`

gets much faster. Is the issue really that unification is slow because it spends a lot of time not realizing it needs to unfold `projT1`

and `projT2`

?

So, there seems to be a problem with the conclusion.

The time is linear in the number of occurrences of `p`

in the conclusion.

If you replace `p = p`

with a dummy predicate `Blah (p, p, p, p ...)`

you can observe it easily.

I still think it's kind-of atrocious that the `induction p as [p] using (@eq_sigT_rect_uncurried _ _ _ _)`

takes .2 seconds, especially since adding `match type of p with @existT ?A ?P ?x ?y=@existT _ _ ?x' ?y' => set (Y := y) in *; set (Y' := y') in *; set (PV := P) in * end`

, which takes only 0.075s, drops the time of `Time induction p as [p] using (@eq_sigT_rect_uncurried _ _ _ _).`

down to 0.005s, and then `cbn [projT1 projT2] in *; induction p using ex_rect.`

takes only 0.006s rather than 0.157s

Yeah, I think that this is slow because of second-order induction.

Induction is trying to find your subterm in the conclusion to abstract it away.

What do you mean by second-order induction?

^this

it tries to craft a predicate

so for this it needs to abstract the stuff you're inducting on

the pattern here is a quite big term

if you know upfront what this is going to be you should pass it manually

and here you do because you've just substituted a variable

In general I don't, because I'm writing a general tactic to handle any equality of the form `existT _ _ _ = existT _ _ _`

which may be mentioned anywhere in the context and the goal

But I'm a bit confused. It seems to me the unification problem should have a fast solution

If instead of invoking `induction`

I do `revert p`

and then `Time refine (@eq_sigT_rect_uncurried _ _ _ _ _ _).`

, this still takes 0.133s, but the unification problem is unifying something of the shape `forall p : ?u = ?v, ?Q p`

with the goal, which has exactly the same shape modulo beta.

Because `p`

is not in the context of `?Q`

, this falls in the pattern fragment, and solving this should be as fast as just copying the goal. (So long as the goal syntactically has the form `forall p : @eq (@sigT _ _) _ _, _`

, the instantiation is well-typed.) But `Time match goal with |- ?G => pose G end.`

is instantaneous, and so is `cbv beta`

, so this seems much slower than copying the goal.

I am not sure how useful this is but `Set Debug Tactic Unification`

showed more failed unification attempts than my emacs buffer could handle. `apply: (ex_rect _ _ p)`

(using Unicoq because evarconv cannot infer the predicate) doesn't show any failed unification whatsoever and takes a bit less than half the time that `induction`

takes.

(It's probably not doing the same thing at all.. I was just curious about what unification was doing here.)

@Janno That is interesting, but I'm not sure what to make of it. My claim is that both calls to `induction`

should be instantaneous, and hence it sounds like even the Unicoq one is too slow. My reasoning is that, so long as the hypothesis I'm inducting on is not a section variable mentioned in the body of some definition used in the context or the goal (i.e., so long as `revert dependent p`

succeeds), then both calls to `induction`

are guaranteed to succeed and I can, at least in theory, craft the partial proof term and the resulting goal and context without ever invoking unification. So unification should not be doing anything hard.

I agree.. it is a bit baffling. And even the unicoq variant is definitely too slow. (I think unicoq spends a big chunk of its time instantiating evars but I am not sure about that.)

I've made a parameterized version at https://github.com/coq/coq/issues/14301#issuecomment-837653213 where you can fill in the nat argument to make `refine`

as slow as you'd like. The `refine`

with holes is over 100x slower than the one where I solve the unification problem manually, despite the fact that the entire problem is in the pattern fragment

I suspect we don't have a shortcut for the case what we're abstracting is actually a local variable...

Hm, but it seems that the slowness is not `pattern`

but `apply`

(and to a lesser extent beta-iota normalization of intermediate and remaining goals)?

Last updated: Oct 13 2024 at 01:02 UTC