@Pierre-Marie Pédrot Is asking about why induction is so slow (https://github.com/coq/coq/pull/14174#discussion_r627670736) a good topic to add to the Coq call this week?
Oh, sorry I missed that ping. I can have a look quickly though.
Thanks!
It's not super clear that there is a single source of slowdown. It looks spread around, although mainly inside unification.
Do you happen to have a slightly more extreme example?
I tried nesting more records but it didn't get any slower.
It seems that the associativity of the records matters.
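(For concreteness, a minimal sketch contrasting the two nestings; right_nested and left_nested are illustrative names, not from the thread:)
(* Right-nested sigmas: each component depends on earlier binders directly. *)
Definition right_nested (A : Type) (P : A -> Type) (Q : forall a, P a -> Type) :=
  { a : A & { p : P a & Q a p } }.
(* Left-nested sigmas: the dependency has to go through projT1/projT2 of an
   inner pair, so the projections end up inside the types themselves. *)
Definition left_nested (A : Type) (P : A -> Type) (Q : forall a, P a -> Type) :=
  { ap : { a : A & P a } & Q (projT1 ap) (projT2 ap) }.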
@Pierre-Marie Pédrot Here's a slightly more extreme example (should work on Coq master too)
Set Implicit Arguments.
Definition ex_proj1 :=
fun {A : Prop} {P : A -> Prop} (x : exists y, P y) => match x with ex_intro _ a _ => a end.
Definition ex_proj2 :=
fun {A : Prop} {P : A -> Prop} (x : exists y, P y) => match x return P (ex_proj1 x) with ex_intro _ _ a => a end.
Axiom eq_sigT
: forall {A : Type} {P : A -> Type} (u v : {a : A & P a}) (p : projT1 u = projT1 v),
eq_rect (projT1 u) (fun a : A => P a) (projT2 u) (projT1 v) p = projT2 v -> u = v.
Axiom eq_sigT_rect_uncurried :
forall {A : Type} {P : A -> Type} {u v : {a : A & P a}} (Q : u = v -> Type),
(forall pq : exists p : projT1 u = projT1 v, eq_rect (projT1 u) (fun a : A => P a) (projT2 u) (projT1 v) p = projT2 v,
Q (eq_sigT u v (ex_proj1 pq) (ex_proj2 pq))) -> forall p : u = v, Q p.
Axiom ex_rect :
forall [A : Prop] [P : A -> Prop] (P0 : (exists y, P y) -> Type),
(forall (x : A) (p : P x), P0 (ex_intro P x p)) -> forall e : exists y, P y, P0 e.
Section inversion_sigma.
Unset Implicit Arguments.
Context A (B B' : A -> Prop) (C C' : forall a, B a -> Prop)
(D : forall a b, C a b -> Prop) (E : forall a b c, D a b c -> Prop)
(F : forall a b c d, E a b c d -> Prop) (G : forall a b c d e, F a b c d e -> Prop)
(H : forall a b c d e f, G a b c d e f -> Prop) (I : forall a b c d e f g, H a b c d e f g -> Prop).
Goal forall (x y : { a : A & { b : { b : B a & C a b } & { d : D a (projT1 b) (projT2 b) & { e : E _ _ _ d &
{ f : F _ _ _ _ e & { g : G _ _ _ _ _ f & { h : H _ _ _ _ _ _ g & I _ _ _ _ _ _ _ h } } } } } } })
(p : x = y), p = p.
Proof.
intros x y p; repeat match goal with H : sigT _ |- _ => destruct H end; cbn [projT1 projT2] in *.
Time induction p as [p] using (@eq_sigT_rect_uncurried _ _ _ _);
cbn beta iota
delta [proj1_sig proj2_sig proj3_sig projT1 projT2 projT3 ex_proj1 ex_proj2 sig_of_sig2 sigT_of_sigT2
sig_of_sigT sigT_of_sig sig2_of_sigT2 sigT2_of_sig2] in p.
Time induction p using ex_rect. (* Finished transaction in 2.99 secs (2.986u,0.003s) (successful) *)
Indeed associativity seems to matter, but you should be able to keep adding sigmas on the right here.
Strangely enough, if I do cbn beta match delta [projT1 projT2] before the second induction p, then the second induction gets much faster. Is the issue really that unification is slow because it spends a lot of time not realizing it needs to unfold projT1 and projT2?
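(As a sketch, assuming the proof state left by the first induction in the example above:)
(* Unfold the projections in the goal first... *)
cbn beta match delta [projT1 projT2].
(* ...and the second induction is now much faster than the 2.99s above. *)
Time induction p using ex_rect.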
So, there seems to be a problem with the conclusion.
The time is linear in the number of occurrences of p in the conclusion. If you replace p = p with a dummy predicate Blah (p, p, p, p ...), you can observe it easily.
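(A sketch of that experiment; Blah is a hypothetical axiom, not part of the example above:)
Axiom Blah : forall {T : Type}, T -> Prop.
(* In the Goal above, replace the conclusion [p = p] by [Blah (p, p, p, p)],
   [Blah (p, p, p, p, p, p)], etc.: the time of the induction call grows
   linearly with the number of occurrences of p. *)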
I still think it's kind of atrocious that the induction p as [p] using (@eq_sigT_rect_uncurried _ _ _ _) takes 0.2 seconds, especially since adding match type of p with @existT ?A ?P ?x ?y = @existT _ _ ?x' ?y' => set (Y := y) in *; set (Y' := y') in *; set (PV := P) in * end, which takes only 0.075s, drops the time of Time induction p as [p] using (@eq_sigT_rect_uncurried _ _ _ _). down to 0.005s, and then cbn [projT1 projT2] in *; induction p using ex_rect. takes only 0.006s rather than 0.157s.
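(The same trick laid out as a script, with the timings quoted above; Y, Y', and PV are just fresh names:)
(* Name the projected components so the terms induction must abstract are small. *)
match type of p with
| @existT ?A ?P ?x ?y = @existT _ _ ?x' ?y' =>
  set (Y := y) in *; set (Y' := y') in *; set (PV := P) in *
end. (* ~0.075s *)
Time induction p as [p] using (@eq_sigT_rect_uncurried _ _ _ _). (* ~0.005s *)
cbn [projT1 projT2] in *; induction p using ex_rect. (* ~0.006s instead of 0.157s *)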
Yeah, I think that this is slow because of second-order induction.
Induction is trying to find your subterm in the conclusion to abstract it away.
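(The abstraction step can be watched in isolation with the pattern tactic; a sketch, in the proof state just before the first induction:)
(* [pattern p] crafts a motive by abstracting the goal over p,
   turning [p = p] into [(fun p0 => p0 = p0) p]. *)
Time pattern p.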
What do you mean by second-order induction?
^this
it tries to craft a predicate
so for this it needs to abstract the stuff you're inducting on
the pattern here is quite a big term
if you know upfront what this is going to be you should pass it manually
and here you do because you've just substituted a variable
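(A sketch of passing it manually in this example, where the motive is known to be fun p => p = p:)
revert p.
(* Supply Q explicitly so unification has nothing left to abstract. *)
refine (@eq_sigT_rect_uncurried _ _ _ _ (fun p => p = p) _).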
In general I don't, because I'm writing a general tactic to handle any equality of the form existT _ _ _ = existT _ _ _ which may be mentioned anywhere in the context and the goal.
But I'm a bit confused. It seems to me the unification problem should have a fast solution. If instead of invoking induction I do revert p and then Time refine (@eq_sigT_rect_uncurried _ _ _ _ _ _)., this still takes 0.133s, but the unification problem is unifying something of the shape forall p : ?u = ?v, ?Q p with the goal, which has exactly the same shape modulo beta. Because p is not in the context of ?Q, this falls in the pattern fragment, and solving this should be as fast as just copying the goal. (So long as the goal syntactically has the form forall p : @eq (@sigT _ _) _ _, _, the instantiation is well-typed.) But Time match goal with |- ?G => pose G end. is instantaneous, and so is cbv beta, so this seems much slower than copying the goal.
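(The comparison consolidated as a sketch, after revert p, with the timings quoted above:)
Time match goal with |- ?G => pose G end. (* instantaneous: copying the goal is cheap *)
Time cbv beta. (* instantaneous *)
Time refine (@eq_sigT_rect_uncurried _ _ _ _ _ _). (* ~0.133s *)
(* The pattern-fragment solution is just the goal body under a lambda:
   ?u := the left-hand existT term, ?v := the right-hand one, ?Q := fun p => p = p. *)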
I am not sure how useful this is, but Set Debug Tactic Unification showed more failed unification attempts than my emacs buffer could handle. apply: (ex_rect _ _ p) (using Unicoq, because evarconv cannot infer the predicate) doesn't show any failed unification whatsoever and takes a bit less than half the time that induction takes.
(It's probably not doing the same thing at all... I was just curious about what unification was doing here.)
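(For anyone wanting to reproduce: Set Debug Tactic Unification is the vernacular flag in question; as a sketch,)
Set Debug Tactic Unification.
Time induction p using ex_rect. (* prints every attempted unification, including the failures *)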
@Janno That is interesting, but I'm not sure what to make of it. My claim is that both calls to induction should be instantaneous, and hence it sounds like even the Unicoq one is too slow. My reasoning is that, so long as the hypothesis I'm inducting on is not a section variable mentioned in the body of some definition used in the context or the goal (i.e., so long as revert dependent p succeeds), then both calls to induction are guaranteed to succeed and I can, at least in theory, craft the partial proof term and the resulting goal and context without ever invoking unification. So unification should not be doing anything hard.
I agree... it is a bit baffling. And even the Unicoq variant is definitely too slow. (I think Unicoq spends a big chunk of its time instantiating evars but I am not sure about that.)
I've made a parameterized version at https://github.com/coq/coq/issues/14301#issuecomment-837653213 where you can fill in the nat argument to make refine as slow as you'd like. The refine with holes is over 100x slower than the one where I solve the unification problem manually, despite the fact that the entire problem is in the pattern fragment.
I suspect we don't have a shortcut for the case where what we're abstracting is actually a local variable...
Hm, but it seems that the slowness is not pattern but apply (and to a lesser extent beta-iota normalization of intermediate and remaining goals)?