I'm currently working through Programming Language Foundations in Software Foundations.
One thing I have a lot of trouble with is when to make sure to delete hypotheses from the environment that will give me unusable induction hypotheses when using the induction
tactic.
One example I just ran into is proving hoare_repeat
from the Hoare chapter.
As far as I can tell, It appears that the key to proving this is to make sure you clean up the environment prior to using induction
:
...
clear Hp Hfi Hc H1.
induction H5.
...
If you fail to clear the usused hypotheses, you get an induction hypothesis that isn't usable (cleaned up a little to make it more obvious what is going on):
st, st', st'', st''' : state
Hrepc : st =[ repeat c until b end ]=> st'''
Hc : st =[ c ]=> st'
Hrepc'1 : st' =[ c ]=> st''
H : beval st'' b = false
IHHrepc'2 : st =[ repeat c until b end ]=> st''' >
st =[ c ]=> st'' >
beval st'' b = false >
Q st'' >
Q st''' /\ beval st''' b = true
Note the second argument of the induction hypothesis (IHHrepc'2
), st =[ c ]=> st''
, which is not in my environment and I have no way of producing.
If I do clear the unused hypotheses from the environment before using induction
, then the induction hypothesis is usable:
H : beval st'' b = false
IHHrepc'2 : beval st'' b = false >
Q st'' >
Q st''' /\ beval st''' b = true
Note that the induction hypothesis no longer has an argument that I'm not able to produce.
I wasn't able to figure out the proof of this function myself. I'm aware of the general idea of using things like clear
and generalize dependent
to give an induction hypothesis that is usable, but it is often quite difficult to figure out on my own exactly what I need to do.
Are there any hints or rulesofthumb for when hypotheses need to be cleared from the environment in order to make progress in a proof? Is there a canonical simple example for when this needs to be done (the hoare_repeat
example is quite involved)?
My personal 2 cents on this, coming from large proofs on complicated ASTs:
1) Clear hypothesis as early as possible: this way, your goal is simpler, cleaner, and you run less often into troubles such as the one you have, where your useless hypothesis interferes in your induction. This typically means that once you have used inversion
or friends on an hypothesis H
, immediately clear H
– all its content is distributed into other hypothesis, so H
is now useless, and will only clutter your goal.
2) Nested inductions are often a source of brittleness and complexity. An easy way to avoid your issue is to have only one induction per proof. If you need a second, nested one, state it as an auxiliary lemma that you can apply. This way, the proof of the lemma is cleaner, because it contains exactly the required hypotheses. It is also likely that this way you think about getting a nice, general lemma, that you can reuse elsewhere.
3) It’s better to be too general than not enough: you cannot make your goal unprovable by being too general. So don’t be afraid to generalize your induction hypothesis a bit too much. In particular, the induction foo in H1, H2, …, Hn  *
pattern is imho _very_ useful, and you can easily go back and add/remove generalized variables if you need to.
@Meven LennonBertrand Thanks a lot. This is exactly the type of response I was hoping for.
I understand your suggestions (1) and (2). While I don't always do them, I can definitely seen that they are good suggestions and I should be much more liberal with following them. If you (or anyone else) had any other rules of thumb similar to this, I'd be really interested.
As for (3), what is the pattern induction foo in H1, H2, …, Hn  *
? You wouldn't happen to know if this form is introduced somewhere in Software Foundations? If not, I'll have to go look it up (instead of just waiting for it be introduced...)
State intermediate theorems in a way that induction
is the first tactic. This forces you to exactly state the induction hypothesis. Often, once you have the right induction hypothesis, the rest of the proof becomes merely a matter of enumerating hypotheses and theorems to find the next one that applies.
induction in ...  *
is not introduced in Software Foundations. There is a brief mention of it in the reference manual https://coq.inria.fr/distrib/current/refman/proofs/writingproofs/reasoninginductives.html#induction
Thanks, it looks like the in FOO  *
form is explained a little bit here: https://coq.inria.fr/distrib/current/refman/proofengine/tactics.html#grammartokengoal_occurrences
Although as a beginner I don't really understand how I can actually use it from this explanation.
The Coq manual isn't challenging just for beginners — there is an example of "induction with occurrences", but it doesn't seem very useful? Trial and error led me to this example, where induction
clears the hypotheses that were not named:
Lemma induction_test2 : forall n:nat, n = n > n  n = 0 > n <= n.
Proof.
intros n Heq Hdiff.
induction n in Heq  *.
 constructor.

I am looking for simple but not too artificial examples, where induction ... in ...
could be useful.
Here is an example where n
doesn't occur in conclusion.
Goal forall n x, n + x = n > x < 1.
intros n x H; induction n in H .
 cbn in H; subst; left.
 apply IHn; cbn in H; now injection H.
Qed.
I played around with this example, but I'm not sure what the induction n in H 
buys me over just using induction n
:
Goal forall n x, n + x = n > x < 1.
intros n x H. induction n.
 cbn in H; subst; left.
 apply IHn; cbn in H; now injection H.
Qed.
Just induction n
also seems to work fine for me. Maybe the in H 
part is giving some info to the reader about the induction?
induction ... in ...  *
clears the hypotheses that were not named
Ah, I guess this makes sense. It appears to be a way to keep only certain hypotheses, without having to explicitly clear
all the ones you don't want?
Ah, sorry, I should have been explicit : induction n in H1, … Hn  *
is roughly equivalent to revert H1, … Hn ; induction n ; intros H1 … Hn
. So it gives you an induction hypothesis that has been generalized over the hypothesis, but without having to do explicit reverting/introductions, and name fiddling. The nice part is that it is relatively easy to go back to that point and add/remove the hypothesis you want to generalize, without impacting the rest of the proof script.
It is mainly useful if amongst H1 … Hn
you have other variables, and hypothesis about those. For instance, say you want to prove some theorem forall m n : nat, P m > Q n > R m n
. You want to do this by induction over n
, but you want to generalize over m
, and keep the hypothesis around. Then you can do
intros m n p q.
induction n in m, p, q  *.
which 1) gives you the hypothesis P 0
in the first branch and P (S n)
in the second
2) gives you an induction hypothesis which is I think the natural one (it takes any m
such that P m
)
And if some time later you realize the hypothesis P n
is not needed, you can just go back, remove it in the statement of the theorem or the in
clause, and if you’re lucky (but in my experience it’s often the case), then you can keep the rest of the script as is.
@Meven LennonBertrand Thanks a really great explanation, thanks.
Last updated: Nov 29 2023 at 22:01 UTC