Stream: Miscellaneous

Topic: Citation for propositional vs setoid equality


view this post on Zulip Evgeniy Moiseenko (Dec 07 2021 at 08:25):

I'm writing a paper and I want to claim that rewriting with propositional equality is more performant (in terms of type-checking time) than rewriting with setoid equality. Is that even true, and if so what paper can I cite to support this claim?
Also, I would be happy to include any other arguments advocating for usage of prop. equality (whenever that possible).

view this post on Zulip Karl Palmskog (Dec 07 2021 at 09:05):

there are very few performance analyses of ITP related techniques. Your best bet might be Jason Gross's PhD thesis, but I'm not sure he covers this particular case. From my recollection, his general conclusion is that all forms of rewriting are slow except possibly when implemented by reflection.

view this post on Zulip Karl Palmskog (Dec 07 2021 at 09:23):

ping @Jason Gross to check if this is an accurate summary. Basically I think setoid-vs-other-rewrite differences are dwarfed by reflective-vs-non-reflective.

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 09:30):

If by setoid you mean the stuff provided by the setoid_rewrite tactic, then yes, it is in general way slower than propositional rewriting, although I don't have a paper to quote.

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 09:30):

There are at least two distinct sources of slowness: 1. enlarged proof size and 2. typeclass proof search.

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 09:31):

from my personal experience, 2. is by far the major responsible for this slowness

view this post on Zulip Paolo Giarrusso (Dec 07 2021 at 09:31):

And that also covers the normal rewrite and ssreflect's rewrite tactics, when using setoid rewriting, no?

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 09:31):

yes, but this is ambiguous (as you mention, it's context-dependent)

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 09:32):

my point of comparison was with a theory which would have had a native setoid equality à la SetoidTT

view this post on Zulip Jason Gross (Dec 07 2021 at 12:49):

@Pierre-Marie Pédrot You are missing 3. evar instance allocation, which often comes been 1 and 2 in performance. (In fact evar allocation is responsible for much slowness in non-ssr-rewrite, not just in setoid rewrite).

view this post on Zulip Jason Gross (Dec 07 2021 at 12:52):

@Karl Palmskog your summary is accurate

view this post on Zulip Jason Gross (Dec 07 2021 at 12:59):

@Evgeniy Moiseenko you want chapter 4 of my thesis, particularly figure 4-4(a)--(c). On the following 2.5 pages there's the bit titled "Performance Bottlenecks of Proof-Producing Rewriting", which includes my analysis. You may also be interested in the performance graphs of the readme of https://github.com/coq-community/coq-performance-tests

view this post on Zulip Jason Gross (Dec 07 2021 at 12:59):

(anything including "rewrite")

view this post on Zulip Jason Gross (Dec 07 2021 at 13:03):

For example,. rewrite_repeated_app_autorewrite_noop shows that even when there is no actually rewriting to be done, prior to Coq 8.15, only ssr rewrite achieves linear performance of rewrite in the size of the goal; everything else is at least quadratic due to evars. This is fixed in master / 8.15 ( @Pierre-Marie Pédrot was this deliberate, or a side-effect of some other performance improvement you (or someone else?) made?)

view this post on Zulip Jason Gross (Dec 07 2021 at 13:05):

Some of the plots measure the time it takes to run abstract rewrite as well as rewrite, and the difference between these is a good estimate of how big of an impact the proof term size / complexity has.

view this post on Zulip Karl Palmskog (Dec 07 2021 at 13:06):

so ssr rewrite doesn't do any evar instance allocation?

view this post on Zulip Jason Gross (Dec 07 2021 at 13:06):

ssr rewrite does matching without evar allocation, I believe

view this post on Zulip Jason Gross (Dec 07 2021 at 13:09):

And for terms with sufficiently many head-symbol matches without any actual matches, this was the bottleneck in non-ssr-rewrite

view this post on Zulip Jason Gross (Dec 07 2021 at 13:11):

Following up on my previous list of examples: For example, rewrite_repeated_app_fast_rewrite_ltac2 shows that even with a custom proof term that is linear in size in the number of rewrites plus the size of the proof term (something that no other rewriting algorithm achieves, AFAIK), typechecking the proof term is still quadratic.

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:14):

@Jason Gross I did a bit of tinkering around evar allocation and rewriting, the linearization is probably a consequence of that.

view this post on Zulip Jason Gross (Dec 07 2021 at 13:14):

Also, @Pierre-Marie Pédrot , typeclass insurance search is only a severe bottleneck when the typeclass database is poorly tuned. Granted, it's so hard to get the right instances at the right priorities that I've only ever bothered to fully optimize the instances for performance testing, and in every real-world case I've seen where I haven't spent a couple hours performance-optimizing it for some benchmark, instance search is a significant bottleneck. (Do you have any stats on the extent to which typeclass insurance search is itself bottlenecked on evar instance allocation/manipulation?)

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:15):

The stdlib setoid rewriting database is terrible. I've been looking a bit at it and it has funny exponential blowups

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:16):

any non-trivial rewrite I've ever seen is just monstruous

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:16):

Funny thing is that I was thinking about this precisely this weekend

view this post on Zulip Jason Gross (Dec 07 2021 at 13:16):

(I should clarify that by "fully optimize" I mean "add every single insurance required to ensure that every tc resolution problem is solved in one step")

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:16):

and I wonder why we don't have a canonical-structure-based setoid rewriting

view this post on Zulip Jason Gross (Dec 07 2021 at 13:17):

Also @Pierre-Marie Pédrot we should probably close the corresponding issue claiming that autorewrite is super linear even when there are no instances to rewrite (unless you already closed it?)

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:17):

(I mean, it's not that hard to implement but I wonder if there would be users.)

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:18):

@Jason Gross please do it if you find it, I always have trouble skimming through the ocean of issues

view this post on Zulip Karl Palmskog (Dec 07 2021 at 13:18):

I would probably be a user, since I've stayed away from setoid rewriting mainly due to the unpredictability of typeclass resolution

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:19):

Note also that evar blowup is problematic only when you start to have big contexts, whereas poor tc resolution bites you immediately.

view this post on Zulip Jason Gross (Dec 07 2021 at 13:19):

Is ssr rewrite not already based on canonical structure matching? Would CS rewrite have any benefit over ssr rewrite? (Would it allow extending ssr rewrite to run under binders?)

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:20):

AFAIK ssr rewrite uses setoid rewriting under the hood? The only difference is the pattern algorithm which is reputed to be better.

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:21):

Or maybe just the non-setoid one from Equality.

view this post on Zulip Paolo Giarrusso (Dec 07 2021 at 13:21):

A big problem in TC search should be fixed by Matthieu’s MR to change Reflexive’s mode.

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:21):

But yeah, the ssr rewrite is mostly about picking patterns, not so much about anything else (and in any case setoid or CS is orthogonal to that)

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:22):

@Paolo Giarrusso this is true, but logic programming is always there to try to hit you in the head when you're not aware

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:22):

CS-based rewriting is the best of both worlds

view this post on Zulip Paolo Giarrusso (Dec 07 2021 at 13:23):

Logic programming does not require backtracking, ask @Robbert Krebbers.

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:23):

Well, then it's not quite LP anymore...

view this post on Zulip Paolo Giarrusso (Dec 07 2021 at 13:24):

Syntax-directed proof search is still proof search?

view this post on Zulip Paolo Giarrusso (Dec 07 2021 at 13:24):

I've also heard Robbert claim that one reason (the reason?) Iris uses typeclasses is because it definitely needs setoids. (Lots of setoids).

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:24):

When the search is deterministic I don't think you can call that a search. It's called pattern-matching.

view this post on Zulip Jason Gross (Dec 07 2021 at 13:25):

(the autorewrite issue has been closed: https://github.com/coq/coq/issues/12600)

view this post on Zulip Jason Gross (Dec 07 2021 at 13:25):

(it looks like #14253 closed it in June)

view this post on Zulip Paolo Giarrusso (Dec 07 2021 at 13:26):

Open-world pattern matching? Maybe. Regardless of how we call it, Iris Proof Mode uses typeclasses but mostly (or fully?) without backtracking (again, ask Robbert for the details).

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:27):

OK, so it'd be a good candidate for CS setoid rewriting.

view this post on Zulip Karl Palmskog (Dec 07 2021 at 13:29):

having a project with both an extensive CS packaged class hierarchy and then also going full-on typeclasses and TC setoid rewriting... I tip my hat to that, but very few projects can set it up.

view this post on Zulip Paolo Giarrusso (Dec 07 2021 at 13:29):

To clarify, the IPM interface doesn't use rewrite, but the implementation (and the ecosystem) does.

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:29):

(@Jason Gross incidentally, as discussed in the issue, the source of slowness is not evar allocation but [something related to] pattern selection)

view this post on Zulip Jason Gross (Dec 07 2021 at 13:30):

I'd be interested in CS-based rewriting. In particular if it can handle https://github.com/coq-community/coq-performance-tests/blob/master/src/fiat_crypto_via_setoid_rewrite_standalone.v better than the currently-required dozen or so calls to rewite_strat, that would be a very significant improvement in the feasibility of using rewrite at scale

view this post on Zulip Jason Gross (Dec 07 2021 at 13:31):

Oh, it's not evar allocation for pattern selection? I guess I misremembered.

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:35):

It's not hard to misremember, given that we have that many quadractic-cubic-quartic algorithms spread around. When it's not the famous one that kills you, you're maimed by another one.

view this post on Zulip Evgeniy Moiseenko (Dec 07 2021 at 13:37):

@Jason Gross thank you for pointing out a reference, this is exactly what I was looking for.
Also many thanks to others for this discussion.

view this post on Zulip Jason Gross (Dec 07 2021 at 13:42):

@Evgeniy Moiseenko glad to help (and thanks @Karl Palmskog for tagging me, I would have missed this otherwise)

view this post on Zulip Jason Gross (Dec 07 2021 at 13:44):

@Pierre-Marie Pédrot is there any hope of eliminating all of these algorithms? (I have a stalled paper in the works that lays out the proof engine primitives that need to have good enough performance (I guess I need to add "constant time is_ground"), and calls for a proof engine that achieves all of these)

view this post on Zulip Pierre-Marie Pédrot (Dec 07 2021 at 13:45):

don't hold your breath, but there is still a bit of hope though

view this post on Zulip Paolo Giarrusso (Dec 07 2021 at 17:08):

Should we all just switch to Jason's technique? I've heard that might be difficult unless your rewrite rules are just right, but I haven't looked any close...

view this post on Zulip Jason Gross (Dec 07 2021 at 17:40):

The reason I haven't released a non-dev version of the rewriter I built is that (a) it's way too restrictive for everyday use (it basically doesn't support dependent types at all, and even container types like list and option have to be hard-coded by hand), and (b) the proofs are kind of unmaintainable. I think there are a couple insights to be had:
1 (already known). Reflective rewriting is necessary for efficiency. (I do have an (unpunished, mostly just on paper) variant of the rewrite_strat algorithm that, by careful accounting, ought to generate proof terms linear the sum of the number of rewrites and the size of the term, even in the presence of binders, but I have yet to find any way to get Coq to typecheck such a term in linear time (due to quadratic substitutions when typechecking nested let's/lambdas).). Hence it would be quite useful to have some out-of-the-box-usable axiom-free reflective rewriting package. I'd bet that something like MetaCoq would be a good starting point for what to base this on.

  1. The best rewrite order when binders is involved is the reduction order of NbE. rewrite_strat should support this order, and any new reflective rewriter should have this support as well
  2. By paying an upfront cost at hint-database-creation time, reflective rewriters can be specialized to the task at hand, and this can net performance benefits (around 8x in my examples, IIRC)
  3. Issues of denotation-function-completeness (e.g., not having support for particular inductive types or for a limited number of universes) can be solved by having the tactic automatically emit some definitions (ideally some inductive types as well) and some light-weight proofs, and parameterizing the overall denotation function and its proofs over such parameters

view this post on Zulip Karl Palmskog (Dec 07 2021 at 17:47):

one aspect I've wondered about: even though the HOL family has only very limited reflection (in Isabelle), they seem to do quite well performance-wise with the rewriting they have. So it seems like a harsh tradeoff to skip dependent types to get good rewriting performance in Coq.

view this post on Zulip Jason Gross (Dec 07 2021 at 18:14):

@Karl Palmskog I think dependent types are a red herring. As I understand it, when they use rewriting on terms with sufficiently many binders, they run into the same asymptotic performance issues that are present in Coq. Furthermore, most of the bottlenecks in Coq's rewriting engine have very little to do with dependent types (see the aforementioned typeclass search, closedness-checking, and evar instance allocation bottlenecks). And dependent types are hard to manage in reflection, AIIUI, sort-of only because the expressive power of your metatheory needs to be strictly greater than the expressive power of your object theory, if you want to handle everything.

view this post on Zulip Paolo Giarrusso (Dec 07 2021 at 20:54):

@Karl Palmskog term size for HOL is 0, that's a major speedup they get from not having universes :smiling_imp:

view this post on Zulip Paolo Giarrusso (Dec 07 2021 at 20:55):

or fixpoints, ahem

view this post on Zulip Paolo Giarrusso (Dec 07 2021 at 20:56):

(Isabelle _could_ produce proof terms, but Coq _can't_ skip them AFAIK, simply because Coq typechecking is not compositional, due at least to universes and guardedness checking)

view this post on Zulip Karl Palmskog (Dec 07 2021 at 20:58):

yes, proof terms are seldom recorded in HOL workflows, but they are there if you want them

view this post on Zulip Paolo Giarrusso (Dec 07 2021 at 21:01):

my point is that it's a significant performance boost, which Coq only has when using definitional equality

view this post on Zulip Jason Gross (Dec 07 2021 at 21:40):

No, I don't think this is such a big performance boost. There are some developments where the bottleneck is Qed-time (but even then, the most common bottleneck is either universe checking or conversion being slow because simpl doesn't insert enough information in the proof term, and neither of these is really about leaving behind proof terms). But most of the time, there's not much performance difference between running abstract <your tactic> vs running your tactic + running abstract admit (which counts the time it takes to typecheck the type of the goal). This suggests that typechecking the proof term doesn't take much.

Moreover, the bottleneck that forces naive rewrite to be super-linear in the number of binders arises as soon as there's a proof witness that can name its type / what it's a witness of, regardless of whether or not an entire proof term is kept. AFAIK, HOL does in fact keep around abstract proof witnesses that can name their type.

view this post on Zulip Jason Gross (Dec 07 2021 at 21:43):

I thought only guardedness/productivity checking was non-compositional? Typechecking of universes is compositional in the monad of universe constraints.

view this post on Zulip Gaëtan Gilbert (Dec 07 2021 at 22:35):

Jason Gross said:

I thought only guardedness/productivity checking was non-compositional? Typechecking of universes is compositional in the monad of universe constraints.

what does that mean?

view this post on Zulip Paolo Giarrusso (Dec 07 2021 at 23:56):

I guess that's the writer monad over the monoid of constraints under merge?

view this post on Zulip Paolo Giarrusso (Dec 07 2021 at 23:57):

and I see your point: Universes violate separate typechecking enough to prevent full checking vos/vok, but it _could_ still be compositional, if the constraints are just an output — separate compilation is just stricter.


Last updated: Aug 19 2022 at 19:03 UTC