While trying to fix the CI for the category_theory development, I've been profiling it out of curiosity, to see why was it take so long. To my surprise it's a complete catastrophe that hits right in the problem that https://github.com/coq/coq/pull/8747 is trying to fix.
Namely, we keep substituting eagerly universe instances in the conversion algorithm, leading to terrible algorithmic performances.
I wonder whether we ever tried to bench this particular dev, because it seems to be the only one in the pool of stuff I know about that really stresses that part.
UniMath is very conversion-intensive but it sidesteps universe polymorphism by being inconsistent, and HoTT for some reason is not so much conversion-heavy. So there is no real contended apart maybe from Mtac2.
This raises two questions. 1. should we resurrect the delayed substitution patch in one way or another, and 2. should we add category-theory to the bench?
Right now it seems to be really slow but once the algorithmic issue is solve I expect wondrous speedups. It's literally passing its time in universe substitutions on all files I've been profiling so far.
cc @Janno
I think adding category-theory to the bench is a good idea
Did you try rebasing #8747 and seeing how category-theory builds with that?
Pierre-Marie Pédrot said:
UniMath is very conversion-intensive but it sidesteps universe polymorphism by being inconsistent, and HoTT for some reason is not so much conversion-heavy. So there is no real contended apart maybe from Mtac2.
My guess is that HoTT is not conversion-heavy because pretty much the first thing I learnt in performance-engineering Coq developments is to avoid conversion as much as possible, and so all the performance engineering I've done with HoTT, and all the performance engineering I've helped others do with HoTT, has first looked to see if there's any way we can avoid needlessly invoking conversion
Ali Caglayan said:
I think adding category-theory to the bench is a good idea
Seconded. Is there any reason not to add it to the bench?
I guess Mtac2 really occupies a particularly weird niche with its use of kernel reduction in its interpreter.
I am glad that there is at least one other development that really stresses that code path with universe polymorphic terms
It should definitely be added to the benchmarks, I think :)
Also, I am more than happy to help in any way I can with resurrecting the pull request but I still haven't fully understood what went wrong in the PR so I am probably going to be of limited use there.
I just ran perf
on some (closed-source) Mtac2 code I have here and it reports at least 30% time spent in kernel reduction. I also just remembered that we have some very contrived benchmarks in Mtac2 itself and last time I checked those clocked in at 50% kernel reduction.
I might actually get excited for Mtac2 performance again if we can manage to fix this problem :D
I opened https://github.com/coq/coq/pull/15543 to get the category-theory development into the bench job
Last updated: Apr 20 2024 at 05:01 UTC