Stream: CUDW 2020

Topic: The scary stuff thread [you've been warned] [we troll]


view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:46):

@Enrico Tassi that's my exactly my point : why on earth do you do the wrong thing by default? backtracking should be opt-in.

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:46):

I agree.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:46):

Oh my.

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:47):

But this mistake has very good consequences. Without it you could not organize your programs in clauses that easily

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:47):

And this is key to => (adding a clauses to your program within a scope).

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:47):

Which in turn is key to handle binders

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:47):

and many other idioms

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:48):

I don't have a particular opinion on λ-prolog, apart that it looks like Frankestein.

view this post on Zulip Cody Roux (Dec 01 2020 at 21:48):

Popping up a level: how should terms be represented in an ideal (ly performant) system? Terms + hashconsing, DAGs with pointers everywhere, functional structures with "zippers", other?

view this post on Zulip Cody Roux (Dec 01 2020 at 21:49):

Should Evars be mutable?

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:49):

was Frankestein elegant?

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:49):

Hashconsing is another beast. Filliâtre claims that everything should hashcons implicitly, but the assumptions of this reasoning is flawed in Coq.

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:49):

Cody Roux said:

Should Evars be mutable?

it's an implementation detail. According to my experience: yes.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:50):

The problem with mutable evars is that I don't see how not to have a super buggy engine.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:50):

Barring a type system à la Rust

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:50):

you trail the updates?

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:50):

you could do that, but it has a cost as well.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:51):

especially in a very backtracking setting you'll lose all the benefits.

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:51):

yes and no. a pretty standard (necessay) optimization is that if you don't have choice points, then you don't trail. if you do so, you are as efficient as ocaml (with mutable state).

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:51):

looks reasonable.

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:51):

but you also gain ocaml's GC on this heap we call evarmap

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:51):

today we have no GC at all

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:51):

Yeah, it's a clear bonus.

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:52):

elpi "POC" (in the git repo) was written like Coq. When we move to imperative evars + trail we got 10x speedup

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:52):

No GC is bad for memory consumption but I don't think it affects speed that much.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:52):

I am not surprised.

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:53):

maybe in Coq it would not be that much, but I would be surprised if it was not visible

view this post on Zulip Cody Roux (Dec 01 2020 at 21:53):

Is there some "canonical paper" for how to implement mutable Evars in this way? Are trails just suspended substitutions?

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:53):

the trail is just to undo assignments

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:54):

you still need to apply evars to all its arguments

view this post on Zulip Cody Roux (Dec 01 2020 at 21:54):

that happens when you get a unification failure?

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:54):

which is a sort of suspended substitution

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:54):

if I get what you mean

view this post on Zulip Cody Roux (Dec 01 2020 at 21:54):

I feel like a 3rd grader in this thread

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:54):

From my experiments, evar access has a clear cost but it's not the major source of speed issues.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:55):

@Cody Roux a trail is the standard trick to implement "immutability"-like behaviour with mutable state.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:55):

like in persistent arrays

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:55):

it's a log of things you have to undo

view this post on Zulip Cody Roux (Dec 01 2020 at 21:56):

Ok. But in what situations do you have to undo Evar instanciations? I feel like I should know this...

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:56):

If you backtrack.

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:56):

tac1 || tac2

view this post on Zulip Cody Roux (Dec 01 2020 at 21:56):

Ok. But this doesn't happen at elaboration time.

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:57):

if tac1 assigns some evars before failing, then tac2 should start from scratch

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:57):

Oh, it probably does.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:57):

Who knows what case elaboration does!

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:57):

eg, during type class instance search

view this post on Zulip Cody Roux (Dec 01 2020 at 21:57):

Right, but that's more of a bug than a feature.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:57):

Type class search is collapsed to its first result in elaboration currently.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:58):

aka "the pretyper is not in the monad"

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:58):

So in elaboration proper we don't really have backtracking. (We do in unification, type class search, tactics in terms, etc.)

view this post on Zulip Cody Roux (Dec 01 2020 at 21:59):

Ok this is good to know.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:59):

We would like to have a bit more controlled backtracking in some cases.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:59):

Like the pretyper assigning universes too eagerly.

view this post on Zulip Enrico Tassi (Dec 01 2020 at 21:59):

TC search can backtrack internally

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:59):

The dreaded "foo has type Type but was expected of type Set" error

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 21:59):

where you never wrote Type in the first place

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:00):

anyway, backtracking on the beginning of the conversation

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:00):

proof engine needs fast evars

view this post on Zulip Cody Roux (Dec 01 2020 at 22:01):

I wish I understood universe polymorphism better tbh. When I was young and stupid I thought it was easy to have explicit universe quantifiers in the kernel, but now I'm not so sure.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:01):

It is easy.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:01):

It is just a catastrophe performance-wise.

view this post on Zulip Cody Roux (Dec 01 2020 at 22:01):

Ok, maybe I was confused by the number of critical bugs involving them.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:01):

Ah, this is just a historical accident due to 8.5

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:02):

"The StillBorn 9.0 from Hell"

view this post on Zulip Cody Roux (Dec 01 2020 at 22:02):

I love how we present Coq to other people vs how we speak about it internally

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:02):

were three (four?) major branches were merged at the same time, resulting in a code mess we're currently finishing up to clean

view this post on Zulip Enrico Tassi (Dec 01 2020 at 22:02):

Note that I got user reports that CoqIDE 8.5 was better than CoqIDE 8.6 :-P

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:03):

Enrico was the leader of the ParalITP faction

view this post on Zulip Enrico Tassi (Dec 01 2020 at 22:03):

STM + univpoly + primproj + logic-monad

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:03):

"Forever thou shalt be punished for the introduction of the STM"

view this post on Zulip Enrico Tassi (Dec 01 2020 at 22:03):

what else?

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:03):

univpoly / primproj are in the same package

view this post on Zulip Cody Roux (Dec 01 2020 at 22:04):

Software Transactional Memory?

view this post on Zulip Enrico Tassi (Dec 01 2020 at 22:04):

I'm about to replace it with a new thing that does the same but is called differently

view this post on Zulip Enrico Tassi (Dec 01 2020 at 22:04):

State Transaction Machine

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:04):

the name has been long forgotten since

view this post on Zulip Enrico Tassi (Dec 01 2020 at 22:04):

it's a mix (hence the mess) of a document manager and a document-checking manager

view this post on Zulip Cody Roux (Dec 01 2020 at 22:04):

first hit on google! Nice.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:05):

it is but a cursed syntagm that often bars your path in the way to enlightment

view this post on Zulip Cody Roux (Dec 01 2020 at 22:05):

I remember confusing conversations with Markarius actually.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:05):

so yes, merging three major branches in one release is not good for the soundness of your kernel

view this post on Zulip Enrico Tassi (Dec 01 2020 at 22:06):

Pierre-Marie Pédrot said:

univpoly / primproj are in the same package

that was part of the mess, IMO ;-)

view this post on Zulip Cody Roux (Dec 01 2020 at 22:06):

ouch

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:06):

true also

view this post on Zulip Enrico Tassi (Dec 01 2020 at 22:06):

the compat layer

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:06):

oh noes

view this post on Zulip Enrico Tassi (Dec 01 2020 at 22:06):

I'm going to bed now

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:06):

you'll have nightmares

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:06):

too late

view this post on Zulip Cody Roux (Dec 01 2020 at 22:06):

Thanks to both, this was very informative!

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:07):

the eldritch horrors of side-effects are going to haunt you beyond non-euclidean monadic code

view this post on Zulip Enrico Tassi (Dec 01 2020 at 22:07):

If you say so ;-)

view this post on Zulip Enrico Tassi (Dec 01 2020 at 22:07):

let (*) x y = x + y (today, PMP)

view this post on Zulip Enrico Tassi (Dec 01 2020 at 22:07):

good night

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:07):

wut. this is perfectly legal.

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:07):

g'night

view this post on Zulip Cody Roux (Dec 01 2020 at 22:07):

No really. The only thing it takes is an amateurish experience of trying to implement your own DTT/ITP

view this post on Zulip Pierre-Marie Pédrot (Dec 01 2020 at 22:09):

proof assistants are very complex objects.

view this post on Zulip Paolo Giarrusso (Dec 01 2020 at 22:35):

No GC is bad for memory consumption but I don't think it affects speed that much.

Memory consumption should affect speed... Except if you've already given up hope for cache locality.

view this post on Zulip Paolo Giarrusso (Dec 01 2020 at 22:38):

which seems plausible anyway, ASTs are a nightmare for locality. IIRC this seemed a major pain point for Scala compilers, even with Java’s great GC — probably the lack of value types in JVMs makes locality there even worse than needed.

view this post on Zulip Cody Roux (Dec 01 2020 at 22:51):

I wonder if Lean's ref counting + reallocate if refcount = 1 and you're creating a cell of the same type helps with this

view this post on Zulip Paolo Giarrusso (Dec 01 2020 at 23:00):

ah, I was assuming imperative ASTs; purely functional ones are even worse, but destructive updates are slightly less bad

view this post on Zulip Paolo Giarrusso (Dec 01 2020 at 23:03):

I haven’t done measurements recently, but basically the cache footprint is O(size of written memory) (rounded up, since each write pulls in a whole cacheline — ~16 bytes), so n purely functional updates are likely to have n times the cache footprint of the imperative version, until you trigger a minor GC and start reusing memory.

view this post on Zulip Paolo Giarrusso (Dec 01 2020 at 23:07):

OTOH, naive refcounting can be a pretty expensive form of GC, especially when done pervasively in a language implementations — it’s even worse as soon as you need atomic refcount operations because you’re running more than one thread. There’s a reason no good virtual machines use refcounting.

view this post on Zulip Paolo Giarrusso (Dec 01 2020 at 23:08):

but that doesn’t necessarily apply when you’re doing mostly manual memory management with just a bit of refcounting as needed.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 01 2020 at 23:35):

I'm 100% with Enrico on this discussion, I think Coq's proof engine has several catastrophic performance decisions, the more I have profiled it the more scary it looks, linear memory consumption for each goal! It is hard to do worse! In that sense researchers have an easy job as almost any other approach they use will be faster.

Pierre-Marie, logic programming is a fast implementation of a slow idea. And no doubt techniques they use are _at least_ an order of magnitude faster than what we have.

[I see that simpl appeared in the scary thread, I can sleep now]

view this post on Zulip Paolo Giarrusso (Dec 02 2020 at 00:51):

Evar implementations is also something that has been studied in isolation...

view this post on Zulip Paolo Giarrusso (Dec 02 2020 at 00:52):

... and some mature LPs add tabling :-P, even if it's much scarier

view this post on Zulip Cody Roux (Dec 02 2020 at 02:52):

I'd love to see a bulleted list of the worst offenders in terms of day to day Coq usage. Obviously that might depend on the development, but what makes each goal linear for example?

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 08:43):

As I mentioned above, the naive evar implementation has a definite cost but it's not a leading one in day-to-day use. I believe that using an efficient representation for evars wouldn't bring more than a ~15% speedup (educated guess). So it's not neglectible, but it's not a tenfold improvement.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 08:44):

The linearity of goal size is due to the representation of evar instances, which are just the list of its arguments.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 08:44):

This is a stupid representation since for most of those arguments, they will just be the corresponding variable of the context.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 08:44):

Unfortunately one cannot use a compact representation without seriously changing the kernel.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 08:45):

Typical suspects for day-to-day slowness are: 1. universes, 2. hashconsing, 3. a hell of a lot of naive term traversals in unification.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 08:46):

I am working on 1. because it's really a pain and it can probably be alleviated with a bit of engineering (e.g. persistent arrays)

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 08:49):

  1. is a real conundrum. We want hashconsing otherwise we get OOM errors, but the typical hashconsing mechanism which is supposed to be amortized O(1) doesn't apply in Coq, because we unmarshall stuff. So, in particular, we do not hashcons on the fly but instead at definite points in the proof (typically at Qed). This is slightly superlinear in the size of the proof, and proofs are big. There was quite a bit of work in that area a few years ago but from all experiments the current tradeoff was the most efficient one in practice.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 08:52):

  1. is linked to an engineering issue / OCaml language limitation. Essentially, we have this one-size-fits-all term AST, but in many cases we would like to have more refined versions of it. A typical usecase would be to store compositional information inside each term node, e.g. whether the considered subterm contains evars. Unfortunately there is no easy way to do that in OCaml (interestingly it's an area where OOP shines), because every type of refined term is different from the standard one so you need either a very heavy machinery or a lot of code duplication. The net result is that we have a lot of quadratic algorithms where we keep recomputing stuff we already have computed for the subterms.

view this post on Zulip Gaëtan Gilbert (Dec 02 2020 at 09:26):

Pierre-Marie Pédrot said:

I am working on 1. because it's really a pain and it can probably be alleviated with a bit of engineering (e.g. persistent arrays)

how do persistent arrays help universes?

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 09:52):

The fact that we have a log n access cost is observable.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 09:52):

amortized O(1) would be better

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 09:52):

unfortunately the AcyclicGraph API is a bit ugly, so we have to first simplify it

view this post on Zulip Janno (Dec 02 2020 at 10:06):

  1. universes

This thread suddenly became very interesting to me! Are there any plans to address the slowness of CClosure with polymorphic constants? I have flamegraphs if people want to see scary stuff. (yay, back on topic!)

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:10):

Oh, that one is a completely different issue.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:11):

Polymorphic universes have so many performance issues that I don't think the secondary ones should be prioritized.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:12):

I mean, I won't oppose your PRs, but when universes computation represent ~ 20% runtime of some heavy monomorphic developments, I am afraid I will not spend a lot of time trying to fill a Danaides barrel

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:12):

My personal take on that is that polymorphic universes have fundamental design issues w.r.t. performance.

view this post on Zulip Janno (Dec 02 2020 at 10:13):

Ooh, sorry, I assumed you were talking about polymorphic universes.

view this post on Zulip Janno (Dec 02 2020 at 10:13):

In any case, polymorphic universes easily represent 50% of my runtime :)

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:13):

Well, polymorphic universes are also affected by slow constraint folding.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:13):

Oh, I know. My PhD studend was brave enough to start writing shallow models of CIC in CwF style.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:14):

It's a small dev, a few hundred lines.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:14):

Very quickly all definitions have hundreds of free universe variables.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:14):

And then suddently, a wild Qed appears, which takes 40mn, most of which is due to universe trickery.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:15):

It's impressive how easy it it to completely blow up the complexity with univpoly.

view this post on Zulip Janno (Dec 02 2020 at 10:16):

The trick to get fast Qed is to use the buggy Time Qed which ignores the time taken by universe minimization!

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:16):

I think that in this case minimization is not culprit, the kernel is really suffering and panting heavily.

view this post on Zulip Janno (Dec 02 2020 at 10:17):

Ah, then I have no solution to offer :D

view this post on Zulip Janno (Dec 02 2020 at 10:17):

But I do want to mention that I observe obscene CClosure slowdown even when polymorphic constants have very few universes.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:18):

Yeah, it' proportional to the term size, not the number of universes

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:18):

(as long as nuniv > 0)

view this post on Zulip Janno (Dec 02 2020 at 10:18):

Right.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:19):

I think that one of Beta's PRs could have been tweaked to be more efficient and partially solve your problem.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:19):

IIRC Beta closed it because it had a nontrivial cost over monomorphic devs.

view this post on Zulip Janno (Dec 02 2020 at 10:19):

I think that was https://github.com/coq/coq/pull/8747/, right?

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:19):

Ah yeah. My bad, it was your PR.

view this post on Zulip Janno (Dec 02 2020 at 10:19):

I recently asked Josh to help me revive it but I didn't have time to get that started.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:20):

Shouldn't be very hard.

view this post on Zulip Janno (Dec 02 2020 at 10:21):

I have to admit I have no idea what exactly is going on there. The initial implementation was easy enough to understand but then Josh took over and it quickly went over my head.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:22):

I happen to be working on substitutions today.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:22):

maybe I should try to do the same patch in my variant of substs.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:23):

(https://github.com/coq/coq/pull/13537, the revival of a PR I've closed, but I am decided to fix the small slowdowns it incurs)

view this post on Zulip Janno (Dec 02 2020 at 10:24):

I am rereading the changes in 8747 now and it doesn't seem that scary, actually.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:24):

Yeah, I agree.

view this post on Zulip Matthieu Sozeau (Dec 02 2020 at 10:31):

Regarding 1. universes I think the problem comes from the fact we use a too simplistic universe algebra too. We cannot keep l+1 in universe instances so we always introduce a fresh l' (usually with l+1 <= l') when needed. Each subsequent use of a definition where we used that will generate a fresh universe and constraint, and this will just pile up to give humongous universe instances. If we allow algebraics everywhere we shall need much less fresh universes, actually 0.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:33):

I partly subscribe to this thesis.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:34):

That is, it is fairly obvious we do generate too many universes.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:35):

But I don't believe that introducing algebraics everywhere will be a silver bullet.

view this post on Zulip Matthieu Sozeau (Dec 02 2020 at 10:35):

Regarding your proposal Janno, I guess it would be relatively easy to make CClosure not substitute universes eagerly by just maintaining a current universe substitution. You never tried that Pierre-Marie?

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:35):

Isn't that essentially what that PR did?

view this post on Zulip Matthieu Sozeau (Dec 02 2020 at 10:36):

I don't think it will solve all issues, and inference will be more complicated probably, but it will help

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:36):

BTW, a common offender for universe generation is setoid_rewrite.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:37):

It refreshes the pattern a gazillion times.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:37):

I suspect there is actually a bug, since we should be able to share the fresh pattern when it doesn't unify.

view this post on Zulip Matthieu Sozeau (Dec 02 2020 at 10:37):

Pierre-Marie Pédrot said:

Isn't that essentially what that PR did?

Yep, it seems very similar

view this post on Zulip Matthieu Sozeau (Dec 02 2020 at 10:39):

Re setoid_rewrite. I though we had an optimization that avoided refreshing unless necessary (e.g when going under binders, and even that could be optimized by "weakening" the evars in the pattern)

view this post on Zulip Enrico Tassi (Dec 02 2020 at 10:39):

Pierre-Marie Pédrot said:

As I mentioned above, the naive evar implementation has a definite cost but it's not a leading one in day-to-day use. I believe that using an efficient representation for evars wouldn't bring more than a ~15% speedup (educated guess). So it's not neglectible, but it's not a tenfold improvement.

To me a "decent" representation of evars includes the fact that the "heap" would be garbage collected.
As we know, today the evarmap is not garbage collected, and at the end of the proof is contains the entire history of the proof building process.
Of course, not having a GC is not a problem if you have enough memory and if you never traverse it. Both assumption are false, since we have users hitting the 64G limit (of normal machines) and the ocaml GC is going to traverse (for nothing) this huge heap from time to time.

As I said elpi was coded like that a long time ago. Doing list reversal was, IIRC, 22000x slower than ocaml. By "manually" adding instruction to remove from the heap useless cells, it went to 1000x slower IIRC. Of course in elpi everything is an evar, in Coq you do other stuff as well. But using less memory is also making your (ocaml) code much faster.

I would try to use weak pointers in the evar map. I don't know if ocaml can handle so many of them, but at least we would get a figure. with "imperative" evars, one would not need weak pointers, but this change is much harder to do and get figures about.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:41):

@Enrico Tassi evar maps do not only contain evars, universes are I believe more often than not a huge memory leak.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:41):

I was profiling memory consumption of Require recently and realized that a good chunk of the allocated memory in perennial came from universe naming.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:42):

Like, we want to give fancy names for universes, and actually 99% of our users don't give a shit, but for them it will still be a few hundreds of MB

view this post on Zulip Enrico Tassi (Dec 02 2020 at 10:43):

For universes, I have a different dream. A -type-upto 3 flag which keeps at most 3 non-floating universes in the graph and collapses all floating one the these 3 after each type inference.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:43):

640k should be enough for anybody?

view this post on Zulip Enrico Tassi (Dec 02 2020 at 10:43):

I boldly claim 90% of computer scientists and 80% of mathematicians would be happy with that.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:44):

The problem with that is that you will probably make eager choices wrongly, even for 99% of users.

view this post on Zulip Enrico Tassi (Dec 02 2020 at 10:44):

Also 3 is a nice number ;-)

view this post on Zulip Enrico Tassi (Dec 02 2020 at 10:44):

I'm not so sure. 99% of the times you mean Type0

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:45):

I'd be interested in such an experiment.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:45):

But we would probably realize some horrible problem midway.

view this post on Zulip Matthieu Sozeau (Dec 02 2020 at 10:46):

Practice should tell, but if we have algebraics everywhere, maybe a lot of people would be content to use Set, Set+1 and Set+2 directly

view this post on Zulip Enrico Tassi (Dec 02 2020 at 10:46):

At some point @Assia Mahboubi did it with math comp (a long time ago) by replacing all Type with MyType from Definition MyType := Type. IIRC she needed 3 MyType.

view this post on Zulip Théo Zimmermann (Dec 02 2020 at 10:46):

Matthieu Sozeau said:

Practice should tell, but if we have algebraics everywhere, maybe a lot of people would be content to use Set, Set+1 and Set+2 directly

It would certainly be useful in the doc.

view this post on Zulip Théo Zimmermann (Dec 02 2020 at 10:47):

Also, if Set is really Type_0, it would be nice to have a way to check for this explicitly in Coq.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:47):

@Matthieu Sozeau what are you waiting for? That's more useful that induction-recursion, which will be used by Agda fanboys to prove the termination of simply-typed λ-calculus for the 4591460th time.

view this post on Zulip Matthieu Sozeau (Dec 02 2020 at 10:48):

Well, I need help with the Bender & al algorithm :)

view this post on Zulip Enrico Tassi (Dec 02 2020 at 10:49):

For the -type-upto 3 thing, I think any algorithm would be efficient enough ;-)

view this post on Zulip Gaëtan Gilbert (Dec 02 2020 at 10:49):

Théo Zimmermann said:

Also, if Set is really Type_0, it would be nice to have a way to check for this explicitly in Coq.

What do you mean?

view this post on Zulip Matthieu Sozeau (Dec 02 2020 at 10:49):

Note that in the case of Set, Set+1 and Set+2, we don't even need to check the graph most of the time

view this post on Zulip Enrico Tassi (Dec 02 2020 at 10:49):

For the algebraic case... well, not so easy I'm afraid. (I mean with succ and max freely mixed)

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:50):

@Enrico Tassi but probably better in the long run

view this post on Zulip Matthieu Sozeau (Dec 02 2020 at 10:51):

@Enrico Tassi I'm wondering how you could decide if you have x < y what you should assign to x and y.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:51):

X := Type0, Y := Type1

view this post on Zulip Enrico Tassi (Dec 02 2020 at 10:52):

yeah!

view this post on Zulip Matthieu Sozeau (Dec 02 2020 at 10:52):

So kind of "minimal" solutions?

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:52):

Actually, I believe that we should have a more principled approach for UGraph where this would be easy to perform.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:53):

Currently we have this hacky type-in-type stuff.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:53):

If we could use various implementation of the graph, it could be easy to switch from them.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:53):

There is no conceptual difference between a graph algorithm that collapses everybody on one point, and one that collapses only le nodes.

view this post on Zulip Matthieu Sozeau (Dec 02 2020 at 10:55):

For me with type-in-type we should not consult the graph at all. But yes we can also interpret < and <= as = to model the same thing

view this post on Zulip Enrico Tassi (Dec 02 2020 at 10:55):

I have the impression that in such a mode one could live with Set=type0, Type=type1, U=type2 (I mean, even be explicit in the concrete syntax).
For _ one can still delay the choice, but it seems hard to believe that one gets many i < j for i and j not related to one of the 3 named universes.

view this post on Zulip Gaëtan Gilbert (Dec 02 2020 at 10:55):

if we have algebraics in instances we can also implement non cumulative mode

view this post on Zulip Matthieu Sozeau (Dec 02 2020 at 10:56):

@Gaëtan Gilbert yes, indeed that's one nice benefit.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 10:56):

Down with Cumulativity! Let's turn Coq into Agda!

view this post on Zulip Enrico Tassi (Dec 02 2020 at 10:56):

Haskell!

view this post on Zulip Matthieu Sozeau (Dec 02 2020 at 10:57):

@Enrico Tassi indeed in many cases they will be related

view this post on Zulip Enrico Tassi (Dec 02 2020 at 10:58):

That is my intuition, but I understand that "it works for 3" is not very satisfactory, intellectually speaking...

view this post on Zulip Assia Mahboubi (Dec 02 2020 at 11:00):

Enrico Tassi said:

At some point Assia Mahboubi did it with math comp (a long time ago) by replacing all Type with MyType from Definition MyType := Type. IIRC she needed 3 MyType.

Indeed. Disclaimer: it was a long time ago and mathcomp was way skinnier at the time.

view this post on Zulip Matthieu Sozeau (Dec 02 2020 at 11:01):

That's still some good evidence!

view this post on Zulip Assia Mahboubi (Dec 02 2020 at 11:02):

Enrico Tassi said:

That is my intuition, but I understand that "it works for 3" is not very satisfactory, intellectually speaking...

It is a common situation in computer algebra systems to implement a portfolio of algorithms for a same task (e.g. for multiplying matrices or polynomials), each of which being better for a different class of inputs, due to the size of the constants hidden in big O....

view this post on Zulip Gaëtan Gilbert (Dec 02 2020 at 11:04):

we have Print Sorted Universes to tell use how many are needed btw, no need for hand replacing Type with MyType

view this post on Zulip Assia Mahboubi (Dec 02 2020 at 11:06):

This idea was to measure the gain in performance. It was much faster with explicit definitions.

view this post on Zulip Gaëtan Gilbert (Dec 02 2020 at 11:06):

see also https://github.com/coq/coq/commit/1c25f8bc8d926e85667d321555775200ddfd17d2

view this post on Zulip Enrico Tassi (Dec 02 2020 at 11:10):

Are you saying that we test that the stdlib works with 3 universes?

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 11:12):

@Enrico Tassi yes.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 11:12):

I think it's 5 currently, but that's the idea.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 11:12):

Actually 4.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 11:13):

see test-suite/misc/universes.sh

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 11:13):

@Gaëtan Gilbert speaking of which, I am currently extruding universe sorting / printing from AcyclicGraph / UGraph, are you fine with that?

view this post on Zulip Gaëtan Gilbert (Dec 02 2020 at 11:14):

depends how the code looks
but why?

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 11:14):

This stuff has nothing to do in the kernel (and it's the sole user of safe_repr :/)

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 11:14):

I want to replace the implementation of Point, and it's a mess.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 11:14):

sort is a big hack, to start with.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 11:14):

Instead, I will just dump the graph and print that in vernac.

view this post on Zulip Enrico Tassi (Dec 02 2020 at 11:15):

if Hurkens needs 4 to prove false, then limiting to 3 universes is also about soundness ;-)

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 11:15):

@Enrico Tassi You, sir, are a genius.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 16:46):

I'm afraid @Pierre-Marie Pédrot I cannot share your analysis and much less the 15% number you come with, I think Enrico gave numbers more close to the actual evidence people have. Current engine just is not in the same class of perf than other proof search systems.

By linear, I didn't mean that memory consumption per-goal is linear, which is, but it is also linear on the number of evars ever created. That's just scary. We have discussed many times about the proof engine, and I'm still to be convinced about the defenses of many things, including the high-level design where refine is not a part of the primitives.

view this post on Zulip Enrico Tassi (Dec 02 2020 at 16:51):

@Pierre-Marie Pédrot can we try to replace the IntMap in the evd with https://caml.inria.fr/pub/docs/manual-ocaml/libref/Ephemeron.GenHashTable.html (plus a call to nf_evar on the initial evar, say, every 1K evar allocations, or every end of run of the tactic monad)?

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 16:52):

You'll probably get evars reclaimed to eagerly, but, yes, please try.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 16:53):

Also, weak cells / ephemerons are far from being free.

view this post on Zulip Enrico Tassi (Dec 02 2020 at 16:54):

I'm too busy with the release, but maybe someone at the CUWD would like to play with this

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 16:58):

I try something like that but I never managed it to work, even after adding a copy [keep in mind that evar_maps are purely functional whereas hash may not]

But indeed we have some other problems like the imperative evar_counter, that's a mess once we try ocaml-multicore

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 16:58):

Not even equality of evars in two evar maps is properly defined if we want to assume some separation rules.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 16:59):

Imagine how fast will be Coq + multicore be when each evar creation does introduce a global lock :S

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 16:59):

But let me repeat something: universes are a worse hotspot than that.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 16:59):

Not so sure in view of @Jason Gross results

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 16:59):

how can you measure that

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 16:59):

This has nothing to do with the imperative representation of evars.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 16:59):

the kind of slowdown we are talking about is hard to measure, because it is intrinsic

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 17:00):

It's linked to the representation of instance, which is not related.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 17:00):

we are not just disucssing "imperative representation of evars"

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 17:00):

that alone does little

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 17:00):

I have no idea what you want to implement then, because I don't see how you're gonna gain anything with it.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 17:01):

we are talking about the whole design of the proof search engine, you can have an imperative representation and perform worse than the current evar_map , some advanced tricks are needed to make it fast

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 17:01):

Evar creation is already linear in the size of the instance.

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 17:01):

Anyways, I prefer rebasing my branch than vacuously trolling, so have fun.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 17:01):

for example, I'm convinced that evar == goal is not the right notion; I'm just repeating the discussion

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 17:01):

But don't expect even a 2x speedup, however fast your evar will be.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 17:02):

we had some time ago

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 17:02):

instantiating a unification variable is a totally different concept than solving a goal

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 17:02):

and indeed the engine does not provide operations for neither

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 17:02):

engine I mean the proofview

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 17:03):

I'm not trolling by the way as I'm dead serious

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 17:03):

I am waiting for your PR.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 17:03):

I wish I had the time to back my claims with code

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 17:03):

I don't say this is a small project

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 17:04):

I am already busy trying to have coqc not to recheck the same stuff over and over again when a user changes a line :)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 02 2020 at 17:04):

which is like 2 order of magnitude less ambitious but very tricky still

view this post on Zulip Pierre-Marie Pédrot (Dec 02 2020 at 17:05):

I would kindly recommend getting rid of the STM if you want to stop having a gaping memory leak in coqc, btw.

view this post on Zulip Matthieu Sozeau (Dec 02 2020 at 17:21):

Re. the representation of evars: even if we made evar "closed" i.e. of closed type (forall ctx, ty), I don't see how that would help in any way with the issue of evar instance representation, rather the other way around: it would make it harder to have compact representations for instances unless we go to fully represent explicit substitutions in terms. However that experiment (using explicit substitutions pervasively) was tried and failed in both Coq and Matita IIRC.

view this post on Zulip Jason Gross (Dec 02 2020 at 18:14):

I was under the impression that the bulk of the performance issue in the rewrite variants was allocating evars every time there's a head constant match or something like that, and so there's this pervasive issue where rewriting gets slower the more lemmas being used to rewrite. Is this not the case? (Btw, if it's helpful, I can open an issue with the "runs out of RAM and is way too slow" example from my defense)

view this post on Zulip Clément Pit-Claudel (Dec 04 2020 at 16:22):

In past projects I used something like this:

type 'a ast =
  | X of 'a node
  | …
and 'a node := { annot: 'a; data: 'a ast }

Then most of the code operates on 'a ASTs and there's a function reset_annots: 'a AST -> 'b -> 'b AST that you use before doing a transformation. This works well for turning quadratic operations on the AST into linear, but you don't get incremental updates of the metadata when you make changes to the AST (you have to reset all metadata before each pass). @Pierre-Marie Pédrot have you considered something along these lines?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 04 2020 at 16:25):

Note that this is already implemented in Coq, see lib/cAst.mli

view this post on Zulip Clément Pit-Claudel (Dec 04 2020 at 23:20):

Ah, super. Then I didn't understand PMP's point about quadratic blow-up

view this post on Zulip Pierre-Marie Pédrot (Dec 05 2020 at 11:51):

@Clément Pit-Claudel that's not what the kernel uses. And even if we wanted to use this structure, it'd be an engineering nightmare. It'd require a serious rewriting of Coq, we'd pay non-trivial overheads all over the place, and it's quite dubious from a trust point of view that you can maintain meaningful invariants there.

view this post on Zulip Pierre-Marie Pédrot (Dec 05 2020 at 11:51):

CAst is only used for syntactic data the user inputs, FTR.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 06 2020 at 12:12):

Oh indeed I was talking about the AST part, kernel terms are a different story.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 06 2020 at 12:12):

But I think in most cases you are dealing with ASTs


Last updated: Sep 15 2024 at 12:01 UTC