Stream: Coq devs & plugin devs

Topic: Coqdep slow


view this post on Zulip Maxime Dénès (Mar 03 2022 at 08:05):

Coqdep feels slow these days, doesn't it? Could we make Dune call it only once for the whole stdlib? It would then be easier to profile and optimize it.

view this post on Zulip Karl Palmskog (Mar 03 2022 at 08:06):

@Maxime Dénès have you seen the discussion of changing Dune to call coqdep once-and-for-all?

view this post on Zulip Maxime Dénès (Mar 03 2022 at 08:06):

No, thanks for the pointer!

view this post on Zulip Karl Palmskog (Mar 03 2022 at 08:07):

OK, this is then yet another independent confirmation that the Dune+coqdep issue will be rediscovered until fixed...

view this post on Zulip Guillaume Melquiond (Mar 03 2022 at 08:08):

I don't feel like Coqdep got any slower. Do you have some specific test case?

view this post on Zulip Maxime Dénès (Mar 03 2022 at 08:09):

I run dune build on Coq's master ;)

view this post on Zulip Guillaume Melquiond (Mar 03 2022 at 08:09):

Oh! So, that is not Coqdep, that is just plain Dune. If you perf it, you will notice that all the time is wasted in computing the checksums of all the files of the repository.

view this post on Zulip Maxime Dénès (Mar 03 2022 at 08:10):

It is doing that in between calls to coqdep?

view this post on Zulip Karl Palmskog (Mar 03 2022 at 08:11):

would be interesting if someone did a comparison when using https://github.com/ejgallego/dune/tree/coq+coqdep_per_theory --- but the fork may be too old

view this post on Zulip Maxime Dénès (Mar 03 2022 at 08:12):

I slightly incorrectly expressed my "feeling" indeed: what feels slow is the batch of calls to coqdep, maybe not each call. So if checksum computation is happening between calls, it could very well account for it.

view this post on Zulip Maxime Dénès (Mar 03 2022 at 08:14):

@Guillaume Melquiond is this what you were saying it does?

view this post on Zulip Pierre-Marie Pédrot (Mar 03 2022 at 08:19):

Dune build does launch a lot of coqdep processes for sure...

view this post on Zulip Guillaume Melquiond (Mar 03 2022 at 08:20):

Here is what I am saying:

$ dune build && echo crap >> theories/Reals/Reals.v && dune build && perf record dune build
$ perf report
  43,79%  dune        dune                [.] caml_MD5Transform
  18,56%  dune        dune                [.] mark_slice
   6,12%  dune        dune                [.] caml_page_table_lookup

view this post on Zulip Pierre-Marie Pédrot (Mar 03 2022 at 08:20):

but that doesn't account for the time spent in the coqdep processes though?

view this post on Zulip Guillaume Melquiond (Mar 03 2022 at 08:21):

It does. I didn't pass --no-inherit.

view this post on Zulip Guillaume Melquiond (Mar 03 2022 at 08:21):

Here is the first coqdep related line:

  0,37%  coqdep      [unknown]           [k] 0xffffffffa5a0405c

view this post on Zulip Maxime Dénès (Mar 03 2022 at 08:22):

@Pierre-Marie Pédrot The time spent in coqdep is an illusion IIUC. It is Dune being very slow moving from one coqdep call to the next.

view this post on Zulip Pierre-Marie Pédrot (Mar 03 2022 at 08:23):

but coqdep is IO-bound, so perf won't see that will it?

view this post on Zulip Guillaume Melquiond (Mar 03 2022 at 08:25):

Indeed, but Dune reads the exact same files as Coqdep anyway. (And everything in the hot cache, so the time spent inside the kernel does not exceed 15%.)

view this post on Zulip Guillaume Melquiond (Mar 03 2022 at 08:26):

s/inside the kernel/waiting for the kernel/

view this post on Zulip Maxime Dénès (Mar 03 2022 at 08:28):

But then why do calls to ocamlc/ocamlopt look much faster?

view this post on Zulip Pierre-Marie Pédrot (Mar 03 2022 at 08:30):

Yeah, there is something very fishy there. I think I remember that @Emilio Jesús Gallego Arias mentioned that coqdep fetched the presence of all dependencies on the disk and that was the real issue.

view this post on Zulip Maxime Dénès (Mar 03 2022 at 08:30):

But there was a change of behavior somewhere, no?

view this post on Zulip Maxime Dénès (Mar 03 2022 at 08:31):

Could be just me not remembering how slow things were, but I doubt it.

view this post on Zulip Pierre-Marie Pédrot (Mar 03 2022 at 08:31):

dune build was always slow for me, this is not new

view this post on Zulip Pierre-Marie Pédrot (Mar 03 2022 at 08:32):

and the coqdep problem was already being discussed when the switch was performed AFAIR

view this post on Zulip Maxime Dénès (Mar 03 2022 at 08:32):

But make was also calling coqdep, does Dune do something special?

view this post on Zulip Maxime Dénès (Mar 03 2022 at 08:32):

The discussion Karl pointed at starts with "coqdep 8.15 is slower so"

view this post on Zulip Pierre-Marie Pédrot (Mar 03 2022 at 08:33):

Makefile relies on the timestamps, whereas dune does something content-oriented

view this post on Zulip Pierre-Marie Pédrot (Mar 03 2022 at 08:34):

the makefile build is still fast for me, i.e. I don't have to wait ~5s before the v files start compiling

view this post on Zulip Enrico Tassi (Mar 03 2022 at 08:58):

It is the usual confusion, if you just measure the time it takes to get a coqtop-prelude running, I'd say dune is OK. If you actually need to build the .vo, then dune is not great. If you change branch every ten minutes then dune thanks to the cache is blazing fast, if you cd ../another-work-tree then it is slower ;-)

view this post on Zulip Maxime Dénès (Mar 03 2022 at 09:22):

My use case was I was working on stuff in clib and wanted to build stdlib.

view this post on Zulip Maxime Dénès (Mar 03 2022 at 09:22):

No branching / switching worktrees involved.

view this post on Zulip Guillaume Melquiond (Mar 03 2022 at 09:23):

Pierre-Marie Pédrot said:

Makefile relies on the timestamps, whereas dune does something content-oriented

Just because you are content-oriented does not mean that you have to be slow. For example, Git is content-oriented and it remembers the checksum of all the files. But it only recomputes the checksum of a given file if its modification time was changed since the last time it computed its checksum.

view this post on Zulip Paolo Giarrusso (Mar 03 2022 at 09:33):

But make was also calling coqdep, does Dune do something special?

@Maxime Dénès make calls coqdep once, dune calls it for each file and has it depend on basically the entire source tree. And @Guillaume Melquiond 's analysis doesn't explain why this process is slower with coq 8.15 than 8.13 in clients...

view this post on Zulip Guillaume Melquiond (Mar 03 2022 at 09:36):

Do you have actual measurements or is this just your feeling?

view this post on Zulip Paolo Giarrusso (Mar 03 2022 at 09:37):

it’s the same kind of feeling as Maxime: the difference is obvious, the cause might be misattributed.

view this post on Zulip Paolo Giarrusso (Mar 03 2022 at 09:38):

and I haven’t run perf yet, because it doesn’t exist on MacOS.

view this post on Zulip Paolo Giarrusso (Mar 03 2022 at 09:38):

but I’ll take your point that more analysis is needed!

view this post on Zulip Paolo Giarrusso (Mar 03 2022 at 09:38):

OTOH: (1) I also upgraded to dune 3.0.2 recently, but that shouldn’t be relevant unless there’s a serious performance bug (2) I’m on MacOS not Linux, so relative performance might differ.

view this post on Zulip Paolo Giarrusso (Mar 03 2022 at 09:53):

Some still unscientific benchmarks: So, with dune 2.9.3 and Coq 8.15 doing dune build theories/ in https://github.com/bedrocksystems/brick (with 138 source files), the phase where dune calls coqdep took ~18s, with dune taking ~5% CPUs and coqdep processes often showing up with 100% CPU usage. This was after echo '(* *)' >> theories/prelude/finite.v. But these results won't be yet exactly reproducible.

view this post on Zulip Paolo Giarrusso (Mar 03 2022 at 09:55):

Also that's 18s on an M1 Max with all 10 cores loaded, so at least 18*8 = 144s CPU time (ignoring the 2 efficiency cores). Next, I'll repeat this on Coq 8.13 to test if there's indeed a difference. [And probably move to Linux so I can _also_ collect perf data]

view this post on Zulip Guillaume Melquiond (Mar 03 2022 at 10:06):

Hum... I was trying to measure the slowdown on my computer. But I am not even able to compile Coq 8.13.2 and 8.14.1 with Dune, so as to compare. The builds fail with the following kind of errors (on fresh checkouts):

$ dune build
regen_readme doc/sphinx/README.gen.rst (exit 1)
(cd _build/default/doc/sphinx && ../tools/coqrst/regen_readme.py README.gen.rst)
Traceback (most recent call last):
  File "../tools/coqrst/regen_readme.py", line 24, in <module>
    import sphinx
ModuleNotFoundError: No module named 'sphinx'
File "topbin/coqtop_byte_bin.ml", line 22, characters 29-52:
22 |     begin match Hashtbl.find Toploop.directive_table "rectypes" with
                                  ^^^^^^^^^^^^^^^^^^^^^^^
Error (alert deprecated): Toploop.directive_table
File "topbin/coqtop_byte_bin.ml", line 31, characters 42-59:
31 |            { load_obj = (fun f -> if not (Topdirs.load_file ppf f)
                                               ^^^^^^^^^^^^^^^^^
Error (alert deprecated): Topdirs.load_file

view this post on Zulip Maxime Dénès (Mar 03 2022 at 10:11):

maybe with --profile=release ?

view this post on Zulip Paolo Giarrusso (Mar 03 2022 at 10:15):

FWIW https://dune.readthedocs.io/en/stable/advanced-topics.html?highlight=timing#profiling-dune should get dune's own profiling data (I'll add it to my experiments).

view this post on Zulip Guillaume Melquiond (Mar 03 2022 at 10:18):

Maxime Dénès said:

maybe with --profile=release ?

That helps. But now, for some reason, Dune has decided to run the whole testsuite.

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:20):

A few things here:

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:20):

As for the default target for dune build indeed, it includes all targets, the test-suite, the docs, etc... We have discussed restricting that, @Théo Zimmermann may know more about the status of it

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:21):

@Guillaume Melquiond see Makefile.dune for better targets, you are trying to build the docs and it fails

view this post on Zulip Guillaume Melquiond (Mar 03 2022 at 10:21):

I am pretty sure that, on master, dune build is not running the testsuite.

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:22):

Yes, indeed Theo added this finally:

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:22):

(alias
 (name default)
 (deps coq-core.install coq-stdlib.install coqide-server.install coqide.install))

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:22):

but on older versions, you need to use these targets to compare

view this post on Zulip Paolo Giarrusso (Mar 03 2022 at 10:22):

@Emilio Jesús Gallego Arias the other discussion was linked, but that perf data asks a good question

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:22):

@Paolo Giarrusso do you mind repeating the question?

view this post on Zulip Guillaume Melquiond (Mar 03 2022 at 10:24):

Emilio Jesús Gallego Arias said:

dune build @check should be the default

dune build @check does not seem to build the theories. So, how do you use it to measure the time taken by Coqdep?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:25):

@Guillaume Melquiond I was referring to Maxime's use case for working on clib

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:26):

if you wanna work on a particular .v file, then dune build foo.v && dune exec -- dev/shim/coqtop-prelude is the best we have now, until dune coqtop by @Rodolphe Lepigre lands in dune

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:26):

which should happen soon thanks to him!

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:27):

for coqdep there are 3 scenarios:

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:27):

first and second require all the .v files to be in place, so it can be more expensive in some situations, for example if the generated .v files are expensive

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:28):

2 is always better than 1 in most cases, but as I mentioned some time ago, i wanted to experiment with 1, but it was not a good idea

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:28):

3 is the best, but we don't fully have a good spec for coqdep -modules

view this post on Zulip Paolo Giarrusso (Mar 03 2022 at 10:28):

@Emilio Jesús Gallego Arias to repeat the question, Guillaume's perf data suggests that (in his scenario) most time is spent in dune checksumming not on coqdep calls.

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:29):

Paolo Giarrusso said:

Emilio Jesús Gallego Arias to repeat the question, Guillaume's perf data suggests that (in his scenario) most time is spent in dune checksumming not on coqdep calls.

I'm surprised to see that, my trace says something very different. hashing is only done when the heuristics to avoid it fail; there is a debug mode actually to see why a hash was triggerred

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:29):

dune checks fstat etc... to avoid re-hashing, so while we have seen bugs in the past, a zero build should do no hashing

view this post on Zulip Paolo Giarrusso (Mar 03 2022 at 10:30):

and that's in Coq builds, and my quick tests on a library blame coqdep again but they're much rougher

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:30):

let me show the code that implements this check

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:30):

yes, coqdep is the main bottleneck we have, even so with this dune is much faster than make in my experiments

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:30):

it doesn't worry me a lot as this is an easy change to make, we just need to find some manpower

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:31):

I'm still blocked until the 17th

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:31):

:(

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:31):

but I'll be happy to help , I think I already posted a blueprint on how to do it in the dune stream

view this post on Zulip Paolo Giarrusso (Mar 03 2022 at 10:33):

your advising is useful, and as I keep saying, it's no problem even if you keep having 0 coding time on this for the foreseeable future, and the community needs to step up :-). Your mentorship will still be necessary.

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:34):

here's the code that decides when to re-hash

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:34):

https://github.com/ocaml/dune/blob/72b6f8a5192ade12d47beafbbcae1b32771ef5ad/src/dune_engine/cached_digest.ml#L271

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:34):

@Paolo Giarrusso I suggest we continue the discussion about implementing the one coqdep per theory on the dune stream

view this post on Zulip Guillaume Melquiond (Mar 03 2022 at 10:45):

So, here are the results on my computer with 8.13.2, 8.14.1, and master. The setting is the same as before: Coq compiles successfully, then Reals.v is filled with crap, how long does it take to reach the error.
8.13.2:

$ \time dune build --profile=release coq.install coqide-server.install coqide.install
File "./theories/Reals/Reals.v", line 35, characters 0-4: Error: Syntax error: illegal begin of vernac.
Command exited with non-zero status 1
3.76user 0.61system 0:03.95elapsed 110%CPU (0avgtext+0avgdata 363908maxresident)k
0inputs+4800outputs (0major+190906minor)pagefaults 0swaps

8.14.1:

$ \time dune build --profile=release coq-core.install coq-stdlib.install coqide-server.install coqide.install
File "./theories/Reals/Reals.v", line 35, characters 0-4: Error: Syntax error: illegal begin of vernac.
Command exited with non-zero status 1
3.90user 0.52system 0:04.01elapsed 110%CPU (0avgtext+0avgdata 357320maxresident)k
0inputs+6472outputs (0major+191039minor)pagefaults 0swaps

master:

$ \time dune build --profile=release coq-core.install coq-stdlib.install coqide-server.install coqide.install
File "./theories/Reals/Reals.v", line 35, characters 0-4: Error: Syntax error: illegal begin of vernac.
Command exited with non-zero status 1
3.92user 0.53system 0:04.03elapsed 110%CPU (0avgtext+0avgdata 353312maxresident)k
0inputs+6272outputs (0major+190656minor)pagefaults 0swaps

So, in all the three cases, elapsed time is almost exactly 4 seconds. There is no difference between the versions of Coq.

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:47):

Yes @Guillaume Melquiond , why there should be? I am sorry, I missed the original hypothesis you folks were testing

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:47):

what's the dune version by the way

view this post on Zulip Guillaume Melquiond (Mar 03 2022 at 10:48):

@Paolo Giarrusso said that it got much slower between 8.13 and master.
And this is Dune 2.9.1 for me.

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:48):

there should be significant differences between 2.9.x and 3.0 for zero builds

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:48):

Guillaume Melquiond said:

Paolo Giarrusso said that it got much slower between 8.13 and master.
And this is Dune 2.9.1 for me.

I see, thanks; likely due to our tweaks to targets, etc... still the workflow is in flux as we all know and suffer

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:49):

last time I checked, doing one coqdep call instead of what we do today will save around 12 secs on a zero build for the stdlib

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:49):

so that's pretty significant, in incremental builds not a lot, and in ml only build nothing

view this post on Zulip Emilio Jesús Gallego Arias (Mar 03 2022 at 10:50):

12 secs on my old machine, which I think it is pretty representative (4 cores / 8 threads power efficient mobile cpu)

view this post on Zulip Paolo Giarrusso (Mar 03 2022 at 11:01):

@Guillaume Melquiond Thanks for your data, I'll keep it in mind and look more closely; but my claim was always on user libraries, among all the other differences in environments.

view this post on Zulip Enrico Tassi (Mar 03 2022 at 12:06):

Were you composing many projects? (as in dune's slang)

view this post on Zulip Paolo Giarrusso (Mar 03 2022 at 14:53):

@Enrico Tassi my project composes 4 theories, but the 18s test involves 2 theories. (EDIT: but this isn't real "dune composition" since there's a single dune-project)

view this post on Zulip Emilio Jesús Gallego Arias (Apr 05 2022 at 19:20):

For those annoyed by repeated coqdep runs for all .v files in a theory, the issue should be fixed soon in master, turns out I had missed the right API

view this post on Zulip Emilio Jesús Gallego Arias (Apr 05 2022 at 19:21):

https://github.com/ocaml/dune/pull/5547

view this post on Zulip Rudi Grinberg (Apr 07 2022 at 05:28):

Guillaume Melquiond said:

Pierre-Marie Pédrot said:

Makefile relies on the timestamps, whereas dune does something content-oriented

Just because you are content-oriented does not mean that you have to be slow. For example, Git is content-oriented and it remembers the checksum of all the files. But it only recomputes the checksum of a given file if its modification time was changed since the last time it computed its checksum.

You might be pleased to know that dune already does that. In fact, it's the only thing dune ever did. If I had to guess, the checksumming in your trace is from comparing the digests computed from the stat calls. Doing a 0 build with all targets essentially requires stat'ing every single file in your tree, so it isn't all that cheap.

view this post on Zulip Rudi Grinberg (Apr 07 2022 at 05:37):

Emilio Jesús Gallego Arias said:

https://github.com/ocaml/dune/blob/72b6f8a5192ade12d47beafbbcae1b32771ef5ad/src/dune_engine/cached_digest.ml#L271

Note that these are the digests of the information derived from stat. To see where the digest of the contents is computed, follow the callers of Cached_digest.build_file


Last updated: Feb 05 2023 at 21:03 UTC