Stream: Coq devs & plugin devs

Topic: Deactivating `par:` process spawning


view this post on Zulip Pierre-Marie Pédrot (Nov 17 2022 at 13:47):

Trying to fix the fiat-crypto CI, I realized in https://github.com/coq/coq/pull/16801 that neither -async-proofs off nor -async-proofs-j 1 deactivate the coqworker process spawning when evaluating par: blocks.

view this post on Zulip Pierre-Marie Pédrot (Nov 17 2022 at 13:48):

This is bad because we use too much memory because of the two processes.

view this post on Zulip Pierre-Marie Pédrot (Nov 17 2022 at 13:48):

Reading the code, I've only found a very hackish way to deactivate par: altogether and make it behave like a no-op.

view this post on Zulip Pierre-Marie Pédrot (Nov 17 2022 at 13:48):

Namely, you have to pass the option -worker-id $SOMEDUMMYNAME.

view this post on Zulip Pierre-Marie Pédrot (Nov 17 2022 at 13:49):

Is there a more principled way to do this? I've found nothing in the doc, and only the above trick reading the code.

view this post on Zulip Enrico Tassi (Nov 17 2022 at 14:36):

No. I think the right place is today vernac/comTactic which contains a non-parallel version of par and has an API to override it. IMO there you could check for a Goption. (and hence disable via a flag to coqc).

view this post on Zulip Enrico Tassi (Nov 17 2022 at 14:37):

(there is also -async-proofs-tac-j, but passing 1 would still spawn one extra worker). If you need a quicker patch, pass that and in the STM don't call Partac.enable_blabla if nworkers is 1 (even if it was intentional to allow 1 worker but still use the machinery, to test it)

view this post on Zulip Gaëtan Gilbert (Nov 17 2022 at 14:46):

which contains a non-parallel version of par

that nobody has ever tested AFAIK

view this post on Zulip Enrico Tassi (Nov 17 2022 at 14:49):

Maybe it is the right time? I think I tested it on the test suite (not on the whole CI).

view this post on Zulip Enrico Tassi (Nov 17 2022 at 14:49):

In any case, I don't see any other way of reducing the resources used by par if not by using that code.

view this post on Zulip Pierre-Marie Pédrot (Nov 17 2022 at 14:58):

The -worker-id trick makes Coq take this path, and on the examples I've tried it seems to work

view this post on Zulip Enrico Tassi (Nov 17 2022 at 15:09):

Can you run a full CI with your hack implementation?

view this post on Zulip Pierre-Marie Pédrot (Nov 17 2022 at 15:12):

I have no implementation, I just added the -worker-id flag and it worked, weirdly enough

view this post on Zulip Pierre-Marie Pédrot (Nov 17 2022 at 15:13):

(without spawning a coqworkmgr process, that is)

view this post on Zulip Pierre-Marie Pédrot (Nov 17 2022 at 15:13):

do you want me to try hacking the fiat-crypto CI by passing this flag?

view this post on Zulip Enrico Tassi (Nov 17 2022 at 15:14):

that is the thing, then refine the PR with an actual flag for making par non-parallel

view this post on Zulip Enrico Tassi (Nov 17 2022 at 15:14):

if it does not fix the issue, then I think the priority is zero (for this extra flag)

view this post on Zulip Pierre-Marie Pédrot (Nov 17 2022 at 15:16):

dunno, that means that if you use par: you should be ready to use twice as much as memory as needed for the equivalent non-par code

view this post on Zulip Enrico Tassi (Nov 17 2022 at 15:24):

This has always been the case. I don't get what you want to say

view this post on Zulip Enrico Tassi (Nov 17 2022 at 15:24):

(actually more than double, it depends on the -blabla-j you pass, people use ~ 10 there as far as I know)

view this post on Zulip Enrico Tassi (Nov 17 2022 at 15:26):

I think @Michael Soegtrop uses it, other reached me privately, not sure they told me how many workers they used, but for sure not 2 given their speedup

view this post on Zulip Enrico Tassi (Nov 17 2022 at 15:29):

Our CI is very limited, my smartphone is 4 years old and has more ram than our workers... my new laptop comes with 64GB (and the price in the public market is about 1K).

view this post on Zulip Théo Zimmermann (Nov 17 2022 at 17:24):

par: had 17 users in the survey respondents: https://coq.discourse.group/t/coq-community-survey-2022-results-part-iii/1777

view this post on Zulip Michael Soegtrop (Nov 17 2022 at 18:21):

I like to use par in interactive mode for proofs with many cases (I have some which have around 100 to a few 100). I default to 6 when I start an IDE. For me this is quite essential. A side note: I switched from CoqIDE to VSCoq only after par: did work there.

If I use par in batch builds depends on the machine I am on.

view this post on Zulip Enrico Tassi (Nov 17 2022 at 19:15):

I guess you sed it to all:

view this post on Zulip Pierre-Marie Pédrot (Nov 17 2022 at 19:18):

I'm not advocating for changing the default behaviour of par:, I'm just saying that for memory reasons (and probably some reproducibility) we should be able to run par: in a single process with a specific flag.

view this post on Zulip Paolo Giarrusso (Nov 17 2022 at 21:10):

Could -async-proofs-j get to know 0 and accept -async-proofs-j 0 for this?

view this post on Zulip Paolo Giarrusso (Nov 17 2022 at 21:11):

Then Coq devs (and user) might be able to guess the right option more easily

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 11:19):

I've implemented this solution in https://github.com/coq/coq/pull/16837, but it turns out that all these async options are just pure Potemkin configuration. Except for a tiny subset of them, coqc and coqtop plainly ignores them.

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 11:20):

They are parsed but never set.

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 11:20):

Reading the initialization code of the various toplevels is giving me a primal fear.

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 11:21):

We're gladly mixing global variable mutation with state threading, with various places of the code messing with some hardcoded flags. I have literally no idea how and where to fix it.

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 11:21):

Calling @Enrico Tassi for help.

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 11:23):

Stm.init_core seems like the most reasonable solution but I'm still wary I might break other parts. (Notably new_doc just a few lines below overwrites the config imperatively with whatever it's passed.)

view this post on Zulip Enrico Tassi (Nov 24 2022 at 12:11):

Hum, I don't have much time but let me check why coqc ignores the flag

view this post on Zulip Enrico Tassi (Nov 24 2022 at 12:14):

Hum coqc.ml calls Stmargs.parse_args, and that one does

    |"-async-proofs-tac-j" ->
      let j = Coqargs.get_int ~opt (next ()) in
      if j <= 0 then begin
        Coqargs.error_wrong_arg ("Error: -async-proofs-tac-j only accepts values greater than or equal to 1")
      end;
      { oval with
        Stm.AsyncOpts.async_proofs_n_tacworkers = j
      }

view this post on Zulip Enrico Tassi (Nov 24 2022 at 12:14):

So I guess I reviewed your code too quickly, this is the flag which should be set to 0

view this post on Zulip Enrico Tassi (Nov 24 2022 at 12:15):

wait, the PR is ok, so why didn't git pull give me the right code

view this post on Zulip Enrico Tassi (Nov 24 2022 at 12:20):

ah ah ah, this is really funny, I did pull, but not recompile. So jump to definition in coqc.ml opened an outdated file in _build/blabla

view this post on Zulip Enrico Tassi (Nov 24 2022 at 12:20):

thank you so much for duplicating sources

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 12:21):

The PR is OK, assuming that options are correctly set. The real issue is that Coq just drops the set of options.

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 12:22):

So all of this stuff that is neatly documented, including all the -async-* variants, are just no-ops.

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 12:22):

(as such, it's not specific to this particular option)

view this post on Zulip Enrico Tassi (Nov 24 2022 at 12:29):

No it is not that simple, the options are passed to the STM, but are passed too late, at new_doc time. It makes no sense, I'll try a fix

view this post on Zulip Gaëtan Gilbert (Nov 24 2022 at 12:32):

I tried at https://github.com/coq/coq/pull/16896
didn't test much though

view this post on Zulip Enrico Tassi (Nov 24 2022 at 12:43):

I don't think it is right, I'll open another PR

view this post on Zulip Enrico Tassi (Nov 24 2022 at 12:48):

here, tested locally: https://github.com/coq/coq/pull/16897

view this post on Zulip Enrico Tassi (Nov 24 2022 at 12:51):

the problem is that the stm options (which are not per doc but global) were also passed to Stm.init_process (don't call my why it was named like that) but that function did not store the value, and the call to Stm.init_core which happens immediately after did not see the flags.

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 12:51):

that's a reasonable hotfix, but shouldn't we at some point get rid of the global flags? in this sense the fix is going the wrong way.

view this post on Zulip Enrico Tassi (Nov 24 2022 at 12:52):

Then the options were also passed to new_doc which would save them in a (single) global place, not document specific

view this post on Zulip Enrico Tassi (Nov 24 2022 at 12:52):

NO we should remove the STM

view this post on Zulip Enrico Tassi (Nov 24 2022 at 12:52):

(from coqc)

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 12:52):

(oh, yeah, that's even better)

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 12:54):

when is that going to happen btw? we're wasting loads of memory because of the STM

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 12:55):

this should be another high-priority topic on our merely existent roadmap

view this post on Zulip Gaëtan Gilbert (Nov 24 2022 at 12:56):

"loads of memory" = ?

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 13:02):

we keep pointers to irrelevant summaries, so nothing good

view this post on Zulip Gaëtan Gilbert (Nov 24 2022 at 13:02):

that's not very specific

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 13:03):

it's hard to tell exactly how much we would save but there was this issue exploring live data after a GC and we kept random pointers to evarmaps

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 13:03):

given that evarmaps can be big...

view this post on Zulip Enrico Tassi (Nov 24 2022 at 13:07):

IMO it not large at all, but it is true it keeps a summary for each lemma

view this post on Zulip Enrico Tassi (Nov 24 2022 at 13:08):

The patch I was trying to write about side effects (and the simple compiler) should get rid of it. But after december 12, now I'm busy

view this post on Zulip Enrico Tassi (Nov 24 2022 at 15:00):

@Pierre-Marie Pédrot @Gaëtan Gilbert would you mind giving https://github.com/coq/coq/pull/16897 a shot and merging?

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 15:08):

this works for me on the trivial examples but this is the kind of code I won't ever trust

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 15:08):

@Gaëtan Gilbert don't you want to take the responsibility for this one?

view this post on Zulip Gaëtan Gilbert (Nov 24 2022 at 15:13):

did https://github.com/coq/coq/pull/16801 make fiat pass? if so then just change it in the new PR to the new mechanism and see how it goes

view this post on Zulip Pierre-Marie Pédrot (Nov 24 2022 at 15:14):

it used to at some point, but we have had the time to break it 50 times in the meantime


Last updated: Apr 19 2024 at 18:01 UTC