Stream: Coq devs & plugin devs

Topic: OCaml 5.0 patch


view this post on Zulip Emilio Jesús Gallego Arias (Nov 16 2022 at 19:38):

Hi folks, Kate (from OCaml fame) has requested if we could patch Coq to support OCaml 5.0, I think it is pretty easy, we need to:

I don't have time this week, but seems OCaml 5 is being released in 2 week so help is welcome.

view this post on Zulip Emilio Jesús Gallego Arias (Nov 16 2022 at 19:39):

See also https://github.com/coq/coq/pull/15494

view this post on Zulip Karl Palmskog (Nov 16 2022 at 19:59):

there are some other issues as well: https://github.com/coq/coq/projects/55

view this post on Zulip Karl Palmskog (Nov 25 2022 at 16:35):

Just passing on, Kate (mentioned in this topic before, one of the OCaml opam repo maintainers) offered to help with OCaml 5.0 support.

view this post on Zulip Karl Palmskog (Nov 25 2022 at 16:36):

if we want 5.0 support ASAP, maybe https://github.com/coq/coq/issues/13940 should be marked priority: high?

view this post on Zulip kit-ty-kate (Dec 12 2022 at 14:23):

any news on that? The final release is on Friday

view this post on Zulip Karl Palmskog (Dec 12 2022 at 14:26):

latest news seems to be here: https://github.com/coq/coq/wiki/Coq-Call-2022-11-30

OCaml 5.0.
- no-naked-pointers not an issue anymore if we disable native compilation
- retargeting native-compilation to use malfunction might be the way to go

view this post on Zulip Karl Palmskog (Dec 12 2022 at 14:29):

maybe something to add to here: https://github.com/coq/coq/wiki/Coq-Call-2022-12-14

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 14:39):

I think that Coq master almost compiles with OCaml 5.0, you just have to deactivate the deprecation warnings.

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 14:39):

wse should streamline a bit the configure checks and stuff like that but it's reasonable to believe it will reach 8.17 I think

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 16:20):

I'm much afraid that current Coq + OCaml 5.0 is giving consistent SEGV fault that I'm unable to debug; will try a different path tho.

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 16:21):

it works for me locally, though

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 16:23):

Can you try dune build -p coq-core,coq-stdlib,coq ?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 16:24):

The SEGV only shows up for me when compiling in release mode

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 16:24):

See the CI in https://github.com/coq/coq/pull/15494

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 16:25):

Seems related to some very tricky GC / memory behavior, because I can only get it on CI or in batch mode, trying to run the faulty command doesn't trigger the problem

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 16:32):

(btw you can remove the commit that tweaks ephemerons in that PR, because it's useless now, and in addition it's modifying the API)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 16:33):

It's already reverted on that tree

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 16:33):

I'm checking if that commit could be the culprit of the SEGV

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 16:34):

ah, but just remove it from the list of commits then

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 16:35):

To answer to @kit-ty-kate , Coq's main branch should compile now with OCaml 5.0, the work we discussed was completed. How well it runs, that's a different story, it is basically untested. We should do a bench to start with.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 16:35):

Pierre-Marie Pédrot said:

ah, but just remove it from the list of commits then

Yup I will, I wanted to keep it in history while the PR is draft just to be able to inspect that change would we eventually need it.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 16:39):

Yup, still booms:

File "theories/dune", line 3104, characters 1-1198:
3104 |  (rule
3105 |   (targets ConstructiveRcomplete.glob ConstructiveRcomplete.vos
3106 |            ConstructiveRcomplete.vo)
......
3114 |         ../.././Reals/Cauchy/PosExtra.vo
3115 |         ../.././Reals/Cauchy/ConstructiveExtra.vo ../.././QArith/Qminmax.vo)
3116 |   (action (run coqc -boot -R ../../. Coq -I ../../../plugins/ltac -I ../../../plugins/derive -I ../../../plugins/btauto -I ../../../plugins/nsatz -I ../../../plugins/ring -I ../../../plugins/ssr -I ../../../plugins/firstorder -I ../../../plugins/syntax -I ../../../plugins/rtauto -I ../../../plugins/funind -I ../../../plugins/micromega -I ../../../plugins/ssrmatching -I ../../../plugins/cc -I ../../../plugins/extraction -I ../../../plugins/ltac2 -w +default -q -w -deprecated-native-compiler-option -native-compiler off %{dep:ConstructiveRcomplete.v})))
(cd _build/default/theories/Reals/Cauchy && ../../../../install/default/bin/coqc -boot -R ../../. Coq -I ../../../plugins/ltac -I ../../../plugins/derive -I ../../../plugins/btauto -I ../../../plugins/nsatz -I ../../../plugins/ring -I ../../../plugins/ssr -I ../../../plugins/firstorder -I ../../../plugins/syntax -I ../../../plugins/rtauto -I ../../../plugins/funind -I ../../../plugins/micromega -I ../../../plugins/ssrmatching -I ../../../plugins/cc -I ../../../plugins/extraction -I ../../../plugins/ltac2 -w +default -q -w -deprecated-native-compiler-option -native-compiler off ConstructiveRcomplete.v)
Command got signal SEGV.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 16:40):

I cut and paste the command, can't reproduce :S

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 16:40):

I'm gonna run that in a loop

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 16:40):

and add some mem pressure

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 16:41):

ah, I just managed to get a SIGSEV

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 16:43):

is it related to your PR? when I tweaked the few places to make it compile in 5.0 and it worked on the stdlib

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 16:48):

I can reproduce the SIGSEV with 8 cores running but not when calling the compilation by hand

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 16:49):

not sure how to debug that though, I would need to hook gdb somehow

view this post on Zulip Guillaume Melquiond (Dec 12 2022 at 16:51):

Can you check dmesg? Executable generated by OCaml 5 are much heavier wrt virtual memory. So, perhaps your kernel is just dying under the load?

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 16:53):

I'm getting stuff like that:

[23545.033006] coqc[186763]: segfault at 7f80ffc5f000 ip 000055996fad95f2 sp 00007ffce85fe070 error 6 in coqc_bin.exe[55996f3e3000+737000]
[23545.033020] Code: e0 03 48 29 42 08 48 8b 02 48 8b 4a 08 48 39 c1 0f 82 fd 26 00 00 41 8d 44 24 02 49 63 f4 48 98 48 c1 e0 0a 48 05 f7 00 00 00 <48> 89 01 48 8b 42 08 48 c7 40 10 05 00 00 00 49 63 17 48 8d 78 08

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 16:53):

can I objdump the binary to see what lives at that address in some way?

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 16:55):

(I'm getting a weird error when dumping the binary: "objdump: Erreur DWARF: numéro de ligne de section mutilé")

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 16:55):

is the binary corrupted?

view this post on Zulip Guillaume Melquiond (Dec 12 2022 at 16:55):

Just run gdb and type disas 0x7f80ffc5f000.

view this post on Zulip Guillaume Melquiond (Dec 12 2022 at 16:56):

My mistake, it should be disas 0x000055996fad95f2.

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 16:56):

"No function contains specified address."

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 16:57):

maybe a dynlinked piece of code?

view this post on Zulip Guillaume Melquiond (Dec 12 2022 at 16:58):

Make sure that /proc/sys/kernel/randomize_va_space is set to zero first. Otherwise the addresses are completely random. (But it might also be some dynlinked code indeed.)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 16:59):

It is indeed annoying, on the other hand reproduction is "deterministic" with dune build

I'm gonna change the build rules so we wrap every coqc call in gdb

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 16:59):

and hope that I can get to a terminal

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:02):

with randomize_va_space set to 0 I get the same error at the same place

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:02):

but still no function at that address

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:03):

ah, after a bit of linking it does work

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:03):

and tadaaaa

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:03):

it's coq_interprete from the VM

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:04):

(how surprising)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 17:06):

Good that you got some info Pierre-Marie; patch I'm using, not sure it will be enough:

diff --git a/tools/dune_rule_gen/coq_rules.ml b/tools/dune_rule_gen/coq_rules.ml
index 9e4de8cb02..3737a33879 100644
--- a/tools/dune_rule_gen/coq_rules.ml
+++ b/tools/dune_rule_gen/coq_rules.ml
@@ -229,7 +229,7 @@ let module_rule ~(cctx : Context.t) coq_module =
   let deps = cctx.async_deps @ deps in
   (* Build rule *)
   let vfile_base = Path.basename vfile in
-  let action = Format.asprintf "(run coqc %s %%{dep:%s})" flags vfile_base in
+  let action = Format.asprintf "(run gdb --eval-command=run --batch --args coqc %s %%{dep:%s})" flags vfile_base in
   let targets = Coq_module.obj_files ~tname ~rule coq_module in
   let alias = if rule = Coq_module.Rule_type.Quick then Some "vio" else None in
   { Dune_file.Rule.targets; deps; action; alias }

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:07):

not sure I can dissassemble on sight, but I can give the dump of that function on my machine

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 17:09):

Got this on my side

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 17:09):

Program received signal SIGSEGV, Segmentation fault.
0x000055555627ac6a in coq_interprete (coq_pc=0x555556ac5620, coq_accu=93825014219600, coq_atom_tbl=140737074123144, coq_global_data=93825014797848, coq_env=140702641348552, coq_extra_args=93825014715928, coq_extra_args@entry=0) at coq_interp.c:811
811         Coq_alloc_small(accu, nfuncs * 3 + nvars, Closure_tag);

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:10):

the VM closure code strikes again :shrug:

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:13):

same place on my side, we're onto something

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:15):

actually no, I misread: mine is failing on

784             Coq_alloc_small(accu, 2 + nvars, Closure_tag);

but that's a pretty close issue

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 17:17):

I'm hoping to get more info

view this post on Zulip Guillaume Melquiond (Dec 12 2022 at 17:17):

Could you look for caml_process_pending_actions_exn in the file and add Setup_for_gc on the line before and Restore_after_gc after?

view this post on Zulip Guillaume Melquiond (Dec 12 2022 at 17:19):

(The comments in the OCaml interpreter state that "we could get rid of that", but perhaps they are bit optimistic.)

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:21):

nope still a SIGSEV

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:24):

the corresponding OCaml code for setup / restore does a lot more stuff though, does this matter?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 17:28):

Not getting any luck reproducing this ...

view this post on Zulip Guillaume Melquiond (Dec 12 2022 at 17:40):

It does not matter. The OCaml interpreter has to do more work because it needs to safely be able to interrupt itself in case of thread yields. In our case, we are sure that the execution will resume from the point it stopped. (If we ever allow several vm_compute to run concurrently on the same domain through effect scheduling, we might have to reconsider.)

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:40):

hmmm, I blindly copied the macro for Alloc_small from the OCaml master branch and I don't get any more SIGSEV

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:41):

diff --git a/kernel/byterun/coq_interp.c b/kernel/byterun/coq_interp.c
index 0bc79b71ac..847839c65f 100644
--- a/kernel/byterun/coq_interp.c
+++ b/kernel/byterun/coq_interp.c
@@ -115,8 +115,13 @@ if (sp - num_args < coq_stack_threshold) {                                     \
 #define Restore_after_caml_call coq_env = *sp++;

 #if OCAML_VERSION >= 50000
-#define Coq_alloc_small(result, wosize, tag) Alloc_small(result, wosize, tag, \
-  { Setup_for_gc; caml_process_pending_actions(); Restore_after_gc; })
+#define Enter_gc(dom_st, wosize) do {                            \
+    Setup_for_gc;                                                \
+    Alloc_small_enter_GC_flags(CAML_DO_TRACK | CAML_FROM_CAML,   \
+                               dom_st, wosize);                  \
+    Restore_after_gc;                                            \
+  } while (0)
+#define Coq_alloc_small(result, wosize, tag) Alloc_small(result, wosize, tag, Enter_gc)
 #else
 #define Coq_alloc_small(result, wosize, tag) Alloc_small(result, wosize, tag)
 #endif

view this post on Zulip Guillaume Melquiond (Dec 12 2022 at 17:41):

But CAML_FROM_CAML is plain wrong in our case.

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:41):

just reporting random fidgetting with code I do not understand

view this post on Zulip Guillaume Melquiond (Dec 12 2022 at 17:42):

You should at least use CAML_FROM_C.

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:43):

one last round of compilation before I have to go

view this post on Zulip Pierre-Marie Pédrot (Dec 12 2022 at 17:46):

still no SIGSEV with CAML_FROM_C

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 18:48):

Ok, finally got a usable core dump (30Gib tho)

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 18:48):

#0  0x000055555627ac6a in coq_interprete (coq_pc=0x555556ac5620, coq_accu=93825014219600, coq_atom_tbl=140737074123144, coq_global_data=93825014797848, coq_env=140702641353472,
    coq_extra_args=93825014715928, coq_extra_args@entry=0) at coq_interp.c:811
#1  0x000055555627e6a4 in coq_interprete_ml (tcode=<optimised out>, a=1, t=140737074123144, g=93825014797848, e=140702710707168, ea=1) at coq_interp.c:2006
#2  <signal handler called>
#3  0x00005555560abe61 in camlVmsymtable__eval_to_patch_1655 () at kernel/vmsymtable.ml:325
#4  0x00005555560af828 in camlVconv__vm_conv_gen_1361 () at kernel/vmsymtable.ml:350
#5  0x00005555560afa26 in camlVconv__vm_conv_1593 () at kernel/vconv.ml:210
#6  0x00005555560ce231 in camlTypeops__check_cast_2116 () at kernel/typeops.ml:346
#7  0x00005555560d0072 in camlTypeops__execute_2688 () at kernel/typeops.ml:637
#8  0x00005555560d1e88 in camlTypeops__fun_3974 () at kernel/typeops.ml:772
#9  0x00005555561e94d2 in camlCArray__map_i_1162 () at clib/cArray.ml:518
#10 0x00005555560d1e0e in camlTypeops__execute_array_2691 () at kernel/typeops.ml:772
#11 0x00005555560d057c in camlTypeops__execute_2688 () at kernel/typeops.ml:594
#12 0x00005555560d048d in camlTypeops__execute_2688 () at kernel/typeops.ml:628
#13 0x00005555560d048d in camlTypeops__execute_2688 () at kernel/typeops.ml:628
#14 0x00005555560d048d in camlTypeops__execute_2688 () at kernel/typeops.ml:628
#15 0x00005555560d02c3 in camlTypeops__execute_2688 () at kernel/typeops.ml:610
#16 0x00005555560d02c3 in camlTypeops__execute_2688 () at kernel/typeops.ml:610
#17 0x00005555560d02c3 in camlTypeops__execute_2688 () at kernel/typeops.ml:610
#18 0x00005555560d03d2 in camlTypeops__execute_2688 () at kernel/typeops.ml:623
#19 0x00005555560d1e88 in camlTypeops__fun_3974 () at kernel/typeops.ml:772
#20 0x00005555561e94d2 in camlCArray__map_i_1162 () at clib/cArray.ml:518
#21 0x00005555560d1e0e in camlTypeops__execute_array_2691 () at kernel/typeops.ml:772
#22 0x00005555560d057c in camlTypeops__execute_2688 () at kernel/typeops.ml:594
#23 0x00005555560d03d2 in camlTypeops__execute_2688 () at kernel/typeops.ml:623
#24 0x00005555560d1e88 in camlTypeops__fun_3974 () at kernel/typeops.ml:772
#25 0x00005555561e94d2 in camlCArray__map_i_1162 () at clib/cArray.ml:518
#26 0x00005555560d1e0e in camlTypeops__execute_array_2691 () at kernel/typeops.ml:772
#27 0x00005555560d057c in camlTypeops__execute_2688 () at kernel/typeops.ml:594
#28 0x00005555560d02c3 in camlTypeops__execute_2688 () at kernel/typeops.ml:610
#29 0x00005555560d05fa in camlTypeops__execute_2688 () at kernel/typeops.ml:601
#30 0x00005555560d17cf in camlTypeops__build_one_branch_2803 () at kernel/typeops.ml:702
#31 0x00005555561e94d2 in camlCArray__map_i_1162 () at clib/cArray.ml:518
#32 0x00005555560d0d8a in camlTypeops__execute_2688 () at kernel/typeops.ml:707
#33 0x00005555560d048d in camlTypeops__execute_2688 () at kernel/typeops.ml:628
#34 0x00005555560d02c3 in camlTypeops__execute_2688 () at kernel/typeops.ml:610
#35 0x00005555560d02c3 in camlTypeops__execute_2688 () at kernel/typeops.ml:610
#36 0x00005555560d02c3 in camlTypeops__execute_2688 () at kernel/typeops.ml:610
#37 0x00005555560d02c3 in camlTypeops__execute_2688 () at kernel/typeops.ml:610
#38 0x00005555560d1fec in camlTypeops__infer_2976 () at kernel/typeops.ml:785
#39 0x00005555560d6b38 in camlTerm_typing__check_delayed_1855 () at kernel/term_typing.ml:285
#40 0x00005555560ef87c in camlSafe_typing__check_opaque_3370 () at kernel/safe_typing.ml:906
#41 0x0000555555cd1ea8 in camlOpaques__fun_1316 () at vernac/opaques.ml:97
#42 0x0000555555ca2845 in camlFuture__chain_800 () at vernac/future.ml:151
#43 0x0000555555ca2aab in camlFuture__chain_820 () at vernac/future.ml:163
#44 0x0000555555cd1cfa in camlOpaques__declare_defined_opaque_898 () at vernac/opaques.ml:96
#45 0x0000555555cf3fc5 in camlDeclare__declare_constant_inner_9266 () at vernac/declare.ml:441
#46 0x0000555555cf5540 in camlDeclare__declare_entry_core_inner_9343 () at vernac/declare.ml:658
#47 0x00005555561cbab5 in camlCList__map_i_730 () at clib/cList.ml:330
#48 0x0000555555cfeb60 in camlDeclare__finalize_proof_7042 () at vernac/declare.ml:2102
#49 0x0000555555d79447 in camlVernacinterp__fun_2186 () at vernac/declare.ml:2161
#50 0x00005555561d5a38 in camlNeList__map_head_370 () at clib/neList.ml:35
#51 0x0000555555d79379 in camlVernacinterp__interp_qed_delayed_1717 () at vernac/vernacinterp.ml:228

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 18:49):

let me know folks if you need access or me running some gdb command

view this post on Zulip Emilio Jesús Gallego Arias (Dec 12 2022 at 18:51):

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000055555627ac6a in coq_interprete (coq_pc=0x555556ac5620, coq_accu=93825014219600, coq_atom_tbl=140737074123144, coq_global_data=93825014797848, coq_env=140702641353472,
    coq_extra_args=93825014715928, coq_extra_args@entry=0) at coq_interp.c:811
811         Coq_alloc_small(accu, nfuncs * 3 + nvars, Closure_tag);

view this post on Zulip Guillaume Melquiond (Dec 13 2022 at 07:46):

So, from looking at the OCaml code, it seems that the assumption that running the garbage collector actually frees some memory no longer holds. (The following comment was removed: "When caml_garbage_collection returns, we assume there is enough space in the minor heap for the triggering allocation.") So, the garbage collector needs to be hammered in a loop until memory has actually been freed, which is the purpose of Alloc_small_enter_GC_flags.

view this post on Zulip Pierre-Marie Pédrot (Dec 13 2022 at 08:03):

doesn't seem neither efficient nor guaranteed to terminate to me, but well...

view this post on Zulip Guillaume Melquiond (Dec 13 2022 at 08:07):

The good news is that it will automatically "solve" https://github.com/coq/coq/issues/13439 by infinite loop.

view this post on Zulip Pierre-Marie Pédrot (Dec 13 2022 at 09:21):

@Guillaume Melquiond are you writing the PR to fix the code?

view this post on Zulip Guillaume Melquiond (Dec 13 2022 at 09:59):

I can, but only later today.

view this post on Zulip Pierre-Marie Pédrot (Dec 13 2022 at 10:08):

No hurry, it was just to be sure.

view this post on Zulip Karl Palmskog (Dec 14 2022 at 15:17):

I was hoping for a bottom line on the OCaml 5.0 thing - are we going to enable that in opam and so on?

view this post on Zulip Notification Bot (Dec 14 2022 at 15:25):

A message was moved here from #Coq devs & plugin devs > coq calls by Emilio Jesús Gallego Arias.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 14 2022 at 15:26):

Karl Palmskog said:

I was hoping for a bottom line on the OCaml 5.0 thing - are we going to enable that in opam and so on?

Support seems mostly ready modulo testing, there are 2 remaining TODOS:

view this post on Zulip Karl Palmskog (Dec 14 2022 at 15:29):

if nobody else will do it, I guess I can do the extraction fixes PR next week.

view this post on Zulip Karl Palmskog (Dec 14 2022 at 15:37):

@Emilio Jesús Gallego Arias so it is enough to make this package conflict with base-nnp/ocaml-option-nnpchecker? Is that the preferred way? https://github.com/ocaml/opam-repository/blob/master/packages/coq-native/coq-native.1/opam

view this post on Zulip Emilio Jesús Gallego Arias (Dec 14 2022 at 15:52):

Karl Palmskog said:

Emilio Jesús Gallego Arias so it is enough to make this package conflict with base-nnp/ocaml-option-nnpchecker? Is that the preferred way? https://github.com/ocaml/opam-repository/blob/master/packages/coq-native/coq-native.1/opam

That could work yes, but not sure what the official opam-repos policy is, but seems like a good idea to do that for compilers in the 4.x that don't support NP.

view this post on Zulip Karl Palmskog (Dec 14 2022 at 15:53):

perhaps we can ping @kit-ty-kate for advice on what the "proper" way is to make the virtual coq-native package mutally exclusive with OCaml 5.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 14 2022 at 15:53):

I've done the latest two items in https://github.com/coq/coq/pull/15494 , so maybe we can consider merging it.

Still a small tweak seems needed in the source code tho.

view this post on Zulip kit-ty-kate (Dec 14 2022 at 15:58):

yes conflicting with base-nnp and ocaml-option-nnpchecker is indeed the right way to do this

view this post on Zulip Karl Palmskog (Dec 14 2022 at 16:03):

https://github.com/ocaml/opam-repository/pull/22667

view this post on Zulip Karl Palmskog (Dec 14 2022 at 20:12):

looks like after 15494 is merged, the "only" thing left is to get plugins in CI compatible (and I guess there may be library and/or extraction issues as well with some projects)

view this post on Zulip Karl Palmskog (Dec 14 2022 at 20:13):

from opam repo side, we can at least let people test with OCaml 5 coq.dev by removing the nnp conflict, and add a CI job maybe

view this post on Zulip Emilio Jesús Gallego Arias (Dec 15 2022 at 11:18):

@Théo Zimmermann , all the PRs needed for OCaml 5 support are now submitted and placed in the right milestone. They seem pretty minimal to me, so they should be fine for 8.17 , but of course that's the RM call.

view this post on Zulip Karl Palmskog (Dec 16 2022 at 17:33):

this is probably known, but I just tried installing coq.dev from the core-dev opam repo with OCaml 5, and I get this error with coq-stdlib.dev:

dune build  --root . theories_dune ltac2_dune
File "lib/control.ml", line 73, characters 8-19:
73 |         Thread.exit ()
             ^^^^^^^^^^^
Error (alert deprecated): Thread.exit
Use 'raise Thread.Exit' instead.
make: *** [Makefile:123: .dune-stamp] Error 1

Will this be fixed by something like https://github.com/coq/coq/pull/15494 or am I missing something?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 16 2022 at 17:34):

Seems like your OPAM file is bogus (building with -warn-error)

Can you point out to which one your opam is using concretely?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 16 2022 at 17:35):

Sorry never mind what I said, yes, this needs a patch for the bootstrap

view this post on Zulip Emilio Jesús Gallego Arias (Dec 16 2022 at 17:35):

let me submit it

view this post on Zulip Karl Palmskog (Dec 16 2022 at 17:35):

https://github.com/coq/opam-coq-archive/blob/master/core-dev/packages/coq-stdlib/coq-stdlib.dev/opam

view this post on Zulip Emilio Jesús Gallego Arias (Dec 16 2022 at 17:35):

and yes it is fixed in the CI PR

view this post on Zulip Emilio Jesús Gallego Arias (Dec 16 2022 at 17:52):

@Karl Palmskog https://github.com/coq/coq/pull/17010

view this post on Zulip Emilio Jesús Gallego Arias (Dec 16 2022 at 17:52):

thanks for noticing that bit, was the only fix not submitted as separate PR

view this post on Zulip Karl Palmskog (Dec 16 2022 at 17:53):

thanks, I'll test everything again after it (hopefully) gets merged

view this post on Zulip Karl Palmskog (Dec 17 2022 at 07:57):

tried PR 17010 and started playing around with a bunch of plugins and extraction projects. Everything seems to work so far, all I got were some additional deprecation warnings for the extraction projects.

view this post on Zulip Karl Palmskog (Dec 17 2022 at 11:52):

now coq.dev from the core-dev opam repo works fine to install on 5.0.0! :tada:

(I hope that 8.17 backports are still on the table.)

view this post on Zulip Enrico Tassi (Dec 17 2022 at 16:23):

If master works then I believe we can run a bench

view this post on Zulip Théo Zimmermann (Dec 19 2022 at 10:24):

@Karl Palmskog all the PRs currently merged in master and targeting 8.17 (I think this is the case for all OCaml 5-compatibility PRs) have been backported to v8.17 except coq/coq#17010, which is in the process of being backported (but given that it just fixes a warning, I don't think it is required for compatibility). So in principle, coq.8.17.dev should be as compatible with OCaml 5 as coq.dev right now.

view this post on Zulip Karl Palmskog (Dec 19 2022 at 10:25):

without 17010 the installation of coq.8.17.dev will break

view this post on Zulip Théo Zimmermann (Dec 19 2022 at 10:25):

OK. Expect this PR to be pushed to v8.17 by the end of today.

view this post on Zulip Karl Palmskog (Dec 19 2022 at 10:26):

so maybe you can ping me in to that backport PR and I will update coq.8.17.dev in opam archive when merged

view this post on Zulip Théo Zimmermann (Dec 19 2022 at 10:28):

I can ping you when the PR is backported. There is no "backport PR" FTR, but you can use https://github.com/coq/coq/projects/56 or https://coq.zulipchat.com/#narrow/stream/240008-GitHub-notifications/topic/Coq.20Push.20Notifications to track what is backported FYI.

view this post on Zulip Karl Palmskog (Dec 19 2022 at 10:32):

OK, then I guess you'll either ping me, or I check later and do the opam archive PR and ping you.

view this post on Zulip Karl Palmskog (Dec 19 2022 at 10:33):

maybe a good idea to add an OCaml 5 job in opam archive at the same time.

view this post on Zulip Théo Zimmermann (Dec 19 2022 at 14:05):

@Karl Palmskog OK, that last PR is backported now.

view this post on Zulip Karl Palmskog (Dec 19 2022 at 20:46):

looks like we are open for business (have working CI job) now in testing packages with 8.17+rc1 and OCaml 5.0.0 in the opam archive

view this post on Zulip Karl Palmskog (Dec 20 2022 at 08:37):

the bench job honestly looks a bit disheartening. Based on this, most Coq projects would be better off staying on OCaml 4 if they could (no Coq project in CI saw an improvement, and most are 10-20% slower).

Or am I reading it wrong? Will stuff like flambda, etc., change the picture?

view this post on Zulip Paolo Giarrusso (Dec 20 2022 at 08:58):

Don't you want flambda for both OCaml versions? 10-20% is the speedup I'd expect from it

view this post on Zulip Paolo Giarrusso (Dec 20 2022 at 08:59):

Sorry: it's just in the right ballpark, but it could be tons of other things...

view this post on Zulip Karl Palmskog (Dec 20 2022 at 09:00):

I was thinking there was a chance that flambda could affect 4 and 5 differently.

view this post on Zulip Paolo Giarrusso (Dec 20 2022 at 09:03):

(skims log) I sed, that's comparing 4 and 5 both without flambda. Yeah, "both with flambda" would be more interesting.

view this post on Zulip Karl Palmskog (Dec 20 2022 at 09:03):

looks like 4.14.1 is also out, maybe this is what we will be using for Coq for the next few years? https://github.com/ocaml/ocaml/releases/tag/4.14.1

view this post on Zulip Karl Palmskog (Dec 20 2022 at 09:05):

another theory: the heap tuning stuff in Coq might be off since they seemingly changed memory management quite a lot.

view this post on Zulip Paolo Giarrusso (Dec 20 2022 at 09:06):

Possible, and/or 5.0.0 still has immature performance but that'll be fixed soon.

view this post on Zulip Paolo Giarrusso (Dec 20 2022 at 09:14):

https://ocaml.org/news/ocaml-5.0 has specifics on "immature performance" in the intro

view this post on Zulip Karl Palmskog (Dec 20 2022 at 09:15):

hmm, the only thing I see on performance is:

OCaml 5 as a language is fully compatible with OCaml 4 down to the performance characteristics of your programs. In other words, any code that works with OCaml 4 should work the same with OCaml 5.

view this post on Zulip Karl Palmskog (Dec 20 2022 at 09:16):

which, if we take the bench at face value, is not the case here in the Coq sphere

view this post on Zulip Pierre-Marie Pédrot (Dec 20 2022 at 09:30):

has anybody opened an OCaml issue on the topic? We have some data points with the bench.

view this post on Zulip Paolo Giarrusso (Dec 20 2022 at 09:31):

Consequently, OCaml 5.0.0 is expected to be a more experimental version of OCaml than the usual OCaml releases. [...]
the performance of ephemerons is currently (and temporarily) strongly degraded.

view this post on Zulip Karl Palmskog (Dec 20 2022 at 09:31):

I guess I could do it [open an issue], but maybe some Coq core dev with connections to the OCaml team would be a better choice

view this post on Zulip Karl Palmskog (Dec 20 2022 at 09:32):

does Coq even use ephemerons?

view this post on Zulip Paolo Giarrusso (Dec 20 2022 at 09:36):

seems so but I'd really rather defer to sb with more clue: https://github.com/coq/coq/blob/110921a449fcb830ec2a1cd07e3acc32319feae6/clib/cEphemeron.ml wraps Ephemeron and that's used a few times https://github.com/coq/coq/search?q=cephemeron

view this post on Zulip Pierre-Marie Pédrot (Dec 20 2022 at 09:44):

CEphemeron are not quite ephemerons, rather it's a weak table.

view this post on Zulip Pierre-Marie Pédrot (Dec 20 2022 at 09:44):

(Yes, this is basically the same thing, but the performance profile should be distinct.)

view this post on Zulip Pierre-Marie Pédrot (Dec 20 2022 at 09:46):

I don't think it would be enough to produce the differences we see in the bench. Rather, I'm siding with the people who believe there is some GC option that went awry.

view this post on Zulip Karl Palmskog (Dec 20 2022 at 09:47):

do we want to summon someone from the OCaml team here maybe? Gabriel has commented before, and might at least advice on next steps.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 20 2022 at 10:06):

I've forwarded the results to the multicore team in Slack

view this post on Zulip Pierre-Marie Pédrot (Dec 20 2022 at 14:13):

The top slower lines in the bench are Qeds, which put a heavy pressure on hashconsing. Given that we implement hashconsing with weak tables (and thus internally ephemerons) the slowdown might come from there.

view this post on Zulip Enrico Tassi (Dec 20 2022 at 20:32):

My understanding is that ocaml 5, if you don't take advantage of parallelism, is expected to be slightly slower. Which make sense, there is no miracle.

view this post on Zulip Michael Soegtrop (Dec 21 2022 at 08:47):

I wonder if things like CoqHammer become easier with OCaml 5.0. E.g. the reason it is currently not supported on Windows is that it relies on forks to run different reconstructions tactics in parallel starting from the current state and then kills all except the first successful one. Would OCaml 5 help with that?

view this post on Zulip Karl Palmskog (Dec 21 2022 at 08:49):

I think you would need fine-grained thread support in Coq (application level) first?

view this post on Zulip Karl Palmskog (Dec 21 2022 at 08:51):

but I think threads/tasks inside a language are generally considered much more portable/manageable than processes...

view this post on Zulip Michael Soegtrop (Dec 21 2022 at 10:13):

The reason CoqHammer forks is that it wants to do explore from the current state into different directions. The par: support likely does something similar, so I guess there is some way to "fork the current state" on thread level in Coq already. @Enrico Tassi : can you share your insights into this?

view this post on Zulip Gaëtan Gilbert (Dec 21 2022 at 10:23):

par: is processes too, not threads

view this post on Zulip Karl Palmskog (Dec 21 2022 at 10:27):

to get threads into Coq like Isabelle, I think you'd have to do something equivalent to what Makarius did, e.g., in
Shared-Memory Multiprocessing for Interactive Theorem Proving

view this post on Zulip Gaëtan Gilbert (Dec 21 2022 at 10:32):

most tactics don't touch global state so running tactics is almost thread safe, but there are exceptions, eg

view this post on Zulip Karl Palmskog (Dec 21 2022 at 10:33):

ah, but the plugins are a different story, no? So to be on safe side, you'd probably want to mark tactics as parallel-safe or not

view this post on Zulip Rodolphe Lepigre (Dec 21 2022 at 11:25):

Enrico Tassi said:

My understanding is that ocaml 5, if you don't take advantage of parallelism, is expected to be slightly slower. Which make sense, there is no miracle.

The release notes say: "OCaml 5 as a language is fully compatible with OCaml 4 down to the performance characteristics of your programs", so purely sequential programs are expected to be exactly as fast as before.

view this post on Zulip Karl Palmskog (Dec 21 2022 at 11:27):

as per the bench, the performance part is not even close to on par to regular 4.14 for Coq (some projects had >= 40% slowdown)

view this post on Zulip Michael Soegtrop (Dec 21 2022 at 11:36):

I see. Well I would appreciate if it would be easier to write plugins like CoqHammer in a platform independent way (since all the platform dependent stuff goes to my desk). If one moves to OCaml 5, one should put this on the agenda.

view this post on Zulip Karl Palmskog (Dec 21 2022 at 12:17):

I think the OCaml multicore effort took 8 years of work, hopefully it might be less for Coq

view this post on Zulip Michael Soegtrop (Dec 21 2022 at 12:24):

IMHO good automation is essential for practical use of proof assistants and efficient platform independent development of plugins is essential for automation. So this is not just a "nice to have" thing.

view this post on Zulip Karl Palmskog (Dec 21 2022 at 12:25):

I completely agree it should be a priority to get fine-grained thread parallelism into Coq. I tried to collect state of the art in proof assistant parallel processing in this ISSTA'18 paper. Some things may have happened since then, but I believe Lean 4 did not do proof-parallel processing, only file level

view this post on Zulip Enrico Tassi (Dec 21 2022 at 13:06):

Rodolphe Lepigre said:

Enrico Tassi said:

My understanding is that ocaml 5, if you don't take advantage of parallelism, is expected to be slightly slower. Which make sense, there is no miracle.

The release notes say: "OCaml 5 as a language is fully compatible with OCaml 4 down to the performance characteristics of your programs", so purely sequential programs are expected to be exactly as fast as before.

I don't recall my source, and I'm sure they did the best they could to be as fast as before, but adding features can hardly come at no cost. IMO 10% slower would still be more than acceptable (I'd pay much more for fair scheduling, which seems to be part of the deal here). Unfortunately the benchmarks are a bit worse than that (so far, tuning seems still in progress). But don't get me wrong, I think 5.0 is a very impressive piece of work.

view this post on Zulip Karl Palmskog (Dec 22 2022 at 13:55):

even with the same GC approach in 4.14/5.0, typical projects get a slowdown of 15-30% in the latest bench. That's a pretty high current price for the possibility of Coq-multicore improvements in the future...

view this post on Zulip Emilio Jesús Gallego Arias (Dec 22 2022 at 15:20):

I'd be cautious about trying to infer any useful from the current bench numbers, there are still far from final.

view this post on Zulip Enrico Tassi (Dec 22 2022 at 15:26):

The branch goes back to default GC settings, I guess to measure a baseline. These settings are know to be much worse on 4.x, so looking at these numbers is surely misleading. I guess we will need to wait for @Emilio Jesús Gallego Arias to do his measurements and write a summary.

view this post on Zulip Emilio Jesús Gallego Arias (Dec 22 2022 at 18:36):

Yes, in particular OCaml 5 GC still needs to get the porting of some nice 4.x GC features that indeed didn't make the cut for the 5.x release; I understand this is nothing intrinsic to multi-core, just that the rewrite didn't manage to catch up yet with 4.x development

view this post on Zulip Emilio Jesús Gallego Arias (Dec 22 2022 at 18:37):

The only baseline we have for Coq is the PLDI multicore paper

view this post on Zulip Emilio Jesús Gallego Arias (Dec 22 2022 at 18:37):

I mean the only baseline that has been properly verified.

view this post on Zulip Karl Palmskog (Dec 22 2022 at 20:14):

which paper is that more precisely?

view this post on Zulip Emilio Jesús Gallego Arias (Dec 22 2022 at 20:35):

IIRC "Retrofitting Parallelism onto OCaml" and "Retrofitting Effect Handlers onto OCaml"

view this post on Zulip Emilio Jesús Gallego Arias (Dec 22 2022 at 20:38):

Coq benches are only in the latter (PLDI 2021) but indeed they are against stock OCaml

view this post on Zulip Karl Palmskog (Dec 22 2022 at 20:40):

I guess you mean this and this

view this post on Zulip Guillaume Melquiond (Dec 23 2022 at 10:12):

Emilio Jesús Gallego Arias said:

this is nothing intrinsic to multi-core

One of the fundamental design choices of Multicore OCaml was that memory allocation should not require any synchronization between domains (unless memory is exhausted). In particular, none of the OCaml 4 memory allocators (including the "best-fit" one used by Coq) are suitable for OCaml 5 since they require synchronization of their free lists (or equivalent). So, OCaml 5 comes with its own naive allocator. And it has to be naive since every domain is responsible for sweeping its own allocations, so it has to be able to track them cheaply. I am sure that the OCaml developers will be able to make it less naive as time goes, but this is not just a matter of porting a few changes from OCaml 4, it is a complete reimplementation.

view this post on Zulip Karl Palmskog (Dec 23 2022 at 13:47):

I find it a bit difficult to reconcile this with the claim that "OCaml 5 as a language is fully compatible with OCaml 4 down to the performance characteristics of your programs". Seems like this is bound to lead to many letdowns (and criticism).

view this post on Zulip Karl Palmskog (Dec 23 2022 at 13:49):

why didn't they just say something like that the performance will be worse than OCaml4 in some cases [not just ephemerons], but is expected to improve with additional work on GC?

view this post on Zulip Enrico Tassi (Dec 23 2022 at 14:57):

I find the sentence a bit ambiguous, but even at the current state, I don't find the slowdown unacceptable for a first release. It is not that you cant use Coq if it is 20% slower, that is how I interpret the sentence.
IMO it is good you can finally run it, and do proper measurements. Nobody forces us to switch to 5.0 for the platform, for example, so they have time to improve it while we still get the same perf in "production".

view this post on Zulip Paolo Giarrusso (Dec 23 2022 at 17:16):

agree with both: it's acceptable for a first release, and good enough for testing, and it'll disappoint users who ship it in production

view this post on Zulip Emilio Jesús Gallego Arias (Dec 24 2022 at 01:35):

Guillaume Melquiond said:

Emilio Jesús Gallego Arias said:

this is nothing intrinsic to multi-core

One of the fundamental design choices of Multicore OCaml was that memory allocation should not require any synchronization between domains (unless memory is exhausted). In particular, none of the OCaml 4 memory allocators (including the "best-fit" one used by Coq) are suitable for OCaml 5 since they require synchronization of their free lists (or equivalent). So, OCaml 5 comes with its own naive allocator. And it has to be naive since every domain is responsible for sweeping its own allocations, so it has to be able to track them cheaply. I am sure that the OCaml developers will be able to make it less naive as time goes, but this is not just a matter of porting a few changes from OCaml 4, it is a complete reimplementation.

Indeed I'm not sure the 4.x changes being ported to OCaml 5 (such as prefetching) will help a lot, but there is a backlog there, if I understood OCaml devs correctly, and as noted in the PR.

For now, the causes of slowdowns over 10% are not well understood and indeed, the first step is to check the backlog. And indeed, if you look at the latest results it does matter, tho still not the root cause of the problem for the large slowdowns

view this post on Zulip Karl Palmskog (Jan 08 2023 at 11:58):

OK, I think I now have a better intuition for some of the OCaml 5 bench results. Some projects like fourcolor have a much smaller slowdown than others in OCaml 5, and it isn't clear why. However, fourcolor happens to be one of the most file-level parallelizable projects out there (the other one is Verdi Raft), which makes it able to "catch up" in real time to OCaml 4 using several cores. So I'd conjecture that the OCaml 5 slowdown is consistently in the 30% range with no parallelism

view this post on Zulip Paolo Giarrusso (Jan 08 2023 at 12:39):

Does that mean the slowdown is smaller in real time, but should be the same in user time?

view this post on Zulip Karl Palmskog (Jan 08 2023 at 12:40):

that's what I see locally, at least. A signficant increase in user time for all projects, but only modest real time increase for highly parallelizable projects

view this post on Zulip Paolo Giarrusso (Jan 08 2023 at 12:41):

but more reliable user time measurements are hard to get for some reason, I guess

view this post on Zulip Karl Palmskog (Jan 08 2023 at 12:42):

every measure so far is very unscientific. I hope to get access to a 16-core Ryzen 9 soon and will probably run some more informal but more robust benchmarking there

view this post on Zulip Gaëtan Gilbert (Jan 08 2023 at 12:43):

the bench runs with -j1 (except for the stdlib) so I don't understand why file level parallelism matters

view this post on Zulip Karl Palmskog (Jan 08 2023 at 12:58):

The observation wasn't only based on the bench. But, for example this bench took around 10 min for Verdi Raft. You'd need super fast single core to do the whole of Verdi Raft in 10 min, CI with multicore regularly takes 14 min.

view this post on Zulip Gaëtan Gilbert (Jan 08 2023 at 13:12):

I don't understand your argument, that bench certainly ran with -j1 (you can see the command opam ran in https://gitlab.com/coq/coq/-/jobs/3496212740/artifacts/file/_bench/logs/coq-verdi-raft.NEW.1.1.perf)
CI only uses -j2 and also includes some deps in the job (see https://github.com/coq/coq/blob/master/dev/ci/ci-verdi_raft.sh)

view this post on Zulip Gaëtan Gilbert (Jan 08 2023 at 13:14):

also 5.0 is about intra file parallelism not inter file, how would inter file parallelism change?

view this post on Zulip Gaëtan Gilbert (Jan 08 2023 at 13:14):

(and the coq version benched does not exploit the new threading capabilities)

view this post on Zulip Karl Palmskog (Jan 08 2023 at 13:17):

on my current 2019-era Ryzen running around 3.5Ghz, time make -j1 for Verdi Raft (not including its dependencies) takes around 16 min on OCaml 5 Coq. So if we take those bench numbers at face value, doesn't seem plausible they are actually done with a single core.

view this post on Zulip Gaëtan Gilbert (Jan 08 2023 at 13:29):

the bench machines are https://www.intel.com/content/www/us/en/products/sku/191043/intel-xeon-e2246g-processor-12m-cache-3-60-ghz/specifications.html

view this post on Zulip Paolo Giarrusso (Jan 08 2023 at 14:38):

The numbers in https://gitlab.com/coq/coq/-/jobs/3496212740 claim to be about user time, so they can't be affected by this "scheduling" theory

view this post on Zulip Paolo Giarrusso (Jan 08 2023 at 14:40):

OTOH, PMP claimed the slowdown affects mostly Qed times, which matter differently for different projects.

view this post on Zulip Paolo Giarrusso (Jan 08 2023 at 14:53):

Also I wouldn't bet on different machines having the same slowdown: if those Qed are memory-bound, a better memory subsystem might lessen the impact.

view this post on Zulip Paolo Giarrusso (Jan 08 2023 at 14:56):

In general, a single-core program could use multiple cores for parallel GC, and JVMs have that option; that might explain how j1 isn't really single-threaded. But user time would account for that, and Guillaume's explanation suggests that effect should not matter, by design?

view this post on Zulip Guillaume Melquiond (Jan 22 2023 at 07:36):

So, I did some perf-testing. The slowdown seems to come for the most part from dynamic linking. The code was rewritten for OCaml 5 (presumably so that plugins can be loaded concurrently from any domain) and it is now quadratic in the number of plugins. Since Coq is .cmxs-heavy, if you have lots of small .v files, performances crash down.

view this post on Zulip Pierre-Marie Pédrot (Jan 22 2023 at 08:50):

This does not seem to correspond to the per-line diff from the bench, though. Did you test with native-compiler enabled? If so this would explain why cmxs become an issue.

view this post on Zulip Pierre-Marie Pédrot (Jan 22 2023 at 08:51):

Forget about it, I just remembered that native compilation is disabled in Coq + OCaml 5.0.

view this post on Zulip Pierre-Marie Pédrot (Jan 22 2023 at 08:52):

Still, slowdowns are concentrated in Qed lines and those don't perform dynlink without native compiler on.

view this post on Zulip Guillaume Melquiond (Jan 22 2023 at 08:57):

Most .cmxs are loaded through the prelude. I doubt the bench measures the time needed to load the prelude.

view this post on Zulip Guillaume Melquiond (Jan 22 2023 at 08:58):

Stephen Dolan confirms my understanding of the code: "The trouble is that the OCaml 5 dynlink logic is fundamentally O(n^2) as it rebuilds the entire frame descriptor table each time, while the OCaml 4 version is O(n) as it mutates the existing one in-place."

view this post on Zulip Pierre-Marie Pédrot (Jan 22 2023 at 09:03):

Indeed the per-line diffs are not observing the prelude loading (but the total time does). But still, the number of dynlinked cmxs is not that big (I know that because these days the debugger stops on every dynlink) so even a quadratic blowup shouldn't that terrible.

view this post on Zulip Guillaume Melquiond (Jan 22 2023 at 09:03):

Apparently, it is quadratic in the number of modules rather than plugins.

view this post on Zulip Pierre-Marie Pédrot (Jan 22 2023 at 09:04):

Ah. That's definitely worse, indeed.

view this post on Zulip Guillaume Melquiond (Jan 22 2023 at 09:05):

Quoting again: "It's much worse: it rebuilds the table once per module contained in each loaded .cmxs, because it needs a populated table between the initialisation of any two modules."

view this post on Zulip Pierre-Marie Pédrot (Jan 22 2023 at 09:08):

any hope to do better? I don't think Coq is the only OCaml development relying on dynlink

view this post on Zulip Guillaume Melquiond (Jan 22 2023 at 09:32):

In fact, the problem was originally detected with Frama-C: "the startup time for Frama-C is about 1.2s longer, which, for short benchmarks, can increase execution by 300% or 400%."

view this post on Zulip Gaëtan Gilbert (Jan 22 2023 at 11:29):

I know that because these days the debugger stops on every dynlink

put set break_on_load off in your $HOME/.ocamldebug

view this post on Zulip Gaëtan Gilbert (Jan 22 2023 at 11:32):


we can see the increase in eg https://coq.gitlab.io/-/coq/-/jobs/3496212740/artifacts/_bench/html/coq-hott/theories/Basics/Notations.v.html and https://coq.gitlab.io/-/coq/-/jobs/3496212740/artifacts/_bench/html/coq-hott/theories/Basics/Overture.v.html

view this post on Zulip Gaëtan Gilbert (Jan 22 2023 at 11:32):

(hott is -noinit so the declare ml module commands are in the files and appear in the line diff)

view this post on Zulip Emilio Jesús Gallego Arias (Jan 22 2023 at 15:13):

Nice find @Guillaume Melquiond ; indeed what I call "document creation time" is often far from trivial; I measure it with coq-lsp.

Maybe we should add a patch so we emit a time sentence for the prelude loading? That's quite easy to do.

view this post on Zulip Emilio Jesús Gallego Arias (Jan 22 2023 at 15:14):

Just wrap the call to Stm.new_doc

view this post on Zulip Emilio Jesús Gallego Arias (Jan 22 2023 at 15:24):

PR done.

view this post on Zulip Enrico Tassi (Jan 22 2023 at 19:13):

Do we gain something by wrapping all plugins? I guess in that case each plugin is one module, but I m not sure

view this post on Zulip Gaëtan Gilbert (Jan 22 2023 at 19:17):

that's not how wrapping works

view this post on Zulip Gaëtan Gilbert (Jan 22 2023 at 19:17):

and we already wrap plugins

view this post on Zulip Enrico Tassi (Jan 22 2023 at 19:56):

so wrapping is not using -pack?

view this post on Zulip Gaëtan Gilbert (Jan 22 2023 at 20:08):

correct

view this post on Zulip Enrico Tassi (Jan 22 2023 at 22:17):

coq_makefile does, funny dune does not. I'm wondering if there is an option for it, since it could mitigate this problem.

view this post on Zulip Guillaume Melquiond (Jan 23 2023 at 06:41):

While there is only one module left after packing, there are still as many frame tables as in the unpacked case (and in fact, one more, for the packed module itself). In other words, packing is really just a concatenation of modules nowadays. So, it would not help reducing the number of frame tables (which is the actual thing that matters for dynlink performances).


Last updated: Oct 13 2024 at 01:02 UTC