I'm currently struggling to make https://github.com/ppedrot/coq/tree/move-opaque-out-of-kernel work properly, and to this end I am summoning our local STM gurus.
For the unaware, this PR moves the computation of delayed proofs outside of the kernel.
In particular there is now a table in vernac/opaques that is used to store what was before part of the safe env.
I had to store it in the vernacstate because it does not follow a standard summary behaviour.
Essentially it's global, i.e. it's never backtracked upon when exiting sections / closing modules.
Now the problem is that somewhere along the way it gets reset to an old state that is not correct, leading to opaque proofs not being properly stored into the vo file.
vernacstate is functional/sync-with-undo
yes and no, we do modify state imperatively from there.
the problem is not so much imperative vs. functional, it's that the STM manipulates the state and I don't know where.
I've been trying to printf debug the trace of the STM, but this is getting over my head.
a minimal failing example with my PR:
Definition foo : True. Proof. abstract exact I. Defined. Print foo_subproof.
this fails because at the level of the Print, the subproof is not properly declared;
yet, it is globally registered in Declare at the same time as the call to export_private_constants
hum, let me checkout the branch
so, somewhere after the Defined is performed.
the relevant changes are in vernac/, everything else is kernel-related and pure
We now have a global imperative table in vernac/opaques that is kept in sync in the vernacstate in the same way as the other imperative states.
so the culprit is somehow VernacState.unfreeze_interp_state
but I don't understand why it reinstates the correct global env, but not the correct opaque map
in particular, in my trace debug on master, summary unfreezing sets the wrong env after Defined but magically the environment becomes correct at some point.
I'm not sure I follow you idea. To me the opaque_table should not be freezed/unfreezed. Why do you do that? (also, we don't need to send it to workers, so I don't see why it is part of the vernac state.
I think you want a gensym and weak map, and let the GC of ocaml do all the work (if one undoes and makes an opaque proof garbage)
No, because it must be backtracked upon when going back.
No need if you use a weak map
That's a hack IMO.
We already have leaking issues due to the VM with this design pattern.
well, let me check at your code, but IMO a weak map is just the right data structure
And actually, it won't work.
You need to know what to export at the end of the session.
it's in the env (the revstructu of the current library)
A weak map could be an overapproximation if backtrack happens.
you collect the gensyms from that, and query the table
This looks horrible to me.
I want a purely functional API, not yet another pile of mutable state thrown into the mix.
eh eh, you are granted permission to dislike ;-)
so let me see the undo thing
ah, wait a sec;
I think I solved at least part of the problem.
I had a suspicious piece of code resetting the opaque table between Qeds
I am pretty sure I was not the one who wrote this because this is stuff I have no idea about
but removing it makes my bug go away
(might well be the case it will pop up somewhere else in another guise, let's see)
where is it?
in stm.ml line 2268
and a bit above
not that I want to find a culprit, but I think I had a vague recollection of either you or @Emilio Jesús Gallego Arias guiding me to write this snippet
that code makes little sense given what you are trying to do
that I gather
it's unclear what the invariants for Qed are and they might have changed between the first iteration of this PR and now
looks like it was designed for a system where the private constants are kept inside the env
well, if the term is opaque they are inlined
so they are in the env
but here you Defined
the problem here is specifically with Defined
test-suite is running let's see what happens
anyway, something is fishy with that branch. It is the sync one, and we call stm_qed_delay_proof ...
it's neither qed not delayed...
the branch does not touch this part of the code though
it's already like that in master
test-suite still broken but for unrelated reasons
"it's a miracle" ©
OK, thanks anyways and sorry for bothering you for nothing :/
I've made some progress since yesterday, and while the first patch in the PR is functional, I am now hitting issues with the join command of the STM.
in a nutshell, I am not sure what its specifications are.
the last commit of the PR splits the "kernel join" operation into first checking a proof, which produces a certificate (essentially a proof term + constraints + a nonce) and then adding this into the environment.
now the STM join needs to be aware of the synchronization between the kernel env and the stm opaque table, but something is amiss
when joining, the kernel proof is forced (this is good), it is filled (i.e. constraints are added), but then the STM decides to overwrite the opaque table with the old one.
that means that there is a mismatch since the kernel believes the proofs are forced, but the opaque table does not.
it actually highlight the dubious spec of the join operation, since it lives "outside" of a state.
joining changes the global state, i.e. it should introduce a node in the document, but the STM seems to just forget about that
cc @Emilio Jesús Gallego Arias as well
(I have pushed the latest version on my branch if you guys want to have a look)
I'd say that joining morally is more like forcing a "lazy"
so it doesn't introduce a new node
but updates the state of a node
it depends on if you want the univ constraints to be in the env after joining
if no then join is a query
if yes then it does update the state
how to hack that in the current state of affairs, maybe update the node data type so you can mark a node as joined
@Gaëtan Gilbert good point, can the work done in join be reused?
@Gaëtan Gilbert currently it's neither if I read the code correctly
it's an in-place modification that is invisible in the STM
Indeed @Pierre-Marie Pédrot , IIRC , it is just forcing the futures, right?
Tho a message is emitted by the kernel for IDEs
that's how the color changes, correct?
One detail about universes is that the
safe_env contains a list of lazy constraints (which are part of the thunks of lazy proofs).
IIRC join does forces the proofs, then also forces the list and adds all constraints to the env in order to check they are still satisfiable, but does not produce a new env and store it somewhere. IIRC, it has been a long time.
If you join twice, the thunks are not recomputed, but I think the univ are checkd again
I would not add a node to the stm dag to represent this operation, frankly.
@Pierre-Marie Pédrot did you touch that list of lazy constraints? Did you move it outside and put a list of ids pointing to the opaque proof (constraints)?
yes that's basically the design
so you claim we never stored the final constraints anywhere ?
this looks weird to me
(I've adapted the PR to emulate what was happening before but this looks very wrong)
no, they are in the .vo
if you do
.vio -> .vo these constraints are in a dedicated segment
if you do
.v -> .vo I think we join (outside the stm, it was in the coqc driver) and then save the library using the global, updated, safe_env
no, I agree that they are in the vo
I mean that in interactive use we never really get access to the complete env
in CoqIDE, no, the safe_env would be dropped I believe
yeah so except if you mash the "force kernel proofs" button every second you can't be sure you're actually safe
even when the proofs are not gray
well, if you are at the end of the document and you click it, then you are (until you add one more proof)
so this confirms my original suspicion then
the new API somehow forces you to realize that something is not fully checked, I guess we could reflect that in the GUI
if we had a source format:
theory foo imports body end
we could put it in the
end, but we have no such a thing so...
It is a bit like the checks for modules/sections/proofs which are not closed, there is no natural syntactic element to put the error on.
yes, but we could still store it into the VCS
not as a node, yes
as a write barrier, even if it's not written in the document
While talking about the STM, does anyone has an explanation or even a solution to the troubling
exn_on-related double printing of errors, with the second time sometimes in the wrong environment (see #5479 and a couple of related issues).
I'm afraid Hugo that fixing that would require a deep reworking of the internals.
Last updated: Nov 29 2023 at 04:01 UTC