Stream: Coq devs & plugin devs

Topic: Async and classification


view this post on Zulip Lasse (Dec 24 2020 at 01:26):

I'm trying to understand the async proof processing of Coq better, and how it works with regard to different classification options of commands. In particular, I'd appreciate any insights into the following questions:

(I'm asking these questions in relation to this breakout session we had during the CUDW: https://github.com/coq/coq/wiki/BreakOut-2020-11-30-Lasse)

Thanks!

view this post on Zulip Guillaume Melquiond (Dec 24 2020 at 07:18):

Regarding Extraction, Print, and Check, it is a matter of whether a command can access an opaque proof. This is hardcoded in Coq.

view this post on Zulip Guillaume Melquiond (Dec 24 2020 at 07:19):

Regarding VtNow and VtLater, it indicates whether the command modifies the parser and thus can change the meaning of subsequent commands.

view this post on Zulip Lasse (Jan 04 2021 at 12:32):

@Guillaume Melquiond Thanks for your answers and sorry for the late reaction (I cannot figure out how to get Zulip to send me emails only of relevant threads). Do you know whom I might ping for the remaining questions here?

view this post on Zulip Guillaume Melquiond (Jan 04 2021 at 12:41):

Presumably @Emilio Jesús Gallego Arias and perhaps @Enrico Tassi too.

view this post on Zulip Enrico Tassi (Jan 04 2021 at 13:10):

I can answer all this, but I'm not so sure I understand what you are trying to do, especially for the questions which have no answer yet. The general picture of all this is here https://hal.inria.fr/hal-01135919 . The "names" are there for the implementation of sentences that should not even be allowed in documents, such as Reset <name>. Lemma terminating commands are not really extensible, even if the classification function lets you think so, and in general misclassifying commands will just not work.

The breakout session is not really helping me, since it is too low level. You want to gather data? In batch mode? In interactive mode? Should it be used/available in batch/interactive? To Coq? To the UI?

view this post on Zulip Enrico Tassi (Jan 04 2021 at 13:13):

I need to understand better the lifetime of the data you are interested in.

view this post on Zulip Lasse (Jan 04 2021 at 13:44):

High-level, the issue we were trying to solve during the breakout was as follows: Currently, Tactician is implemented such that all tactics are classified as side-effecting. This is a hugely evil hack (because by doing this, tactics are replayed a second time at QED-time and I have to do some crazy things to make that work). We were trying to resolve this. The proposal was to create a patch for Coq that allows (1) at Proof-time to move data from summaries to Evd.Store.field's and (2) at Qed-time move data back from fields to summaries.

I was trying to create a first patch, but I think there may be an issue with this, because we are now (1) performing queries when starting a proof and (2) making (summary) side-effects during the VtCloseProof command. I'd like to know if that violates the async model.

An additional issue is that because every proof opening is now classified as a query, they (presumably) would all create async barriers, which basically completely disables async. It occurs to me that I don't actually always need the data from my summaries in proof mode, but rather only when a proof contains the search tactic. So I'm wondering what would happen if I classify the search tactic as a query. Would that break the async model? (I'm guessing yes.) Alternatively, could I make a Search command implemented as a VtModifyProof command that is classified as VtQuery. Would that be valid?

view this post on Zulip Enrico Tassi (Jan 04 2021 at 14:02):

Please tell me what you need, not how you thought to code it. Fiddling with classification is not a good idea, similarly evar-specific data is tricky, I hope it is not the only way to go.

view this post on Zulip Lasse (Jan 04 2021 at 14:14):

What I need concretely, is to read and write to summaries at some points while in proof mode:

Note that I'm already storing much information in Evd.Store.field's for other reasons, and that is working fine. So that should not be a problem.

view this post on Zulip Enrico Tassi (Jan 04 2021 at 15:45):

A tactic can read/write to summaries. Maybe what you don't like is that changes done inside a proof are not kept by Qed.

Now, is the data to be read/write by tactic atoms or just by the whole expressions? (eg tac1 ++ tac2, do you need the data of tac1, tac2 or both?)

view this post on Zulip Lasse (Jan 04 2021 at 15:51):

Yes, indeed I need it to persist past Qed. The writing can be at any point during the proof, since I'm already keeping the data in Evd.Store.field's. At the moment the writing happens at the granularity of whole tactic sentences.

view this post on Zulip Enrico Tassi (Jan 04 2021 at 15:51):

More about Qed: the proof may be run in background, so you can't have Qed need that data.
Is "data gathering" a commutative operation? (if you swap two proofs, what happens?)

view this post on Zulip Lasse (Jan 04 2021 at 15:52):

Whether or not it is commutative depends on the particular version of Tactician. But for now, let's assume that it is indeed commutative.

view this post on Zulip Enrico Tassi (Jan 04 2021 at 15:53):

The other thing is, do you need your state to be aware on Undo or not?

view this post on Zulip Lasse (Jan 04 2021 at 15:53):

Yes, it needs to synchronize with the interactive document

view this post on Zulip Enrico Tassi (Jan 04 2021 at 15:53):

Imagine your state is not a summary (hence undoing a sentence has no effect on it). Is addition idempotent?

view this post on Zulip Enrico Tassi (Jan 04 2021 at 15:54):

Can you say a bit more why?

view this post on Zulip Lasse (Jan 04 2021 at 15:56):

Imagine your state is not a summary (hence undoing a sentence has no effect on it). Is addition idempotent?

Not at the moment, but perhaps it could be in the future

Can you say a bit more why?

My state needs to be the same in batch mode and interactive mode. Otherwise a file that compiled during your interactive explorations might not compile in coqc.

view this post on Zulip Enrico Tassi (Jan 04 2021 at 15:57):

Hum, only if you don't write down the suggestion from search, right?

view this post on Zulip Lasse (Jan 04 2021 at 16:00):

Yes, but knowing Coq users, many will be too lazy to do this :-)

Also, even when navigating back and forth in the interactive document the commands might start to fail (unless addition would indeed be idempotent).

view this post on Zulip Lasse (Jan 04 2021 at 16:01):

Now that I think about it, making addition idempotent would be very difficult.

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:01):

My fear is that the hard requirements are very incompatible with the way the document can be processed interactively.
So either you don't learn from the current file, or you have to soften them (or turn off the STM).

view this post on Zulip Lasse (Jan 04 2021 at 16:02):

So you don't think that the original idea of importing/exporting the data into Evd.State.field's at proof start/end will work? @Emilio Jesús Gallego Arias seemed to think it would.

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:04):

As long as your accumulation function is associative-commutative you should be fine though.

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:05):

The STM needs to be modified to handle arbitrary such state rather than just the global monomorphic universes obviously, but that was discussed during the wg already

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:06):

Yes and no, the universe constraints for the proofs are added to the global state on explicit demand (the wheels in CoqIDE) and until then they are in a lazy (a thunk). In one reads that thunk, then it is blocking.

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:06):

I don't see how that could just work.

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:07):

I mean, having Qed take some data from the evar map and add it to a global store is fine, and would work across processes as well, but if the data is immediately needed by the text following the Qed, then it is a synchronization point.

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:08):

I don't think it's needed, it can be delayed, that's the point of having a partial commutative monoid

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:09):

That means more non-determinism if you're using the tactician solver, but it's already the case due to universes

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:09):

If I got it right, it is not commutative and cannot be. Example:

Lemma training : statement
Proof.
  tactic
Qed.
Lemma work : statement
Proof.
 search (with no hint)
Qed.

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:09):

How can the call to search work if tactic did not run?

view this post on Zulip Lasse (Jan 04 2021 at 16:10):

Having a fully linear document would not be the end of the world. Are you saying that it would work, but just not be async at all?

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:10):

@Enrico Tassi it's already the case, c'mon

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:10):

The result of the evaluation of a document depends on the scheduling order

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:10):

I know, so why are we suggesting it is OK?

view this post on Zulip Lasse (Jan 04 2021 at 16:10):

Well, In my mind Lemma work : statement would be a query, and therefore be a sync point right?

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:11):

@Lasse the problem is the Qed, not the Lemma

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:11):

Everything inside Proof / Qed can be delayed arbitrarily

view this post on Zulip Lasse (Jan 04 2021 at 16:12):

Yes, but if Qed is classified as side-effecting, and Lemma (after Qed) is classified as a query, they will be synced right?

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:12):

"It's complicated."

view this post on Zulip Lasse (Jan 04 2021 at 16:12):

Haha, okay

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:13):

I thought that we were embracing non-determinism when discussing the implementation details at the CUDW

view this post on Zulip Lasse (Jan 04 2021 at 16:13):

Is the "complicated" part about universe unification? Because in that case I don't care. That is extremely unlikely to affect me

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:13):

Yes, let's clarify the hard requirements and the bonus ones

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:15):

Maybe I should propose a silly model and see how hard it breaks.

view this post on Zulip Lasse (Jan 04 2021 at 16:16):

I'd say that the hard requirements are (almost entirely) deterministic compilation, with a fully linear document (completely disabling async if needed). Any amount of async we could achieve would be a bonus.

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:16):

In that case you can forget about async indeed.

view this post on Zulip Lasse (Jan 04 2021 at 16:17):

(As far as I know async only works in CoqIDE right? So I suspect that very few people are actually using it.)

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:17):

Vscoq as well

view this post on Zulip Lasse (Jan 04 2021 at 16:18):

ah okay

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:18):

but we hope to move soon to another backend, without the STM, but still async

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:18):

this is why I want to understand the requirements

view this post on Zulip Guillaume Melquiond (Jan 04 2021 at 16:18):

I might have missed the answer, but is it important that the current document is taken into account (rather than just the imported libraries)?

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:19):

I was about to ask the same question. For demos, I guess so, but in real life?

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:19):

This conversation starts looking way too much like the native compilation scheme issue

view this post on Zulip Lasse (Jan 04 2021 at 16:19):

Yes, it turns out that the most effective learning happens withing a document (because there the most relevant/related proofs are)

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:20):

Is tactician monotonic in its learning state?

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:20):

i.e. if it succeeds on a given state, then it succeeds on a bigger one

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:20):

(disregarding the generated proof term)

view this post on Zulip Lasse (Jan 04 2021 at 16:21):

With the exception of the leafs of proofs. Those sometimes move into an entirely different domain, such as set containment. In that case external libs are more important.

view this post on Zulip Lasse (Jan 04 2021 at 16:22):

@Pierre-Marie Pédrot Depends on the learning agent. With the current released one yes, but only in theory. In practice, the search time might explode into infinity.

view this post on Zulip Lasse (Jan 04 2021 at 16:23):

Regarding requirements: I think that determinism is much more important than the asyncability of Tactician. So if we can just turn async off that would be a good first approach.

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:23):

But then even adding a lemma in the middle of a (already compiled) file can break its compilation

view this post on Zulip Lasse (Jan 04 2021 at 16:24):

Indeed, but that is also true for auto.

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:24):

I'm fine turning off async (or not learning from the current file if async is on).

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:24):

The difference is that auto doesn't use lemmas without an explicit command.

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:25):

Is vscoq even able to run with async turned off?

view this post on Zulip Lasse (Jan 04 2021 at 16:25):

Yes, but it does listen do hint databases. So you might also complain that auto failed just because you added some hint.

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:26):

I am not sure I follow the argument.

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:26):

(barring universes) auto doesn't depend on anything inside a proof / qed block, which is an invariant necessary for async processing

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:26):

(yes, the STM works with async off...)

view this post on Zulip Pierre-Marie Pédrot (Jan 04 2021 at 16:27):

That the STM works that way doesn't mean that it'll properly behave UX-wise

view this post on Zulip Lasse (Jan 04 2021 at 16:27):

The analogy:

  1. Tactician can fail because you add a lemma somewhere in the development
  2. Auto can fail because you add a lemma+a hint

I don't see much difference.

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:27):

you can avoid adding a hint

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:27):

anyway, it's just a default

view this post on Zulip Lasse (Jan 04 2021 at 16:27):

With tactician you can also avoid learning from a lemma:-)

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:28):

you could imagine a #[learn=off]

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:28):

ok then

view this post on Zulip Lasse (Jan 04 2021 at 16:28):

There is already a tactician ignore tac tacticical, and a Unset Tactician Learning option.

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:30):

If we settle on "learning only happens in sequential mode" then I would just have a global summary which is not rolled back by the STM at Qed time. I don't see the point of threading things in the evar map.

view this post on Zulip Lasse (Jan 04 2021 at 16:31):

(barring universes) auto doesn't depend on anything inside a proof / qed block, which is an invariant necessary for async processing

I think we are misunderstanding each other. My argument was about wether or not it is problematic that Tactician can fail after inserting a lemma. I was not making any claim about async.

view this post on Zulip Lasse (Jan 04 2021 at 16:33):

@Enrico Tassi That would be fine with me.

view this post on Zulip Lasse (Jan 04 2021 at 16:35):

Not having async would be a fine practical compromise to me. But it is not yet clear to me why it is theoretically impossible to support async. It seems to me that all one has to do is create sync points between Qed's (where new state is added), and search tactics after those Qed's, where that state is actually used. That would indeed assume commutativity, and perhaps the universes would be slightly different. Commutativity might be a problem depending on the agent. But I doubt that the universes matter.

I'm not saying that is easy, but I don't get why it is impossible?

view this post on Zulip Lasse (Jan 04 2021 at 16:52):

@Enrico Tassi I'm thinking about how to achieve a summary that is not rolled back at Qed time. And I now also remember that the data in my summaries is also added to Libobject, I hope that doesn't invalidate the discussion above. Would the easiest way be to just add a flag indicating that the data should not be rolled back? I suppose the code would need to check that async is off when accepting such a flag...

view this post on Zulip Enrico Tassi (Jan 04 2021 at 16:59):

It's fine even if it is in libobject I guess.

The whole async thing is under the assumption that text between Proof and Qed does not matter. You learn from it, so it does. End of the game. If you accept not to learn from the current file (if the file is processed asynchronously), then it's fine. IMO, this is much better than forcing sequential execution on users of large libraries and it also encourages users to split their development into files which makes sense (and which are learnt from).

view this post on Zulip Lasse (Jan 04 2021 at 18:38):

Not learning from the current file will be really easy. This is just a question of turning off Tacticians proof mode, and then async will be fully functional again, while the search tactic is still functioning, and can still learn from Required files. It is totally fine to me if users want to use it that way, and I guess I can build in some support for that into the tactician utility.

However, making that the preferred mode of operation is not feasible, because Tacticians "proof rate" would take a significant dive. The proof rate is everything for these ML tools because otherwise, you might as well be using auto. So I'm afraid all other concerns are kind of secondary to that. And indeed, it will help if you structure your development with many small files. But for the proof rate that I'm currently measuring this will not matter, because Tacticians benchmarks are run on existing developments that are not optimized in this way.

view this post on Zulip Lasse (Jan 04 2021 at 18:46):

The whole async thing is under the assumption that text between Proof and Qed does not matter. You learn from it, so it does. End of the game.

Okay, I guess that makes sense. I'll give up on that then :-)

view this post on Zulip Matthieu Sozeau (Jan 05 2021 at 09:58):

Pierre-Marie Pédrot said:

That the STM works that way doesn't mean that it'll properly behave UX-wise

You can force synchronous processing in VSCoq and it works well, so I guess that's not an issue


Last updated: Oct 15 2021 at 21:02 UTC