Stream: Ltac2

Topic: Examples for documentation


view this post on Zulip Patrick Nicodemus (Aug 25 2024 at 21:13):

Hi folks,
I have added some material to the Ltac2 documentation in a Git PR. It's still in draft mode, because I wanted feedback from people before I make it an "official" PR.
https://github.com/coq/coq/pull/19470
Please let me know if you have any comments.

view this post on Zulip Patrick Nicodemus (Aug 25 2024 at 21:31):

I would like to voice a few other things that caused me difficulty when learning the language. Many of these feel obvious in hindsight; at the time of learning Ltac2 I had not used OCaml / ML before. Because these things are elementary to people who have used ML or are experienced in functional programming, it might not be appropriate to include them in the documentation.

I will therefore include some possible documentation comments, and solicit feedback of the form

Here is one point of confusion I faced:

One way to think about this is that thunks are isomorphic to this type: (unit -> 'a) ~ (unit -> exn + ('a * (exn -> 'a)))

This took me a while to understand, and when I figured it out I wished that something like this was in the documentation:

In Ltac2, like other languages in the ML family such as OCaml, there is a world of difference between a value of type () -> bool and a value of type bool. A value of type bool is simply true or false. A value s of type () -> bool represents a computation which will be initiated when the function f is applied to the unique element of the unit type. The computation may take an infinite amount of time and fail to ever return a boolean; it may interact with the external world to print to the terminal or change the value of an element of an array; the computation may be as its return value is determined by external state. Most important here is the fact that the computation may return an exception,
which we then have to handle.

view this post on Zulip Patrick Nicodemus (Aug 25 2024 at 21:59):

Another difficulty I had was that the semantics of Ltac2 is described very loosely as "It lives in a backtracking monad", and I think there is too much burden on the reader here to understand what exactly this means. Personally this went completely over my head and it was only later with some experimentation that I was able to decipher it. I think the documentation could be improved with a simplistic formal model that explains how we can use monads to model backtracking.

Here is my attempt:

Suppose we have two types, AA and EE. We call an "exception tree" of type AA a tree, containing nodes from both AA and EE, satisfying the following properties:
1. All interior nodes of the tree are elements of AA, and all leaves are elements of EE.
2. The children of any element of AA are indexed by elements of EE - in graph-theoretic language, it is a "full E|E|-ary tree". We consider only trees with at least one node; trees containing only a root eEe\in E are called "empty".
As a first attempt at modelling Ltac2 which ignores state and side effects, we can understand that an expression of the language of type A evaluates to yield an element in the type of exception trees of type AA.
In Ltac2, we can access these trees directly within the language. An exception tree of type 'a is either:

  1. empty, and when we read it, it returns an exception; or
  2. nonempty, and when we read it, it returns the root element x of the tree, and a function f which takes an exception e and returns a new exception tree. We can think of the exceptions as keys or indices for different children of x, and f as a dictionary mapping the exceptions to the children. Formally, we have
Ltac2 Type 'a result := [ Val ('a) | Err (exn) ].
val case : (unit -> 'a) -> ('a * (exn -> 'a)) result

When case s is Err(e), we understand that s is the empty tree and e is the reason why it is empty. When case s is Val(a, f), we understand that a is the root of the tree, and (fun e => (fun _ -> f e) maps exceptions to children of a.

The set of exception trees is similar to an algebraic data type, except that because Ltac2 is a Turing complete language, exception trees can be of infinite depth.

view this post on Zulip Patrick Nicodemus (Aug 25 2024 at 22:04):

A small nitpick which I alluded to in this post but want to be more explicit about, is that the documentation suggests in various places that thunks are capable of backtracking, but it it is not just thunks, it is (unevaluated) expressions as well. Focusing on thunks make the distinction easier to talk about, as they're of a different type, but an expression of ground type can display backtracking behavior.

view this post on Zulip Thomas Lamiaux (Aug 25 2024 at 22:18):

First, I think we all agree that Ltac2 needs more documentation, and that would be great to get.

view this post on Zulip Thomas Lamiaux (Aug 25 2024 at 22:19):

(I have not read the PR yet so my comment is only about the examples above.)
Second, I would recommend being careful to the difference between:

With other forms of documentation like

In this regard, imho, it does not seem to me that the explanation on thunks should be in the refman in more than an extra line, as the goal of the refman is not to be a course on functional programming. So it good to have a reminder / a link elsewhere, but not much more.

view this post on Zulip Thomas Lamiaux (Aug 25 2024 at 22:21):

I would like to voice a few other things that caused me difficulty when learning the language

Third, the refman is simply not meant for leaning. Tell me if I am wrong but I have the feeling it is the issue here ? The refman most likely needs improving, but from your comments, it seems to me that what is most lacking here is tutorials and how-to for Ltac2.

Within the Coq Platform Docs project, it is planned to write such kind of documentation.

As of yet, no one is working on it, but as it often asked I'll probably will start working on it soon. If you would like to contribute writing or reviewing, whatever you have the time for, that would be great. I think that such explanations would fit perfectly there, and that the opinion of someone who has not been taught Ocaml as it first programming language would greatly help to make it better.

view this post on Zulip Patrick Nicodemus (Aug 25 2024 at 22:35):

I agree with the proposed differences between the refman and other forms of documentation. These are good points and I'd be happy to contribute some of the material above to more pedagogical and gentle references. I acknowledge that some, but not all, of the points I raised are for pedagogy/learning rather than technical refinement.

On the other hand, in addition to being austere, a refman is comprehensive, whereas what we currently have is an overview which makes several informal allusions and evocative suggestions which like "backtracking lives in a monad" and "If one views thunks as lazy lists, then zero is the empty list and plus is list concatenation, while case is pattern-matching."
I would argue that this is not "refman style" in the sense of not being fully formal; it is informal/'pedagogical' but in a way that is suitable for an expert audience who knows what it means to say that "the side effects and backtracking semantically live in a monad" from experience with, well, general research experience reading papers in PL theory.

FWIW the PR contains mostly examples, which I think are more suitable for hardcore documentation.

view this post on Zulip Patrick Nicodemus (Aug 25 2024 at 22:38):

(I would argue that examples are not just "how-to" guides for practical tasks, they are also illustrations of the language semantics.)

view this post on Zulip Thomas Lamiaux (Aug 25 2024 at 22:46):

I agree with you, both are important and it would be good to add a few examples and clarify sentences. For the refman, I think it is just a question of how much you want to say, provided that the short version is clear, which as you point out is not always the case currently.

view this post on Zulip Patrick Nicodemus (Aug 25 2024 at 22:47):

Yes, it is somewhat of a gradual distinction, at some point it becomes more verbose than a reference manual should be and this interferes with the goal of directly finding the material you are looking for.

view this post on Zulip Thomas Lamiaux (Aug 25 2024 at 22:47):

If you want, I can look at the PR in details when I have the time. (end of the week) Once merged, we can focus on writing tuto and how-to explain how to use it more in practice. First step for that will be understanding what we want to say, and how.
(e.g. you may want a tuto Ltac2 for Ltac'users)

view this post on Zulip Patrick Nicodemus (Aug 25 2024 at 22:48):

Cool! Thanks, I would be grateful if you take a look at it sometime, whenever is convenient.

view this post on Zulip Janno (Aug 26 2024 at 07:32):

Patrick Nicodemus said:

A small nitpick which I alluded to in this post but want to be more explicit about, is that the documentation suggests in various places that thunks are capable of backtracking, but it it is not just thunks, it is (unevaluated) expressions as well. Focusing on thunks make the distinction easier to talk about, as they're of a different type, but an expression of ground type can display backtracking behavior.

I am not sure if my understanding is wrong here or if I am using the wrong terminology but it seems to me that even values can backtrack, not just unevaluated expressions:

From Ltac2 Require Import Ltac2 Printf.
Fail Ltac2 Eval
  let v := Control.plus (fun () => 1) (fun _ => 2) in
  printf "%i" v;
  Control.zero (Invalid_argument None).

This prints

1
2
The command has indeed failed with message:
Uncaught Ltac2 exception: Invalid_argument (None)

v seems to refer to two different, fully evaluated values when it is printed.

view this post on Zulip Janno (Aug 26 2024 at 07:39):

In some sense the function Control.case implies that there are no non-backtracking values, only values with trivial handlers.

view this post on Zulip Patrick Nicodemus (Aug 26 2024 at 07:47):

Janno said:

`v` seems to refer to two different, fully evaluated values when it is printed.

This is a good point. I am not sure why I left that out. Maybe I was thinking along the lines that printf "%i" v is a program fragment, not a program, and v is not backtracking or even well defined in the context of the program fragment; but in the broader program, Control.plus (fun () => 1) (fun _ => 2) is a backtracking expression. I'm not sure about that reasoning.

Janno said:

In some sense the function Control.case implies that there are no non-backtracking values, only values with trivial handlers.

I see what you mean. I think of a "value" (of a ground type, at least) as being something computationally inert, so I would use different language, I had rather thought of it as that there are no non-backtracking expressions, but this is not an important distinction, really, it is just that my point of view is that we cannot directly denote (say) an integer in the syntax but only a "backtracking integer", which is semantically something distinct from an integer. As you say, the syntax 3 denotes a backtracking integer with trivial handler.

view this post on Zulip Pierre-Marie Pédrot (Aug 26 2024 at 08:54):

I read the proposed changes, and I believe the informal semantics is wrong or at least extremely confusing to me. The handler is not part of the computation, as it's only returning streams morally. The handler is instead given by operations like Control.case.

view this post on Zulip Pierre-Marie Pédrot (Aug 26 2024 at 08:55):

@Janno here what is backtracking are not the values, it's Control.plus, which is not a value but a computation.

view this post on Zulip Pierre-Marie Pédrot (Aug 26 2024 at 08:56):

I really think that the only sane mental model for this effect is the lazy stream one. If you start talking about backtracking values and whatnot, extreme confusion will ensue.

view this post on Zulip Pierre-Marie Pédrot (Aug 26 2024 at 08:57):

In particular, the following snippet in the PR is just wrong:

plus takes a computation r and an exception handler h, and runsthe computation r under the handler h. If r () returns a valuea, plus returns a. If r () raises an exception e, plustries to compute h e. If this raises an exception e', plus raisesthe exception e'. Otherwise, if it returns a value a, plus r hreturns a.

view this post on Zulip Pierre-Marie Pédrot (Aug 26 2024 at 08:59):

plus (fun () => v) hwith v a value is never v except if the handler h is the trivial one

view this post on Zulip Pierre-Marie Pédrot (Aug 26 2024 at 08:59):

(i.e. zero)

view this post on Zulip Janno (Aug 26 2024 at 11:05):

Pierre-Marie Pédrot said:

Janno here what is backtracking are not the values, it's Control.plus, which is not a value but a computation.
I really think that the only sane mental model for this effect is the lazy stream one. If you start talking about backtracking values and whatnot, extreme confusion will ensue.

I find that actually working with Ltac2's backtracking pushed me towards the "backtracking value" perspective. It is possible that my perspective is a little warped from generally working in the complete absence of backtracking and then only introducing it at very specific points and usually only for very specific values. At the same time I understand that it is the wrong model since the backtracking is not actually tied to v but instead to the point in the program where a value was computed for v. A function called after v gets assigned a value will observe backtracking even if it is not passed v as an argument. Teaching wrong models will indeed lead to more confusion so I'll withdraw my earlier comment.

view this post on Zulip Patrick Nicodemus (Aug 26 2024 at 15:16):

`@_Pierre-Marie Pédrot|299353 said:

The handler is instead given by operations like Control.case.

Hm, I don't understand, I would not call what case does handling the exception. Handling the exception is doing something with it, in response to it, which is what plus does when you give it a function exn -> 'a. What case does is not handling the exception, it's returning an exception handler so that one can use it as a continuation; but the exception handler is already baked into the computation. I still maintain that in plus r h, h can be regarded as an exception handler.

plus (fun () => v) hwith v a value is never v except if the handler h is the trivial one

I agree this is an oversimplification, because the two program fragments are not equivalent in a larger context. They are only equivalent if h is zero or if the downstream program context they're embedded in runs and terminates without raising an exception, so that the "tail" of the stream is never called. The problem is that I am ignoring the backtracking behavior of Control.plus within the context of the larger program and focusing only on the backtracking behavior of r and h.

However, the lazy stream model is also a simplification, because it focuses on the case where h does not discriminate between exceptions, treating them all interchangeably. If h handles some exceptions but not others, then plus r h does not behave like stream concatenation. (Note that my PR adds an example to the source code to illustrate your lazy stream point, so I understand where you're coming from, I'm not arguing with the utility of that model.)

Ltac2 b e :=
  match e with
  | Not_found => "b"
  | e => Control.zero e
  end.

Ltac2 Eval Control.plus (fun () => Control.zero No_value) b.
(* Error: Uncaught Ltac2 exception: No_value *)

Here is a case where, informally, [] ++ x != x, that is, appending the empty stream (Control.zero No_value) to the beginning of x isn't equivalent to x, it is equivalent to the empty stream, and so the "append" model breaks down. A computation "cons'ed" to the front of the stream can wipe out the entire stream behind it if it raises an exception that isn't handled.

This is why in a comment above, I suggested that a slightly better model for a computation than a stream is an exception tree, where the root of the tree is the initial value that the computation returns, and the labels indicate the next value that it should transition to as a function of the current node. However, it might be a bit unwieldy to use this language in the documentation.

At the very least we should work to agree on some clear common language and use it consistently through the documentation. For example we might like to say "an expression evaluates to a value" and indeed if I write

Ltac2 Eval Control.plus (fun () => "a") (fun _ => "b").

then certainly my terminal prints

- : string = "a"

which would seem to indicate that the expression evaluates to the given value. However, based on Pierre-Marie's response above, he disagrees with my comment that Control.plus (fun () => "a") (fun _ => "b") "returns" "a"; and I can understand this, but you can see why someone would want to use this language.

view this post on Zulip Gaëtan Gilbert (Aug 26 2024 at 15:28):

This is why in a comment above, I suggested that a slightly better model for a computation than a stream is an exception tree, where the root of the tree is the initial value that the computation returns, and the labels indicate the next value that it should transition to as a function of the current node. However, it might be a bit unwieldy to use this language in the documentation.

That is the lazy stream model IMO.

ie there are values (internally ocaml values of type valexpr) and computations (internally ocaml values of type valexpr Proofview.tactic)
values are the usual int, constructor applied to values, closure (ie a function value -> computation), etc
computations are equivalent to type comput = Nil of exn | Cons of valexpr * (exn -> comput) (a lazy stream of values)
expressions (in a lexical context ie a mapping from bound variables to values) evaluate to computations (Tac2interp.interp (after parsing / typechecking / expanding notations etc))
Ltac2 Eval evaluates an expression to the first value it produces, or fails if it doesn't produce a value

view this post on Zulip Gaëtan Gilbert (Aug 26 2024 at 15:33):

and thunks combine with call by value to be able to pass the whole lazy stream, ie ie e is an expression in f (fun () => e), f is called with a value which is basically Closure (fun () => computation of e)

view this post on Zulip Patrick Nicodemus (Aug 26 2024 at 15:39):

Okay, that is fair. It is only a matter of terminology, but to me a "stream" has no branching behavior. This is the only real distinction I'm making.

view this post on Zulip Patrick Nicodemus (Aug 26 2024 at 15:46):

I will point out that the Ltac2 documentation also uses the phrase "lazy list" and a list in my experience has no branching behavior.


Last updated: Oct 12 2024 at 13:01 UTC