Issues about extraction · Coq devs & plugin devs

Stream: Coq devs & plugin devs

Topic: Issues about extraction

Hugo Herbelin (Mar 05 2023 at 18:44):

The "extraction project" on github happens to be very convenient to have an overview of the extraction issues. Thanks in passing to who contributed to it.

Still in passing, what are the current plans about extraction and universes (e.g. on the MetaCoq side)? On the Paris side, Pierre L. and I are supervising a work on making explicit Prop<=Type subtyping, that is in characterizing what definitional rules have to hold if one replaces subtyping by boxing. This is related to extraction in the sense that the function to enter the box is where the computational contents connecting a type φ[X:=T]:Prop to its version in type φ[X]:Type when X is substituted only afterwards with a T that lowers it in Prop. I made an experimental PR #17302 to test this approach.

Ali Caglayan (Mar 06 2023 at 01:03):

I'm glad somebody found it useful. :)

Matthieu Sozeau (Mar 06 2023 at 07:06):

Hugo Herbelin said:

The "extraction project" on github happens to be very convenient to have an overview of the extraction issues. Thanks in passing to who contributed to it.

Still in passing, what are the current plans about extraction and universes (e.g. on the MetaCoq side)? On the Paris side, Pierre L. and I are supervising a work on making explicit Prop<=Type subtyping, that is in characterizing what definitional rules have to hold if one replaces subtyping by boxing. This is related to extraction in the sense that the function to enter the box is where the computational contents connecting a type φ[X:=T]:Prop to its version in type φ[X]:Type when X is substituted only afterwards with a T that lowers it in Prop. I made an experimental PR #17302 to test this approach.

We have no plans regarding Prop <= Type yet, the whole erasure development works under the assumption that we don’t use this rule. It would be great if we had a proven translation from implicit to explicit Prop <= Type ;)

Bas Spitters (Mar 06 2023 at 08:35):

Regarding meta-coq extraction, we're working with Matthieu to integrate our optimizations on typed extraction:
https://github.com/AU-COBRA/typed-extraction/
Maybe it's useful for this discussion.

Hugo Herbelin (Mar 07 2023 at 08:49):

Yes, this is very relevant for the discussion since somehow, I'm trying to understand how useful it is to work on the existing extraction plugin, i.e. what are the hopes that existing developments using extraction can eventually rely on the metacoq-based extraction and its optimizations, and at which distance in time, and with which kind of linking with the target languages.

At the current stage of my understanding of the differences, it is that there is no modules and no Prop<Type (which are both actually not optimally supported in current extraction) but there is inlining, special extraction for constants, minimal dependency graph computation (and of course proof erasure). Additionally, all that preserves type-checking (and thus interface file generation I guess). And all that is fully certified!

Bas Spitters (Mar 07 2023 at 09:34):

Yes, they absence of modules came up in practice when extracting to the LIGO smart contract language. Since smart contracts are not too big yet, it was possible to work around it.

Maxime Dénès (Mar 07 2023 at 09:38):

Hugo Herbelin said:

what are the hopes that existing developments using extraction can eventually rely on the metacoq-based extraction and its optimizations, and at which distance in time, and with which kind of linking with the target languages.

I have been wondering the same. It would be interesting to hear from the metacoq devs what their roadmap is in terms of articulating with / replacing the extraction plugin.

Hugo Herbelin (Mar 07 2023 at 10:19):

A side question to the specialists: how much the pure (module-free) part of the "miniml" language used in the extraction plugin can be interconnected with $λ_{\Box}^T$ (as presented in the Annenkov, Milo, Botsch Nielsen, Spitters' paper) and how much the different translations to the source or internal language of specific languages (OCaml, Haskell, Scheme, Liquidity, CameLIGO, Elm, Rust, Clight, ...) can be used interchangeably?

Also, among those translations, which ones are certified (including e.g. variable name scoping, which is a difficult part in the extraction plugin)? At least the translation to Clight is I guess?

Bas Spitters (Mar 07 2023 at 15:48):

@Hugo Herbelin lambda_box in metacoq was modelled by Pierre's lambda_box, so I expect it to work.
We've concentrated the optimizations as transformations of lambda_box, precisely so that they would work for all backends. https://arxiv.org/abs/2108.02995
@Matthieu Sozeau is integrating our work into the certicoq pipeline, so it should also work for Certicoq's Clight.
With Jean Pichon I have a student working on a verified wasm backend for certicoq. The translation is done. He's currently working on the proofs.
Printing to OCaml, Haskell, Scheme should hopefully go in the same way as the other languages, i.e. only the final printing part is unverified.
For Scheme, I could even imagine verifying also the final step, as the semantics of Scheme is not too complicated.
@Yannick Forster is working on verified extraction to ocaml.

Karl Palmskog (Mar 07 2023 at 15:55):

instead of Scheme, why not do a proper formally specified ML? (Just saying that would seem to make more sense if something other than OCaml is on the table)

Matthieu Sozeau (Mar 08 2023 at 13:00):

We're going to untyped code and Malfunction, right now and don't have the plan to do the Obj.magic tagging magic. Rather we hope to prove a semantic result allowing to connect to ML code (under some conditions).

Karl Palmskog (Mar 08 2023 at 13:07):

I hope those conditions can be checked somehow

Paolo Giarrusso (Mar 08 2023 at 14:05):

Are you thinking of how to handle { x | x < 3 } -> {y | y > 4}? For decidable predicates, runtime checking would amount to higher-order contract checking/gradual typing (but e.g. Racket people have that solved, maybe modulo overhead)... But you could also translate Coq types to semantic types in the model of an OCaml program logic (I'm thinking of an iris-based one, but less than irisI could well be enough)

Matthieu Sozeau (Mar 08 2023 at 14:08):

They surely won't be decidable in general, but indeed in some cases gradual typing could help.

Matthieu Sozeau (Mar 08 2023 at 14:11):

Personally I'm more interested in the (maybe unusual) setup where you have a Coq program using ML-defined primitives that can be given a strong specification in Coq. But it should also handle use cases like an ADT being entirely extracted from Coq with an abstract-enough interface that we can maintain guarantees no matter what the ML program does with it.

Matthieu Sozeau (Mar 08 2023 at 14:11):

But it' like s a jungle sometimes, ML. It makes me wonder how I keep from going under ;)

Matthieu Sozeau (Mar 08 2023 at 14:14):

Joke apart. Pierre-Marie would tell you that if you extract val f : (nat -> nat) -> nat from Coq, it's really hard to know what you can expect when interacting with ML code, which might call f (fun _ => assert false) for example

Pierre-Marie Pédrot (Mar 08 2023 at 14:16):

Assertion failures are the tip of the iceberg. There are so many more things going on in OCaml...

Pierre-Marie Pédrot (Mar 08 2023 at 14:16):

(En passant I'm actually pretty confident that the exception-raising fragment of OCaml can be lifted reasonably to extraction.)

Paolo Giarrusso (Mar 08 2023 at 15:23):

The RustBelt/NuPRL-style techniques I was hinting earlier should scale here, as long as somebody does the (nontrivial) work to get a program logic for OCaml, but I know such things exist. I'd be less confident on exploiting OCaml types..

Hugo Herbelin (Mar 08 2023 at 15:47):

So, if I read between the lines, the different components (CIC to (untyped) $\lambda_{\Box}$ ; optimizations over $\lambda_{\Box}$ , possibly preserving typing; $\lambda_{\Box}$ to source code or intermediate compilation language) are indeed rather independent and the different components from metacoq, from Annenkov, Milo, Botsch Nielsen and Spitters, and from Pierre L.'s plugin extraction could a priori be mixed together. In particular, regarding the target languages, am I correct that we could in principle have an infrastructure relying on $\lambda_{\Box}$ to which any backends from $\lambda_{\Box}$ to an arbitrary source language or arbitrary intermediate language, whether it is certified or not, could be registered?

Then, an answer to @Maxime Dénès's question would be that even if noone is currently actively working on providing such "hub", that would be a natural wish for a roadmap, right?

Matthieu Sozeau (Mar 08 2023 at 17:10):

@Hugo Herbelin yes and no :) Operationally, yes we can after the erasure to $\lambda_{\Box}$ and Annenkov's et al pipeline get code that translates easily to most functional languages, with the right operational semantics. However for typed target languages it is to be studied on a case by case basis what types can be assigned to the extracted code. Annenkov's et al work preserves some type information but not all of course. It might be necessary in some cases to use that type information in the pretty-printing phase. That's were things are not so clear to me.

Matthieu Sozeau (Mar 08 2023 at 17:10):

Pierre-Marie Pédrot said:

(En passant I'm actually pretty confident that the exception-raising fragment of OCaml can be lifted reasonably to extraction.)

Right, maybe not the best of examples

Matthieu Sozeau (Mar 08 2023 at 17:13):

Paolo Giarrusso said:

The RustBelt/NuPRL-style techniques I was hinting earlier should scale here, as long as somebody does the (nontrivial) work to get a program logic for OCaml, but I know such things exist. I'd be less confident on exploiting OCaml types..

We'd be really interested to see that kind of program logics, and if they are built on top of a typed or untyped ML variant. AFAIK only idealized versions exist, but for example we need true mutual fixpoints in the target and I've never seen that handled. @Yannick Forster is probably more knowledgeable than me.

Hugo Herbelin (Mar 08 2023 at 19:58):

However for typed target languages it is to be studied on a case by case basis what types can be assigned to the extracted code.

I agree, this is one of the parameter of the problem, and each backend registered to such "hub" would have to indicate type requirements to be able to do the export.

In the case of Haskell or Scheme, there is a way to bypass the type-checker, so this should be possible in any case (at least it is with current extraction).

In the case of Clight or Malfunction (which I don't know well), this is also untyped (IIUC), so that should be possible.

In the case of OCaml, the current extraction (mostly) knows how to adjust CIC types to simple types or ML-polymorphic types, but I agree that the general case (e.g. how to adjust to GADT, or System F, or System Fω) is also "research" work.

Bas Spitters (Apr 18 2023 at 10:55):

@Hugo Herbelin given this discussion, would there be an interest in us moving {rust,elm} extraction to a more central place?
https://github.com/AU-COBRA/coq-rust-extraction
https://github.com/AU-COBRA/coq-elm-extraction
( we'd need to wait for the upcoming meta-coq release )

Karl Palmskog (Apr 18 2023 at 12:50):

@Bas Spitters I don't know if Hugo would consider Coq-community central enough, but they'd be welcome there (maybe as a monorepo project)

Bas Spitters (Apr 18 2023 at 14:39):

Thanks! I'll wait for Hugo, but that might be a start.

Hugo Herbelin (Apr 18 2023 at 15:52):

Since it depends on MetaCoq, I would first ask @Matthieu Sozeau. I would say that there are at least two questions. About informing, we need a place to collect the different extraction targets, maybe on the coq website, or on the Coq wiki, or on coq-awesome (which actually already mentions ConCert)? For the maintenance, coq-community looks indeed like a good place.

Karl Palmskog (Apr 18 2023 at 15:54):

I'm willing to list specific extraction projects at Awesome-Coq (ConCert is already listed).

Karl Palmskog (Apr 18 2023 at 15:57):

in my view at least, MetaCoq is its own eco-subsystem inside of Coq's. Until MetaCoq begins to live in Coq's repo itself, MetaCoq ecosystem pieces would be best maintained in Coq-community or alongside MetaCoq

Bas Spitters (Apr 19 2023 at 07:41):

If @Matthieu Sozeau would like to have it as part of certicoq, that would be fine with me too. It seems natural to have a more central place for it.

Karl Palmskog (Apr 19 2023 at 07:43):

to me, it's all about ensuring maintenance, releases, packaging

Karl Palmskog (Apr 19 2023 at 07:44):

seems pretty clear we'd want this in the Coq Platform, for example, meaning labor handling Platform management issues

Bas Spitters (Apr 25 2023 at 14:36):

@Matthieu Sozeau @Yannick Forster We've now updated our extraction to metacoq 1.2, and we think it's good to move it to a more central place (and into CI). Would you like to have it in metacoq, or would the coq community be a better place?

Bas Spitters (May 31 2023 at 14:14):

@Hugo Herbelin or others, to stream line unverified extraction with certicoq, I believe adding extraction constant to certicoq seems to be one of the main open issues. I thought, I'd record the issue here:
https://github.com/CertiCoq/certicoq/issues/24

Last updated: Apr 19 2024 at 17:02 UTC