Hello, I'm Terry Herman. I'm interested in learning Coq to write verifiably correct code. As a beginner, how can I install Coq? What is the best way to learn Coq as a programmer? I am not a mathematician.
My interests: networked applications and security.
If you are a programmer, most of the resources (except maybe the most theoretical) will be appropriate. A common starting place is Software Foundations. And to install Coq, just use one of the binary installers for the Coq Platform. See the download and documentation pages on the Coq website.
Thank you.
To test Coq's code extraction feature, I wrote this in jsCoq:
Fixpoint sum (x : nat) :=
match x with
| 0 => 0
| (S n) => n + 1 + sum n
end.
Compute sum 10.
Require Extraction.
Extraction Language OCaml.
Extraction "sum.ml" sum.
It didn't print anything. Did I do anything wrong? Do I need a local Coq installation instead of using jsCoq?
I based my code on the fine tutorial available at https://learnxinyminutes.com/docs/coq/.
If you use a localCoq, the extracted code will go into a new file "sum.ml". I am not sure how to access such a file via jsCoq - you should ask in the jsCoq stream here.
If you want to work with OCaml, I would recommend to install Coq from sources using the Coq Platform scripts (not the binary installers). This way you also get OCaml and can install packages like CertiCoq (Coq -> C extraction). See (https://github.com/coq/platform#readme).
Great.
Now, I have a related question to ask.
Take this code:
Fixpoint sum (x : nat) :=
match x with
| 0 => 0
| (S n) => n + 1 + sum n
end.
Compute sum 100.
It doesn't run.
jsCoq says Stack overflow.
How can I go about fixing this code?
I have a guess: the nat type uses Peano encoding, so for moderately large numbers memory usage goes up exponentially.
When I spam Alt+Enter I get this error:
Stack overflow.
Stack overflow.
Stack overflow.
Stack overflow.
Stack overflow.
Stack overflow.
Stack overflow.
Stack overflow.
Stack overflow.
Stack overflow.
Stack overflow.
Stack overflow.
Stack overflow.
Stack overflow.
Anomaly
"Uncaught exception Stm.Vcs_aux.Expired."
Please report at http://coq.inria.fr/bugs/.
Stack overflow.
Stack overflow.
Anomaly
"Uncaught exception Stm.Vcs_aux.Expired."
Please report at http://coq.inria.fr/bugs/.
Stack overflow.
Stack overflow.
Terry Herman said:
I have a guess: the nat type uses Peano encoding, so for moderately large numbers memory usage goes up exponentially.
Depends what you call exponentially, but yes, that is the idea. In general the basic definitions available in Coq are geared towards proof, not towards (heavy) computations.
For natural numbers, there is a binary representation available in the standard library (NArith). But of course if you use those you cannot define functions by induction the way you just did, because a binary number does not have that structure any more…
Now, let's say that I just want this to run normally. Is there a way to evaluate my function in Coq? I imagine there must be a command that automatically replaces nat with something more optimized.
Native compute? I'm just guessing.
For your function, the "easy" way is the following:
Require Import Nat Arith BinNatDef NArith.
Fixpoint sum (x : nat) : N :=
match x with
| 0 => 0
| (S n) => (N.of_nat n + 1 + sum n)%N
end.
Compute sum 100.
Here N
is the type of binary natural numbers, and you use that as a return type to have a reasonable representation of you return sum
Now I like it. When writing code that is meant to be extracted, I have to use the more efficient N
instead of nat
right? From this, I have an observation: there is a clear separation between the code extraction world and the proof world. Is that correct?
But in general the problem of relating two different representations of "the same" type is hard, because you want to transport not only functions, but also lemmas, all kinds of fancy dependencies… I think the state of the art is CoqEAL, but it’s definitely not easy
Terry Herman said:
Now I like it. When writing code that is meant to be extracted, I have to use the more efficient
N
instead ofnat
right?
Not really: you can just use nat
and instruct the extraction to extract that to a reasonable binary representation. Then you won’t be able to run heavy computations in Coq, but the extracted code would perform as you expect
The use of N
is if you want to compute inside of Coq
As for the separation between the code extraction and the proof world, I am not completely sure of what you mean, but I would say yes :)
At least, extraction lets you do all kinds of manipulations to transform a not-so-efficient program geared towards formal verification into one much more adapted for actual evaluation
The crux of this being that extraction completely removes all proof-related content, so that you do not need to compute proofs you likely do not care about in the extracted code
But – as in the case of nat
– you can also instruct it to map various types and functions to some more efficient implementations available in OCaml
But when I do so, there is a risk that the mappings I specify can be incorrect. How can I reduce that risk? Or am I overthinking and the risk is just theoretical?
OCaml is not the most interesting language for me. I want to extract code to a more low-level language such as bedrock2. My guess is the logic remains the same. I still need to specify mappings, and wrong mappings can lead to unsound results.
My questions are hypothetical because I am just starting out. If some of the questions make no sense, I apologize in advance.
I said OCaml, but indeed it applies to any target language
And as for your concerns, indeed it’s an issue: the more features of the target language you use, the more vulnerabilities you can introduce. This is where project like CoqEAL, which lets you change representation for different purposes, are very valuable
This is the path to take if you want to avoid using trusted primitives and re-code them in Coq instead
The other solution would be to prove that the primitives you trust indeed behave as you expect, but that would need a formal semantics of your target language. On the top of my head I do not have a real-life example of this, because ofc giving a formal semantics to a real-life language is daunting, but maybe there’s something like this out there that I’m unaware of
And your questions make total sense! These are indeed amongst the difficult issues with extraction
Ah, wait it seems I’ve been maybe too pessimistic (I don’t know much about bedrock/bedrock2): it seems like they have (or intend to have) such a formal semantics and the possibility to prove properties of code
So then I guess the path would be : 1) write down your primitives directly in the low-level language, and prove there that they fulfil their spec 2) once you have those, write the high-level code in Coq, extract it using your primitives
Have you seen support for extraction to bedrock2? It’s a C-like language, and Coq does not support extraction to C… so @Meven Lennon-Bertrand ‘s idea works out once you connect the bedrock2 primitives to OCaml (or another target language), but allows bugs in the FFI; but extracting to bedrock2 does not work, and instead you just write proofs about the bedrock2 code (similarly to how you use C: https://stackoverflow.com/a/46899007/53974)
can bedrock2 really be considered extraction? I don't think it processes general Coq code, just a deep embedding?
I meant to ask the same but I realize my question was ambiguous.
What I should have written: Have you seen evidence that Coq supports extracting to bedrock2?
code like https://github.com/mit-plv/bedrock2/blob/master/bedrock2/src/bedrock2Examples/chacha20.v is indeed not what we call extraction.
Ha, my bad, I really do not know what bedrock it about :upside_down: As to extraction to C, there’s CertiCoq, though!
I'll be the last one to complain about educated guesses :sweat_smile: . And I didn't know about CertiCoq!
However, I'd expect a tradeoff:
if you are writing "systems code", sometimes (in rare cases?) performance requires you to optimize purely functional code — say using destructive updates. Compilers/extraction tools can do that for you sometimes, maybe a lot of times, but not always. The other times, you might need to write the code in an imperative language directly.
(say, Bedrock2 in this example).
You can then prove that code correct using some Hoare logic (nowadays, often separation logic), often by relating that to the functional version.
Can you mix and match nowadays, translating some functions by hand and extracting others, and still compose the proofs? If not, you'd have to choose once for the whole project. Obviously it _should_ be possible, but you need proofs to be modular enough to support it, and you'd need a useful way of describing interfaces between extracted and hand-written code...
CakeML supports the mix-n-match mode, but only for HOL4. You can write some functions in HOL4 and some (say, imperative ones) in the AST of CakeML embedded in HOL4. Then you automatically translate the pure HOL4 ones to CakeML and get a full AST with your manually defined and generated functions. All reasoning, possibly with separation logic, still happens in HOL4, and extends across the whole final AST (with access to generated equivalence proofs for auto-translated stuff).
To my knowledge there is nothing similar to this for Coq with the same small trusted base, although CertiCoq may get there in the future when end-to-end proofs for Coq-to-Clight are completed.
Part of this discussion derailed here.
Thanks for the responses in the Community safety thread. I'm glad to hear that there is a code of conduct that everyone abides by. I am feeling comfortable now.
Now, I'm back with another question. Are there Coq projects that I can contribute to? I am particularly interested in contributing to a project that merges PRs quickly as long as they meet certain quality standards. While I am not a Metamath user, I know that Metamath's primary project is set.mm. The set.mm repository merges PRs very quickly. The number of pending PRs is always single digit.
Also, I prefer to contribute to projects that are permissively licensed. I am not particularly fond of copyleft.
you might be interested in the Coq-community GitHub organization, and in particular its contributing guide. There are several projects there under permissive licenses that accept and merge PRs quickly if you ask, for example Hydras & Co.
We are also looking for maintainers for some projects: https://github.com/coq-community/manifesto/issues?q=is%3Aopen+is%3Aissue+label%3Amaintainer-wanted
Finally, Coq-community has its own Zulip stream.
Thank you for the great answer.
Here is another question that I would like to ask the community.
When I do Coq tutorials, I often write everything in a single file.
However, when working on a project, putting everything into one file is not sustainable.
How do you create a Coq project?
Most Coq projects have multiple Coq files as well as some other supporting files, such as package.json, Makefile and dune.
How do I learn to create a project with all the supporting files?
Some doc is here: https://coq.inria.fr/refman/practical-tools/utilities.html#building-a-coq-project
You may also want to look at https://github.com/coq-community/templates
Also, I see that Coq has a compilation process. Can I create a CI process to automatically compile Coq files every time I commit?
This is also covered by the templates of coq-community
@Terry Herman I recommend looking at a simple project in coq-community for inspiration on build management, for example, this is a simple Coq library: https://github.com/coq-community/jmlcoq
You can use the same boilerplate but modify files like _CoqProject
and even meta.yml
(if you use the templates)
I explain in the README here some of the meanings of each boilerplate file: https://github.com/palmskog/coq-program-verification-template#file-and-directory-structure
Last updated: Oct 01 2023 at 17:01 UTC