While I enjoyed the recent hs-to-coq tutorial (and generally applaud educational stuff like this), some points that came to mind:
- the semantic gap going from the original .hs file into Coq seems huge compared to the one between the Coq program and the corresponding extracted .hs file. In other words, taking the Coq program as semantic ground truth seems much preferable to using the original .hs file.
- since there is no comments section at the blog, it would be interesting to hear @Li-yao comment here on any of these
There's already a link to a repo addressing the last two points :)
For extraction, we thought about it when we were verifying containers. The problem is that our extraction method does not produce syntactically equivalent code, so we have only used it to validate our translation. I cannot recall the exact differences in our extractions, but here's a snippet from our paper (Section 8.4, https://arxiv.org/pdf/1803.06960.pdf):
Although in a certain sense hs-to-coq and extraction are inverses, round-tripping does not produce syntactically equivalent output in either direction. In one direction, hs-to-coq extensively annotates the resulting Coq code; in the other, extraction ignores many Haskell features and inserts unsafe type coercions.
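To make the round trip concrete, here's a minimal toy sketch of the Coq side (my own made-up example, not from the containers work):

```coq
(* Load the extraction plugin and target Haskell. *)
From Coq Require Extraction.
Extraction Language Haskell.

(* A toy definition standing in for translated library code. *)
Fixpoint total (xs : list nat) : nat :=
  match xs with
  | nil => 0
  | cons x rest => x + total rest
  end.

(* Writes Total.hs; diffing it against the original Haskell source
   is the "round-trip" validation discussed above. *)
Extraction "Total.hs" total.
```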
Li-yao said:
There's already a link to a repo addressing the last two points :)
that link was not easy to find; consider putting it in "Related links". Also, I would put the Coq version / hs-to-coq SHA in the blog post itself for self-containment.
I view the size of the TCB as a minor concern for now, when the status quo is that people don't formally verify Haskell programs at all because there is no tooling otherwise.
Yao Li said:
For extraction, we have thought about it when we were verifying containers. The problem is our extraction method does not produce syntactically equivalent code, so we have only used it to validate our translation.
I think I understand the practical issues, but if even the validation of the shallow embedding is via extraction (p. 23 in the paper), well, then I think I would personally opt for extraction still (and cut out the middle man, in some sense).
@Li-yao also, a final bitrot concern: if cloning the hs-to-coq repo is what you go with, it would probably be a good idea to use a Git submodule pinned to a SHA of hs-to-coq known to work (with GHC 8.4 and Coq 8.10.2)
in the absence of packaging on both the Coq and Haskell sides, this is probably what one would converge on for some stability in hs-to-coq based projects currently.
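for example, something like this (a sketch: the upstream URL is my assumption, and the SHA is a placeholder for whatever commit you actually tested):

```sh
# Pin hs-to-coq at a known-good commit via a submodule.
git submodule add https://github.com/plclub/hs-to-coq
git -C hs-to-coq checkout <known-good-sha>
git add hs-to-coq
git commit -m "Pin hs-to-coq at a known-good SHA"
```

anyone cloning with `git clone --recurse-submodules` then gets exactly that version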
I'm hoping for some proper packaging of hs-to-coq to address that :)
I think I would personally opt for extraction still (and cut out the middle man, in some sense).
I feel that's another direction: it's more about implementing trustworthy code than providing more confidence in existing code. I've also thought about something similar, for example: first implement your code in Haskell (because it's more enjoyable to program in Haskell), then verify and refine the program in Coq, and eventually extract it to high-performance code like C or Go. But right now, we focus on verifying existing Haskell code and we haven't done anything along this direction.
well, isn't the main benefit of Haskell to get access to Haskell libraries and idioms? Even then, I think extracting back would be preferable if the aim is trustworthiness. Basically, as evidence you have Letouzey's manual but highly scrutinized extraction proofs, and quite a few of them have now been verified in MetaCoq.
but it's interesting to contrast with the "minimum TCB" verification+compilation path using VST/CompCert:
- run clightgen on prog.c to generate prog.v
- prog.v becomes your program's semantic ground truth
- verify prog.v in VST
- compile prog.v to machine code (this is unfortunately not tooling friendly currently, as far as I know)
- to change the program, edit prog.c and repeat the above

basically, with a shallow embedding, extraction is seemingly the closest you can get to the above (hs-to-coq is your clightgen, use whichever library in place of VST, then extraction+GHC in place of CompCert)
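concretely, the extraction+GHC leg of that path is just something like this (reusing the toy Total.v sketch from earlier in the thread; file names are made up):

```sh
coqc Total.v     # checking the file also runs the Extraction directive, writing Total.hs
ghc -c Total.hs  # GHC then compiles the extracted module
```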
the "other" verification+compilation path with VST/CompCert is to directly compile ccomp prog.c
after VST is happy (which corresponds to not using extraction in the final phase of verification with hs-to-coq
)
both paths are possible for any program verification project (in C or Haskell, respectively), whether you have existing code or not, as far as I can see
Those are good points. My main concern was that using the extracted code would require changes to existing code that depends on it, but certainly a good extraction mechanism would eliminate (or reduce?) this problem. Right now, minimizing the TCB is not our main concern and our current focus is to be able to verify more kinds of Haskell code, but we certainly welcome contributions along this direction to improve the tool!
I see. Well, what would be even better for the minimize-TCB approach is a deep embedding, and I believe you said you're doing something towards that (formalizing GHC)
We did some work on verifying some GHC properties (term substitutions, etc.). Details can be found in this paper: https://arxiv.org/pdf/1910.11724.pdf
But again, we verified those modules not to reduce the TCB but to do pragmatic verification of large-scale existing Haskell code. (In fact, we even grow the formalization gap in this work, and we think that's actually a good thing.)
And we chose shallow embedding also because it's much easier to reason about shallow embeddings in Coq. We think this is important for reasoning about large-scale Haskell code like GHC.