The IETF community has long used natural language to describe and specify protocols [...] In many ways this is clearly a successful approach. After all, the Internet exists as an interoperable, global, multi-vendor, network in large part due to the protocol standards the IETF develops.
https://csperkins.org/research/protocol-standards/2022-11-30-usable-formal-methods/
I guess since natural language specs didn't lead to complete disaster all the time, we'll have to keep doing them...
That's why we've been working on hacspec. It now has both a Coq and an SSProve backend.
yeah I hope you can sell hacspec to IETF, but it's still a bit backwards to me (subset of unspecified language that outputs something well-specified)
but I think they could have at least investigated: how much did all the ambiguity and misunderstandings due to natural lang specs actually cost? It seems like a low bar that they actually made some infrastructure work with it.
Hacspec has an independent operational semantics.
but as I think we discussed before, the evolution of the semantics is likely going to be constrained, if not outright dictated, by whatever people check in to the Rust compiler repository on GitHub...
or has anyone said: "Hacspec compatibility with Rust is not a high priority anymore"?
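for anyone who hasn't looked at hacspec: the "restricted Rust" idea is (roughly, as I understand it) that specs are pure, first-order functions over fixed-size data, so the same source runs as ordinary Rust and can be translated to the Coq/SSProve backends. A sketch of the flavor in plain Rust, deliberately not using the actual hacspec_lib types:
```rust
// A rough illustration of the "restricted Rust as a spec language" flavor,
// in plain Rust rather than the actual hacspec_lib types: pure functions
// over fixed-size data, no references or unbounded loops, so the same text
// can run as ordinary Rust and be translated to a Coq/SSProve term.

pub type Block = [u8; 16]; // stand-in for a hacspec byte-array type

pub fn xor_block(a: Block, b: Block) -> Block {
    let mut out: Block = [0u8; 16];
    for i in 0..16 {
        out[i] = a[i] ^ b[i];
    }
    out
}

#[test]
fn xor_with_self_is_zero() {
    // one immediate payoff of an executable spec: cheap sanity checks
    let a: Block = [0xab; 16];
    assert_eq!(xor_block(a, a), [0u8; 16]);
}
```
the point being that the same artifact is both the prose-replacement spec and something you can run tests against.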
Didn't IETF require multiple interoperable implementations, or was that another standard? That doesn't seem a plain "natural language" spec.
Also the link seems about the opposite of that quote since they are exploring more formal methods :grinning:
from the minutes:
What are formal methods and why should IETFers care?
why should we care then if it’s all difficult?
I think the chances of losing this game are overwhelming: whatever FM techniques you introduce or push for, they need to ensure a positive ROI up front compared to whatever old methods you were using, which themselves never had any cost/benefit analysis
so essentially I believe they are setting up this whole "exploration" of FM to fail, and it will be a net negative for ITP and friends
I have the same sense; I further think the skepticism we are reading here is a well-deserved consequence of the PL tradition of zealously pushing for design choices that are better seen as academic simplifications.
for what it's worth I think the RISC-V formal specification competition/process was more of a step in the right direction for real-world formal methods (if we assume an ISA is a protocol, which it sort of is). At least, it ensured almost every popular ITP has a RISC-V ISA spec.
and seemingly few unwilling people were subjected to PL design choices along the way
saw this in the latest IETF formal methods research working group agenda/minutes from March 2023, regarding FM in general:
Why aren't these things already usable?
In my view, completely counterproductive adverse signaling to FM research community.
Looking at the linked talk:
Verifying a mechanical proof is correct can take many hours and cost hundreds or even thousands of dollars.
Uhm?
A successful analysis may find no issues, i.e. that the protocol meets its goals.
Such results are very hard to get published because they are not “novel”, even though they may be valuable to the community.
What community is that?
Naively, I'd also have expected at least lightweight formal methods like model checkers/TLA+ to be already relatively practical here?
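to make "lightweight" concrete: the TLC-style recipe is to enumerate every reachable state of a small model of the protocol and check an invariant in each one. A toy sketch in Rust, with a made-up two-node token protocol whose non-atomic "check, then take" step lets the search find a mutual-exclusion violation:
```rust
use std::collections::{HashSet, VecDeque};

// Toy explicit-state model checking, in the spirit of what TLC does for a
// TLA+ spec: enumerate every reachable state of a small protocol model and
// check a safety property in each one. The protocol is made up for
// illustration: two nodes acquire a shared token via a non-atomic
// "check, then take" step, so the search should find a violation.

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum Pc { Idle, SawFree, Holding }

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct State { pc: [Pc; 2], token_free: bool }

fn successors(s: State) -> Vec<State> {
    let mut out = Vec::new();
    for i in 0..2 {
        let mut t = s;
        match s.pc[i] {
            // step 1: observe that the token is free (but don't take it yet)
            Pc::Idle if s.token_free => { t.pc[i] = Pc::SawFree; out.push(t); }
            // step 2: take the token based on that (possibly stale) observation
            Pc::SawFree => { t.pc[i] = Pc::Holding; t.token_free = false; out.push(t); }
            // release the token
            Pc::Holding => { t.pc[i] = Pc::Idle; t.token_free = true; out.push(t); }
            _ => {}
        }
    }
    out
}

fn mutual_exclusion(s: &State) -> bool {
    !(s.pc[0] == Pc::Holding && s.pc[1] == Pc::Holding)
}

fn main() {
    let init = State { pc: [Pc::Idle; 2], token_free: true };
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();
    seen.insert(init);
    queue.push_back(init);
    while let Some(s) = queue.pop_front() {
        if !mutual_exclusion(&s) {
            println!("invariant violated in reachable state: {:?}", s);
            return;
        }
        for t in successors(s) {
            if seen.insert(t) {
                queue.push_back(t);
            }
        }
    }
    println!("invariant holds in all {} reachable states", seen.len());
}
```
TLC/Apalache obviously do far more than this, but the core loop is the same, which is why I'd have expected it to be within reach for protocol drafts.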
I think we have plenty of workshops/conferences that welcome FM case studies that "find no issues". Of course, the workshops are not tier 1, maybe not even tier 2, but it gets added to the scholarly record. Was surprised when even TACAS (1.5 tier?) "allowed" this.
I gave a presentation there, https://datatracker.ietf.org/meeting/116/materials/slides-116-ufmrg-foundational-end-to-end-verification-of-high-speed-cryptography-00
which I believe was well received. These people will not be writing Coq code, but there's some adoption of hacspec.
these practically minded Coq users might want to look at the history for this topic (with some relevant quotes from minutes) here on the Zulip before jumping in
Thanks for connecting it to this thread. I think there is a genuine interest, but also a trade-off between lightweight methods and fundamental tools, like Coq.
As primarily an engineer, I'd say Formal Methods are great for production, not to mention the automation of production, and I do hope they keep spreading together with an ecosystem of reusable patterns and tools, both theoretical and technical. OTOH, I'd warn you against thinking that "full formality" can be achieved, and for very fundamental reasons: an irreducible informal component (and its expression in natural language) remains on top of the whole endeavour. Even if we make the whole thing strictly formal up to a formal language for requirements, we still need a natural-language explanation of that language and of how to use it, and it's still relative to that informal specification that we ultimately measure the "correctness/soundness/etc." (or otherwise) of our formal models and constructions... (In fact, that "logic" goes beyond just software: I'd contend an informal understanding, rooted in intuition and natural language, remains basic even to pure mathematics.)
isn't this also partly the usual "find more bugs" vs. "verify" mindset [w.r.t. lightweight FM]? Having dabbled in the software engineering / software testing world, I'd say we [FM researchers] are going to lose badly against SE/testing researchers if we play the find-bugs game
Karl, if you are asking me, absolutely not (though tests do have their place, but I am refraining from getting too much into details), whence I do advocate for formal methods in production, indeed I do use them to the maximum extent possible...
my comment was a followup to what Bas said, but I appreciate the comment
here's what I consider a reasonable attempt at justifying formal methods (mainly for software) economically which is not based on bug finding: https://people.inf.ethz.ch/basin/pubs/2023-Gligor-Perrig-Basin.pdf
However, as I've also told one of the authors, I disagree with their conflation of more lightweight formal methods (e.g., Dafny) with ITP-based FM. That is, their model doesn't distinguish the "hardness" of different FM techniques.
for the record, if one's goal is bug finding, I think there are very few techniques that can beat state-of-the-art fuzzing. The jury still seems to be out on whether finding all those bugs by fuzzing leads to better systems overall
@Julio Di Egidio what you're describing is indeed a complete formalization — "this program is 100% bug-free" is a meaningless sentence; "our proof removes the program from what you need to trust" is a much better one, because you must still trust the spec.
Paolo, except that I have rather warned against the idea of "full formality", and with reasons; though, by the way, it is false that "bug free" is unachievable... but that is yet another story.
specification validation is to me another formal methods task (e.g., prove "reduction" lemmas saying that if this spec is wrong then something else is terribly wrong, mutation analysis, ...)
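a miniature of the mutation-analysis flavor of spec validation, with made-up function names: a spec worth anything should reject an obviously wrong implementation, and feeding it a deliberate mutant is a cheap way to check whether it does. Here the candidate spec for `max` is intentionally too weak, and the mutant exposes that:
```rust
// Sketch of spec validation by mutation (names are made up for the example):
// the "spec" for max below is deliberately too weak (only an upper-bound
// requirement), and checking it against a mutant that ignores the input
// and returns u32::MAX reveals the weakness.

fn meets_max_spec(input: &[u32], result: u32) -> bool {
    // candidate spec: result is an upper bound of the input
    input.iter().all(|&x| x <= result)
    // a stronger spec would also require input.contains(&result) when non-empty
}

fn mutant_max(_input: &[u32]) -> u32 {
    u32::MAX // obviously wrong, yet satisfies the weak spec above
}

fn main() {
    let input = [3, 1, 4, 1, 5];
    let r = mutant_max(&input);
    if meets_max_spec(&input, r) {
        println!("spec too weak: it accepts mutant_max on {:?}", input);
    } else {
        println!("spec rejected the mutant (good sign)");
    }
}
```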
as long as you don't try to pursue the "bug-free" mirage, sure
@Julio Di Egidio I understand; I was disagreeing, but it seems the facts we're describing are similar — of course, formal proof cannot rule out bugs in the top spec, even if bad descriptions suggest otherwise.
potentially relevant: BP: Formal Proofs, the Fine Print and Side Effects https://people.eng.unimelb.edu.au/tobym/papers/secdev2018.pdf
(h/t https://mathstodon.xyz/@TaliaRinger@types.pl/110752664541690093)
(to the general discussion of applications and limits of formal methods)
at some point you need to define "bug" or it's going to be a mirage by default. Like in other hard sciences, we can reduce systems to (1) math and (2) empirical laws. The empirical side of the reduction can be pushed almost absurdly far, probably down to very basic physics, and how far to push the math side is seemingly a judgment call.
"bug-free" mirage is just rephrasing "correctness is relative to the spec"
unfortunately, I think most of the software industry will not even concede there is a spec
was about to add "people should just list their TCB and validation, instead of trying to define bug"
re "concede", I can imagine many cases where they're right
but beyond "soft" specs (GUIs, website frontends, ...), there are lots of components which won't have hard specs even if they could, until somebody commits to maintaining one.
I agree "spec maintenance" (or management) is often overlooked. I guess in economic terms, committing to a spec is a liability, e.g., a former hardware engineer I talked to said HW companies are not likely to ever commit to fully formal ISA specs for this reason
basically, you get two avenues for sinking the company (profits): people find issues with the spec you committed to, and people find that your product doesn't follow your own spec
one example that comes to mind is x86-TSO: if I understand correctly, the whole thing was a terrible mess before academics sorted it out, but there was no commitment by Intel to anything (they just provided descriptive English prose), so no bottom line issues
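for concreteness, the kind of behavior the English prose left unclear is the classic store-buffering litmus test: each thread stores to one location and then loads the other, and x86-TSO allows both loads to return 0. A Rust sketch of the test (illustrative only; whether the weak outcome actually shows up in a given run depends on the hardware, the timing, and how the compiler lowers the relaxed atomics):
```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicU32, Ordering::Relaxed};
use std::thread;

// Store-buffering (SB) litmus test, the behavior x86-TSO was formulated to
// pin down: each thread stores to one location and then loads the other.
// Naive "someone must have written first" reasoning says r1 = r2 = 0 is
// impossible, but x86-TSO allows it because stores can sit in a per-core
// store buffer while the loads go ahead.
const RUNS: usize = 10_000;

fn main() {
    let mut weak = 0;
    for _ in 0..RUNS {
        let x = Arc::new(AtomicU32::new(0));
        let y = Arc::new(AtomicU32::new(0));

        let (xa, ya) = (Arc::clone(&x), Arc::clone(&y));
        let t1 = thread::spawn(move || {
            xa.store(1, Relaxed);
            ya.load(Relaxed) // r1
        });

        let (xb, yb) = (Arc::clone(&x), Arc::clone(&y));
        let t2 = thread::spawn(move || {
            yb.store(1, Relaxed);
            xb.load(Relaxed) // r2
        });

        let (r1, r2) = (t1.join().unwrap(), t2.join().unwrap());
        if r1 == 0 && r2 == 0 {
            weak += 1; // allowed by x86-TSO; forbidden by sequential consistency
        }
    }
    println!("observed r1 = r2 = 0 in {} of {} runs", weak, RUNS);
}
```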
if that's a concern, why would you promise to follow the spec unless you've proven you implement it? But that only happens if you view this as one of your topmost priorities
(like Intel after the Pentium fdiv bug)
for hardware, there's no obvious way to present your proof anyway without showing the goods, i.e., circuit designs (to my knowledge).
the Pentium fdiv bug just motivated Intel to _do_ the proof, not _show_ it to anybody?
I guess I don't even know how they defined "proof" (when they probably showed the absence of floating-point bugs in the late 90s)
I don't know if we have more than rumors https://danluu.com/cpu-bugs/ in general — John Harrison published, but I don't see much about "Intel" on https://www.cl.cam.ac.uk/~jrh13/papers/index.html ?
getting a bit off topic, but I heard that in the 2000s/2010s Intel apparently shed the formal methods expertise they had put together after fdiv
that blog post has some (anonymous) quotes on that — and to me, this purported quote captures the tradeoff (even if that weren't accurate):
We need to move faster. Validation at Intel is taking much longer than it does for our competition.
probably the biggest problem is that for a lot of HW (and SW), customers don't even ask for, let alone demand, (unambiguous) specs
so then you can easily come up with an economic justification for the "move faster" stuff. Seemingly the only hope is from the security angle, as per the paper I linked to about FM economics (functional behavior specification/verification is only good as long as it helps make the system "safe"/secure)
RISC-V at least has its golden formal ISA model, but if I find some discrepancy between some manufacturer's RISC-V-implementing HW and that golden model, I'm not holding my breath that anything will happen
@Karl Palmskog About lightweight FM vs verification. I think they are complementary.
The lightweight methods can help to lure developers in and also give a first approximation of the specs. E.g., as it's done in QuickChick.
I think they can be complementary when integrated. But having lightweight methods as a "gateway drug" is not likely to work out in the long term. A lot of these lightweight tools are essentially black boxes with automation that can be very frustrating to tune (e.g., model checkers).
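to make the "first approximation of the specs" point concrete: here's the QuickChick-style loop hand-rolled in Rust (no external crate; the `insert` routine and the little LCG are made up for the example, and `insert` is deliberately buggy). The workflow is to write down the property you'd eventually want to prove, throw random inputs at it, and only then decide whether a mechanized proof is worth the effort.
```rust
// Hand-rolled property-based testing in the QuickChick spirit (no external
// crate; a small LCG stands in for a proper generator). `insert` is a made-up
// routine with a deliberate bug, so the random search finds a counterexample
// to the property we might later try to prove.

/// Intended behavior: insert `x` into an already-sorted slice, keeping it sorted.
fn insert(x: u32, sorted: &[u32]) -> Vec<u32> {
    let mut out = Vec::new();
    let mut placed = false;
    for &y in sorted {
        if !placed && x <= y {
            out.push(x);
            placed = true;
        }
        out.push(y);
    }
    // bug: if x is larger than every element (or the slice is empty),
    // it never gets inserted
    out
}

fn main() {
    // tiny deterministic LCG, just to avoid pulling in a dependency
    let mut seed: u64 = 0xdead_beef;
    let mut next = move || {
        seed = seed
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (seed >> 33) as u32
    };

    // the property we would eventually want to prove: inserting one element
    // grows the length by exactly one
    for trial in 1..=1000 {
        let len = (next() % 8) as usize;
        let mut l: Vec<u32> = (0..len).map(|_| next() % 100).collect();
        l.sort();
        let x = next() % 100;
        let got = insert(x, &l);
        if got.len() != l.len() + 1 {
            println!("counterexample after {trial} trials: insert({x}, {l:?}) = {got:?}");
            return;
        }
    }
    println!("property held on 1000 random inputs (evidence, not a proof)");
}
```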