Stream: Miscellaneous

Topic: Coq in the wild

view this post on Zulip Huỳnh Trần Khanh (Feb 24 2023 at 02:44):

A new competitive programmer asked about how we can use proof assistants in contests.

Maybe it's time for us to learn competitive programming and tell those folks it's possible to use proof assistants in contests?

view this post on Zulip Paolo Giarrusso (Feb 24 2023 at 05:51):

Do you have timed competitions? I don't think having the same timings is reasonable, but otherwise why not?

I don't think proof assistants like Coq or Lean are well-suited for reasoning about imperative programs

That's not entirely true, but you'll need fairly nontrivial libraries (like VST) and the experience will likely have a steeper learning curve than Dafny or using extraction.

view this post on Zulip Guillaume Melquiond (Feb 24 2023 at 07:09):

A part of the complexity of using VST comes from the C language. If you use a different imperative language, e.g., OCaml, then a Coq verification (e.g., using CFML) becomes much leaner, on the same level as a proof + extraction (if you do not use imperative features).

view this post on Zulip Huỳnh Trần Khanh (Feb 24 2023 at 08:09):

Most contests that matter are timed, check out the announcements on for details. You can see the rules for a round when you register for the round. In a typical round, you're given 6 problems to solve in 2 hours. You can copy and paste anything on the internet into the solution as long as it is published before the contest. Collaboration is not allowed. Sometimes there are unrated team contests. We can always submit solutions to past contest problems on the website for practice.

view this post on Zulip Huỳnh Trần Khanh (Feb 24 2023 at 08:10):

I guess it might not be feasible for us to compete with formal verification yet, but we can already practice with formal verification. Formal verification forces us to think rationally and this improves our performance in real contests

view this post on Zulip Karl Palmskog (Feb 25 2023 at 12:05):

competing with a proof assistant with a small core/checker when others can use a language/tool of their choice is somewhat like bringing a knife to a gun fight. Why would anyone want to compete at such a severe disadvantage? The other languages/tools have no requirement to show any evidence of anything, let alone something that's separately checkable.

view this post on Zulip Karl Palmskog (Feb 25 2023 at 12:09):

similar thing I saw in a preprint recently: researchers compared monetary cost per line of "verified" code between an (older) proof assistant and some new specialized non-ITP program verification tool, and enthusiastically concluded that verified code is getting cheaper

view this post on Zulip Paolo Giarrusso (Feb 25 2023 at 13:00):

Indeed; a separate category could still make sense in principle, but you'd need more time and participants. Post-competition review is fine.

view this post on Zulip Michael Soegtrop (Feb 27 2023 at 11:10):

@Karl Palmskog : there are of course algorithms which are so trivial, that one neither needs any testing nor FV. As soon as you need a considerable amount of testing, IMHO it starts to be an interesting question if you are faster with testing or with FV - at least if the goal is to deliver a flawless implementation. It gets more interesting if you have only one shot - if your implementation is flawed you are out. People start to invest more time in testing then.

view this post on Zulip Michael Soegtrop (Feb 27 2023 at 11:12):

IMHO the "extra time" estimates floating around frequently compare flawed implementations with flawless implementations. This is only a fair comparison if flawed implementations are acceptable.

view this post on Zulip Karl Palmskog (Feb 27 2023 at 16:26):

I've talked to quite a few people who did programming competitions over the years. The problems are usually set up in such a way that you are 100% right or way off, i.e., there is not much more than basic testing going on before submission.

view this post on Zulip Michael Soegtrop (Feb 27 2023 at 17:15):

Well maybe we should think about problems where this is not the case.

Last updated: Jun 05 2023 at 10:01 UTC