A new competitive programmer asked about how we can use proof assistants in contests. https://codeforces.com/blog/entry/113179
Maybe it's time for us to learn competitive programming and tell those folks it's possible to use proof assistants in contests?
Do you have timed competitions? I don't think having the same timings is reasonable, but otherwise why not?
I don't think proof assistants like Coq or Lean are well-suited for reasoning about imperative programs
That's not entirely true, but you'll need fairly nontrivial libraries (like VST) and the experience will likely have a steeper learning curve than Dafny or using extraction.
A part of the complexity of using VST comes from the C language. If you use a different imperative language, e.g., OCaml, then a Coq verification (e.g., using CFML) becomes much leaner, on the same level as a proof + extraction (if you do not use imperative features).
Most contests that matter are timed, check out the announcements on https://codeforces.com/ for details. You can see the rules for a round when you register for the round. In a typical round, you're given 6 problems to solve in 2 hours. You can copy and paste anything on the internet into the solution as long as it is published before the contest. Collaboration is not allowed. Sometimes there are unrated team contests. We can always submit solutions to past contest problems on the website for practice.
I guess it might not be feasible for us to compete with formal verification yet, but we can already practice with formal verification. Formal verification forces us to think rationally and this improves our performance in real contests
competing with a proof assistant with a small core/checker when others can use a language/tool of their choice is somewhat like bringing a knife to a gun fight. Why would anyone want to compete at such a severe disadvantage? The other languages/tools have no requirement to show any evidence of anything, let alone something that's separately checkable.
similar thing I saw in a preprint recently: researchers compared monetary cost per line of "verified" code between an (older) proof assistant and some new specialized non-ITP program verification tool, and enthusiastically concluded that verified code is getting cheaper
Indeed; a separate category could still make sense in principle, but you'd need more time and participants. Post-competition review is fine.
@Karl Palmskog : there are of course algorithms which are so trivial, that one neither needs any testing nor FV. As soon as you need a considerable amount of testing, IMHO it starts to be an interesting question if you are faster with testing or with FV - at least if the goal is to deliver a flawless implementation. It gets more interesting if you have only one shot - if your implementation is flawed you are out. People start to invest more time in testing then.
IMHO the "extra time" estimates floating around frequently compare flawed implementations with flawless implementations. This is only a fair comparison if flawed implementations are acceptable.
I've talked to quite a few people who did programming competitions over the years. The problems are usually set up in such a way that you are 100% right or way off, i.e., there is not much more than basic testing going on before submission.
Well maybe we should think about problems where this is not the case.
Last updated: Dec 05 2023 at 05:01 UTC