I just saw this vulnerability announcement: https://www.lightbluetouchpaper.org/2021/11/01/trojan-source-invisible-vulnerabilities/ (also see https://blog.rust-lang.org/2021/11/01/cve-2021-42574.html).
As far as I know this attack has been folklore in the Coq community for a long time. I even have some commented-out code in Alectryon to turn off bidi reordering:
/*
CoqIDE doesn't turn off the unicode bidirectional algorithm (and PG simply
respects the user's `bidi-display-reordering` setting), so don't turn it off
here either. But beware unexpected results like `Definition test_אב := 0.`
.alectryon-io span {
    direction: ltr;
    unicode-bidi: bidi-override;
}
*/
I remember learning about this on Coq-Club a long time ago, possibly from a post by Makarius Wenzel. I'm almost certain I even copied the example in Alectryon's code from that post. Does anyone else remember it?
Is this the thread you are referring to?
https://sympa.inria.fr/sympa/arc/coq-club/2016-03/msg00094.html
ah, isn't this another argument in favor of going full ASCII?
There are so many things that can go wrong with Pollack inconsistency that I don't think banning non-ASCII is going to be very effective...
Laurent Théry said:
Is this the thread you are referring to?
https://sympa.inria.fr/sympa/arc/coq-club/2016-03/msg00094.html
Thanks Laurent! It's almost that, but I don't think so: the thread I had in mind was specifically about bidi reordering. In the meantime, though, other articles have popped up pointing out that it wasn't just known in the Coq community, but basically everywhere (https://research.swtch.com/trojan).
Karl Palmskog said:
ah, isn't this another argument in favor of going full ASCII?
The article at https://research.swtch.com/trojan fairly convincingly argues that this is not an issue compilers should handle, but something IDEs should highlight instead (more than they already do with syntax highlighting).
@Clément Pit-Claudel but consider the following scenario: an organization [possibly malicious] has formalized some specification and verified a program which they want audited by experts for rubber-stamping. Should the experts insist on ASCII? I think I would, if I were the auditor, since one will likely look at the code not only through an IDE but also via other means like cat
Clément Pit-Claudel said:
Laurent Théry said:
Is this the thread you are referring to?
https://sympa.inria.fr/sympa/arc/coq-club/2016-03/msg00094.html
Thanks Laurent! It's almost that, but I don't think so: the thread I had in mind was specifically about bidi reordering. In the meantime, though, other articles have popped up pointing out that it wasn't just known in the Coq community, but basically everywhere (https://research.swtch.com/trojan).
what about this one (last attempt :wink:)? https://sympa.inria.fr/sympa/arc/coq-club/2017-03/msg00042.html
Laurent Théry said:
what about this one (last attempt :wink:)? https://sympa.inria.fr/sympa/arc/coq-club/2017-03/msg00042.html
Boom! You did it :) Thanks so much! How did you find it??
@Karl Palmskog I think that merely reading a Coq script to understand what it does is a fairly vain endeavour. There is a reason why we have Set Printing All and a lot of interactivity in Coq.
You're never safe from elaboration doing weird things.
(I am not even talking about universe instances.)
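For instance, here is a minimal illustration (assuming only the standard ZArith), where the same surface text means something other than what a casual reader might expect, and only Set Printing All (or stepping through interactively) shows the elaborated term:

Require Import ZArith.
Open Scope Z_scope.

Check 2 + 2.
(* 2 + 2 : Z   -- reads like the familiar nat addition, but it isn't *)

Set Printing All.
Check 2 + 2.
(* Z.add (Zpos (xO xH)) (Zpos (xO xH)) : Z *)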
Karl Palmskog said:
Clément Pit-Claudel but consider the following scenario: an organization [possibly malicious] has formalized some specification and verified a program which they want audited by experts for rubber-stamping. Should the experts insist on ASCII? I think I would, if I were the auditor, since one will likely look at the code not only through an IDE but also via other means like cat
It's a tricky case, because problems are so easy to hide that I don't think expert audits can do more than help already-honest organizations avoid mistakes. Here is one process that I would moderately trust if I were to outsource a Coq development (in fact, this is pretty much what I do currently with student submissions for homework): I write the spec myself, I only accept a development that proves exactly that spec, and I check the result with coqchk instead of reading the code.
Of course coqchk isn't actually trustworthy, and it takes too long to run, so this is again mostly for assumed-honest third parties.
But note that it doesn't involve anything related to ascii or unicode: I never really look at the code (well, if it's homework, I do: I need to give feedback on it ^^)
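A minimal sketch of the shape I have in mind (the sorting spec and all names here are purely illustrative, not what I actually hand out): I only write the interface, the other party provides a module implementing it, and coqchk checks the result instead of me reading their source.

From Coq Require Import List Sorted Permutation.

(* Auditor-written interface: the only file the auditor has to read. *)
Module Type SORT_SPEC.
  Parameter sort : list nat -> list nat.
  Axiom sort_sorted : forall l, Sorted le (sort l).
  Axiom sort_perm : forall l, Permutation l (sort l).
End SORT_SPEC.

(* The audited party must supply some [Module Impl : SORT_SPEC];
   the auditor runs coqchk on it rather than reading the proofs. *)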
at least Makarius gave his UTF-8-vs.-ASCII opinion in that Coq-Club post:
My conclusion: formal prover sources should be stored as plain ASCII only. Front-ends might show that with the help of Unicode, but in case of doubts, it should be always possible to look at the plain 7-bit ASCII text without any special tricks.
Cross-ref to the previous long topic on this issue in this Zulip: https://coq.zulipchat.com/#narrow/stream/237662-VsCoq-devs.20.26.20users/topic/VScode.20prettify.20symbols
@Clément Pit-Claudel how true is it that coqchk is slow? I thought so too, but IME it’s almost constant-time (with a large constant)
OTOH, “write my own spec” only works if that’s much easier than writing the proof. Which is not always the case.
Paolo Giarrusso said:
Clément Pit-Claudel how true is it that coqchk is slow? I thought so too, but IME it’s almost constant-time (with a large constant)
I don't mean to badmouth coqchk, and what I wrote above wasn't completely accurate: in my experience, it's pretty good. There are two problems I've had with it: reduction strategies (IIRC coqchk doesn't have vm_compute), and the fact that it doesn't cache results, so it reruns on everything for each invocation. For my own research projects it's been mostly OK (in some cases the former has been an issue). For student submissions that I compile and run one by one in a container, having to recheck most of the standard library every time has been a showstopper.
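(For concreteness on the reduction-strategy point, here is a toy sketch; any proof of this shape leaves a VM cast in the term, which a checker without vm_compute has to re-verify with its ordinary reduction, and that gets painful as the computation grows.)

Goal 6 * 7 = 42.
Proof. vm_compute. reflexivity. Qed.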
For student submissions that I compile and run one by one in a container, having to recheck most of the standard library every time has been a showstopper.
Have you heard of -norec?
https://coq.github.io/doc/master/refman/practical-tools/coq-commands.html#compiled-libraries-checker-coqchk
I guess one can have a rule of thumb that says, "write your own spec for every important problem, and only accept proofs that implement your spec", but that probably becomes infeasible quite quickly.
Next step: curated databases of audited specs? Then you could say, I won't look at the thing until you reduce it to the one in version X.Y of trusted spec DB. I guess this is what homework usually is, minus the DB version.
... If you have a standard enough problem. Do you have examples?
the obvious example is a finite set implementation: you can model the operations abstractly using predicate sets, so the "spec" is just describing the effect of, say, add in terms of predicate sets
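e.g., a minimal sketch with illustrative names, where the whole spec is membership behaviour:

(* Finite sets of nats specified against predicate sets: nothing about
   the implementation is visible, only the effect on membership. *)
Module Type FINSET_SPEC.
  Parameter t : Type.
  Parameter In : nat -> t -> Prop.
  Parameter empty : t.
  Parameter add : nat -> t -> t.

  Axiom In_empty : forall x, ~ In x empty.
  Axiom In_add : forall x y s, In x (add y s) <-> x = y \/ In x s.
End FINSET_SPEC.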
ah, that makes more sense than e.g. seL4's spec
OTOH, even for finite sets, implementations in Java and Scala don't have the same spec.
sure, there is probably no way to capture seL4's complete spec. But for example, it would be nice to have a trusted definition of linearizability of state machine operations for use in proving correct implementations of Paxos, Raft, etc. (distributed state machine replication)
in an audited spec library, one would probably have a collection of specs for each problem, and reductions between them
Linearizability is interesting; work by @Ralf Jung and others on Iris ends up proposing alternatives; reductions exist but in limited cases, and I can't comment intelligently on the limitations...
I think it's a bit of a dilemma in formalization in general. The way we gain confidence is by relating our stuff to other people's stuff, but nobody wants to maintain "bridges" between different formalizations. So the audited spec library "idea" is a way of sidestepping that by providing incentives for several projects to bridge to a single library instead of each other.
anyway, my general bias is that in 100% of my current work, the technology for stating lots of the specs comes from the very forefront of research. OTOH, I could believe that your examples are much more "first-order" and mine is a special case, so you could set it aside...
Based on the "algorithm-program" distinction mentioned in https://semantic-domain.blogspot.com/2018/04/are-functional-programs-easier-to.html, it might well be that algorithm verification avoids all of those problems.
I have to say I don't usually buy "my spec is the contribution" claims in general, at least as an excuse not to make any connections from the novel spec to other formalizations. I think there are many cases where people can do something similar to what odd-order does here: https://github.com/math-comp/odd-order/blob/master/theories/stripped_odd_order_theorem.v#L205-L208
an imaginary auditor might say: "So your spec is so state of the art there is nothing comparable? Well, call me back when you can compare it to something."
Depends. What about the paper on perfectoid spaces? Their claim is exactly "defining the object is the contribution".
Of course, I agree it's useful to come up with a small spec — but that's indeed hard work.
pure math is a bit different to me, I think the slogan has been that a result is correct because it gave someone a Fields medal
s/is correct/matters/ :-)
Kevin [Buzzard] used this example about correctness in the math community, regarding the odd order formalization, in this recent interview. The gist, I believe, is that of course odd order was correct even before Gonthier et al., because it resulted in a Fields medal
Karl Palmskog said:
sure, there is probably no way to capture seL4's complete spec. But for example, it would be nice to have a trusted definition of linearizability of state machine operations for use in proving correct implementations of Paxos, Raft, etc. (distributed state machine replication)
Stay tuned for the next FCSL paper (fingers crossed) :)
ah, then I might namedrop: https://github.com/uwplse/verdi-raft/blob/master/raft/Linearizability.v
@Karl Palmskog in a conversation with mainstream mathematicians, I'd agree with Buzzard that Fields medal proofs are likely correct, for a suitable definition of correct. In this venue, we'd say "correct" with scare quotes :-)
Anyway, that’s a sidetrack.
Karl Palmskog said:
at least Makarius gave his UTF-8-vs.-ASCII opinion in that Coq-Club post:
My conclusion: formal prover sources should be stored as plain ASCII only. Front-ends might show that with the help of Unicode, but in case of doubts, it should be always possible to look at the plain 7-bit ASCII text without any special tricks.
Cross-ref to the previous long topic on this issue in this Zulip: https://coq.zulipchat.com/#narrow/stream/237662-VsCoq-devs.20.26.20users/topic/VScode.20prettify.20symbols
prettification has its own problems though. I think there are perfectly reasonable ways to use Unicode in Coq, and it can help make things easier to read, for the same reason we use concise notation on paper (obviously I am biased since Iris is full of Unicode ;)
Coq's notation mechanism is powerful enough that restricting to ASCII doesn't fix enough problems to make it worth the readability cost, IMO.
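for instance, a plain-ASCII notation can mislead just as well as any Unicode trick (a deliberately silly sketch):

(* 100% ASCII, yet "+" in this scope is subtraction. *)
Notation "x + y" := (Nat.sub x y)
  (at level 50, left associativity) : confusing_scope.

Open Scope confusing_scope.

Eval compute in 2 + 1.
(* = 1 : nat, even though the source reads like an addition *)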
Karl Palmskog said:
I think it's a bit of a dilemma in formalization in general. The way we gain confidence is by relating our stuff to other people's stuff, but nobody wants to maintain "bridges" between different formalizations. So the audited spec library "idea" is a way of sidestepping that by providing incentives for several projects to bridge to a single library instead of each other.
I think the problem with this is that many "interesting" specs are at the level of library APIs, not whole-system state machines. To define such a spec you need to choose the logic you express the spec in; plain Coq predicates are not expressive enough. E.g., the standard separation logic spec of a spinlock is actually higher-order, by abstracting over the user-defined lock invariant (and that invariant, crucially, is not just a "pure" Coq Prop, it can express ownership). So I don't think you can easily have such a library that would work across logics. We could have such a library in Iris, but I doubt people will find that sufficiently general. ;)
Just so that @Karl Palmskog doesn’t get the wrong impression, if you have a whole-system state machine safety spec, I expect Iris should let you prove that (using Iris adequacy). But that can easily be worse than the seL4 spec.
what do you mean by "can be worse than the seL4 spec"?
But yes, Iris of course can 'close up' a proof once all the system pieces are assembled together. That is not the interesting part, though; we've had complete program logics (that can in principle prove anything correct about a closed system) for a long time. But a lot of progress has been made recently in being able to specify and verify components of a system and plug them together, and to my knowledge there is no obvious notion of 'completeness' here that would let us prove that we have found the final answer to the question of how to express library specs.
"worse than the seL4 spec" was very poor phrasing — I meant "more infeasible than". It was in the context of this reply:
Karl Palmskog said:
sure, there is probably no way to capture seL4's complete spec.
Of course, the seL4 authors _have_ formalized the seL4 spec — but I guess this just means the spec is too big for this project. Whole-system state machine specs are likely bigger.
I come a bit late to this because I assumed "Pollack inconsistency" was some piece of mathematics sufficiently far away from my area of interest.
IMHO the problem is not when specs are written by one person - of course an individual can write a spec they trust - but when specs are so complicated that they are written by larger teams, where one cannot 100% guarantee that there is no one with adversarial intentions on the team.
Another issue I see is that industrial environments might use somewhat exotic tools for review processes, where it might take a while until one can guarantee that this kind of vulnerability is not possible.
Regarding "notations alone are bad enough": yes, but adversarial use of this would require to sneak in the adversarial notations as well, which I think is harder than using this Unicode trick.
@Michael Soegtrop I do not agree that Pollack inconsistency only strikes when there is adversarial behaviour in the dev process. I have had many issues with code I was writing alone, where the Coq elaborator behaved in unpredictable ways, resulting in things that looked fine but were actually not. A prime example of adversarial behaviour of Coq itself is the typeclass resolution mechanism, which can give weird results in the presence of non-deterministic instances.
More generally, any process that suffers from emergent behaviours due to sheer complexity (notably unification) can become problematic.
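a tiny made-up sketch of the typeclass point (the class and instances are invented for illustration):

(* Two overlapping instances: which one gets picked depends on declaration
   order and priorities, not on anything visible at the use site. *)
Class Default (A : Type) := default : A.

Instance default_zero : Default nat := 0.
Instance default_one : Default nat := 1.

Definition d : nat := default.
(* [d] silently unfolds to 1 (the most recently declared instance wins by
   default), and nothing on this line tells you so. *)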
@Pierre-Marie Pédrot : I didn't intend to say that Pollack inconsistency is only important in adversarial scenarios. It is definitely important to know what your spec really means, and I fully agree with @Karl Palmskog that a library of spec building blocks would be helpful for that.
I wanted to make an argument why one might want to have checks in Coq against adversarial Unicode sources.
for the record Pollack inconsistency was described in these two classic papers (open access links):