[M7] Towards an Axiomatic Basis for C++ · Coq workshop 2020

Stream: Coq workshop 2020

Topic: [M7] Towards an Axiomatic Basis for C++

Théo Zimmermann (Jun 24 2020 at 11:22):

This is the topic to ask questions on the talk "Towards an Axiomatic Basis for C++".

Emilio Jesús Gallego Arias (Jul 06 2020 at 13:15):

Talk just started!

Richard L Ford (Jul 06 2020 at 13:27):

By not supporting uninstantiated templates, do you mean that the user must specify and prove each instantiation of a template separately?

Robbert Krebbers (Jul 06 2020 at 13:28):

Question: What is currently an example of a killer feature of Iris or Coq that makes it possible what you are doing, and what killer feature of Iris or Coq are you missing that would make your work significantly easier?

Arthur Charguéraud (Jul 06 2020 at 13:28):

I have been working with a postdoc of mine on a similar front-end based on clang to convert code written in a subset of C into a clean AST representation. We used the ClangML tool (https://github.com/Antique-team/clangml), which offers an OCaml representation of Clang AST; do you? My project aims at applying verified source-to-source optimizations (transformation guided by the programmer, but applied by a tool); are you also interested in source-to-source optimizations?

Jean Pichon (Jul 06 2020 at 13:29):

(1) What is your model of C++? Is it a language model? Is it a model of clang's C++?

Richard L Ford (Jul 06 2020 at 13:31):

Have you been comparing notes with the Frama/Clang plugin for Frama-C?

Jean Pichon (Jul 06 2020 at 13:32):

(2) How big are your separation logic rules for C++ statements?

Michael Soegtrop (Jul 06 2020 at 13:33):

Many thanks for the interesting talk - I will give it a try!

Robbert Krebbers (Jul 06 2020 at 13:33):

robbert said:

Question: What is currently an example of a killer feature of Iris or Coq that makes it possible what you are doing, and what killer feature of Iris or Coq are you missing that would make your work significantly easier?

@Gregory Malecha I would be curious to know the long list of killer features you mentioned you would like (both of Iris and of Coq).

Maxime Dénès (Jul 06 2020 at 13:34):

Same here ;)

Gregory Malecha (Jul 06 2020 at 13:34):

We looked at Frama-C early on, but decided to go this route instead. It would be interesting to compare notes on where we are though.

Gregory Malecha (Jul 06 2020 at 13:36):

Jean Pichon said:

(2) How big are your separation logic rules for C++ statements?

our theories directory is ~9k that isn't all axioms though

Jean Pichon (Jul 06 2020 at 13:36):

How did you validate your model of C++? (It seems like a huge task…)

Gregory Malecha (Jul 06 2020 at 13:36):

(apparently I don't know how to respond to thing appropriately)

Gregory Malecha (Jul 06 2020 at 13:38):

We have a simple instantiation of the separation logic assertions that we provide (e.g. the identity predicate that you saw, as well as points to and such). We don't attempt to prove the wp rules because we do not have an underlying operational model of C++ to verify it against. Even if we did, it isn't obvious (to me) that an operational model of C++ would be much more trustworthy than our axiomatic one

Peter Sewell (Jul 06 2020 at 13:38):

You didn't say anything about what code has actually been verified?

Jean Pichon (Jul 06 2020 at 13:38):

Can you give a sense of what kind of things you can verify, and what kind of things you can't verify? And do you need assumptions?

Gregory Malecha (Jul 06 2020 at 13:40):

At the moment, we've finished verifications of some concurrent and non-concurrent data structures. locks, finite maps, etc. I wouldn't say that any of that part of the verification is especially novel at this point, but we have built some interesting specifications for our hypervisor in iris and are working on the verification of those

Peter Sewell (Jul 06 2020 at 13:41):

...and the underlying memory object model?

Abhishek Anand (Jul 06 2020 at 13:41):

Frama-C converts C++ to C. We decided not to do that and allow direct reasoning about C++. During proofs, it is probaby easier to look at the code as written rather than transformed code (C++ to C).

Gregory Malecha (Jul 06 2020 at 13:42):

we don't need assumptions outside of the axioms that are in the repository, though to be completely honest, we do occasionally admit arithmetic lemmas that are unpleasant to prove (though we do go back and fill in the holes)

Jean Pichon (Jul 06 2020 at 13:43):

Regarding the axiomatic vs. operational model question, cross-validation between the two would be one way to bring confidence, but I appreciate that might not be on your critical path. Also, an executable operational model can be validated by running it; do you have a way to check pre- and post-condition pairs against executions (by evaluating
simple pre- and post-conditions)?

Gregory Malecha (Jul 06 2020 at 13:44):

Peter Sewell said:

...and the underlying memory object model?

In our simple instantiation, our object memory model is quite close to what exists in CompCert, though the wp rules avoid exposing it at that level of detail. We've found that in most cases, we don't need to fall back on things like offsets and instead rely on a type "Loc" (which is in the development) of locations. So we might write somtehing like _local r "x" ., _field foo ., _field bar to compute x.foo.bar

Gordon Stewart (Jul 06 2020 at 13:45):

We've done some work on verification of device drivers as well, e.g., UARTs.

Gregory Malecha (Jul 06 2020 at 13:46):

Jean Pichon said:

(2) How big are your separation logic rules for C++ statements?

theories/lang/cpp/logic/stmt.v is ~340 lines, but some things like destructors have been broken out into separate files

Abhishek Anand (Jul 06 2020 at 13:47):

@Arthur Charguéraud : do you have a link to your source-to-source optimizations project?

Jean Pichon (Jul 06 2020 at 13:49):

Gregory Malecha said:

Jean Pichon said:

(2) How big are your separation logic rules for C++ statements?

theories/lang/cpp/logic/stmt.v is ~340 lines, but some things like destructors have been broken out into separate files

How big is the rule for an assignment, for example?

Enzo Crance (Jul 06 2020 at 13:49):

Hello. In one of your slides, you presented typing management and especially inheritance, with a general type and an optional most derived type. Considering a typing situation like in the following picture, how would you represent the T type? Do you have two instances of the predicate, one with A as general type and A2 as most derived one, and another one with B as general type and B2 as most derived one? Sorry if the question looks very simple, beginner student here. Thanks in advance! t.png

Gregory Malecha (Jul 06 2020 at 13:51):

robbert said:

robbert said:

Question: What is currently an example of a killer feature of Iris or Coq that makes it possible what you are doing, and what killer feature of Iris or Coq are you missing that would make your work significantly easier?

Gregory Malecha I would be curious to know the long list of killer features you mentioned you would like (both of Iris and of Coq).

We're able to work around a lot of things but some things that would be nice are

nicer ways to control reduction
more extensible variant of typeclasses eauto (maybe we just shouldn't use that, though it has been pretty nice)
there are some issues with modules that are annoying in the 1 place that we use modules. in particular, we have a module type (defined in pred.v and we try to instantiate it in simple_pred.v but Coq is unable to do that, it spins for a while when we end the module)
we've had some battles with unification and reduction that @Janno has helped a lot with, but I don't think that we're out of the woods on that yet.
Require Import is fairly slow in some of our code, e.g. Require Import bedrock.lang.auto.cpp (not included in the distribution) takes around 4 seconds which seems like a lot. Just Require Import iris... can take a second or two.

Jean Pichon (Jul 06 2020 at 13:52):

Gordon Stewart said:

We've done some work on verification of device drivers as well, e.g., UARTs.

What did you prove about UART? Do you have a spec of the protocol?
And how to you model talking with devices?

Gregory Malecha (Jul 06 2020 at 13:52):

Jean Pichon said:

Gregory Malecha said:

Jean Pichon said:

(2) How big are your separation logic rules for C++ statements?

theories/lang/cpp/logic/stmt.v is ~340 lines, but some things like destructors have been broken out into separate files

How big is the rule for an assignment, for example?

Assignment is actually an expression in C++. The rule are here: https://github.com/bedrocksystems/cpp2v/blob/master/theories/lang/cpp/logic/expr.v#L245

Maxime Dénès (Jul 06 2020 at 13:52):

Regarding 5., it could be interesting to split the Require and Import to measure which one is slow.

Gregory Malecha (Jul 06 2020 at 13:53):

Maxime Dénès said:

Regarding 5., it could be interesting to split the Require and Import to measure which one is slow.

Thanks. I'll take a look at that.

Gregory Malecha (Jul 06 2020 at 13:54):

Enzo Crance said:

Hello. In one of your slides, you presented typing management and especially inheritance, with a general type and an optional most derived type. Considering a typing situation like in the following picture, how would you represent the T type? Do you have two instances of the predicate, one with A as general type and A2 as most derived one, and another one with B as general type and B2 as most derived one? Sorry if the question looks very simple, beginner student here. Thanks in advance! t.png

It works something like this, let's just simplify to class T : public A , public B and suppose that you have a this : ptr which represents the pointer to T. Then you would have something like the following:

this |-> _identity "T" (Some "T") **
this ., _base "T" "A" |-> _identity "A" (Some "T") **
this ., _base "T" "B" |-> _identity "B" (Some "T")

Jean Pichon (Jul 06 2020 at 13:54):

Gregory Malecha said:

Jean Pichon said:

Gregory Malecha said:

Jean Pichon said:

(2) How big are your separation logic rules for C++ statements?

theories/lang/cpp/logic/stmt.v is ~340 lines, but some things like destructors have been broken out into separate files

How big is the rule for an assignment, for example?

Assignment is actually an expression in C++. The rule are here: https://github.com/bedrocksystems/cpp2v/blob/master/theories/lang/cpp/logic/expr.v#L245

Thanks! (I hadn't noticed the .v files were also available. I took the README too literally.)

Jean Pichon (Jul 06 2020 at 13:56):

Gregory Malecha said:

Peter Sewell said:

...and the underlying memory object model?

In our simple instantiation, our object memory model is quite close to what exists in CompCert, though the wp rules avoid exposing it at that level of detail. We've found that in most cases, we don't need to fall back on things like offsets and instead rely on a type "Loc" (which is in the development) of locations. So we might write somtehing like _local r "x" ., _field foo ., _field bar to compute x.foo.bar

Do you have a feeling and/or some experimental data of how the CompCert object layout model and the clang object layout model relate?

Abhishek Anand (Jul 06 2020 at 13:56):

Jean Pichon said:

Regarding the axiomatic vs. operational model question, cross-validation between the two would be one way to bring confidence, but I appreciate that might not be on your critical path. Also, an executable operational model can be validated by running it; do you have a way to check pre- and post-condition pairs against executions (by evaluating
simple pre- and post-conditions)?

It may be possible to skip operational semantics entirely: it may be possible to write a compiler from C++ to CompcertC and prove that the compiler preserves the axiomatic semantics (e.g. using VST as the target axiomatic semantics). Does anyone know previous work trying this approach (preservation of axiomatic semantics) to compiler verification?

Gordon Stewart (Jul 06 2020 at 13:58):

Jean Pichon said:

Gordon Stewart said:

We've done some work on verification of device drivers as well, e.g., UARTs.

What did you prove about UART? Do you have a spec of the protocol?
And how to you model talking with devices?

We built an operational model of a standard UART (ARM's PL011) and verified a trace-style spec of the driver code against the operational model. We used a variant of Iris's STSs to connect the trace spec to the device model.

Jean Pichon (Jul 06 2020 at 13:59):

Abhishek Anand said:

Jean Pichon said:

Regarding the axiomatic vs. operational model question, cross-validation between the two would be one way to bring confidence, but I appreciate that might not be on your critical path. Also, an executable operational model can be validated by running it; do you have a way to check pre- and post-condition pairs against executions (by evaluating
simple pre- and post-conditions)?

It may be possible to skip operational semantics entirely: it may be possible to write a compiler from C++ to CompcertC and prove that the compiler preserves the axiomatic semantics (e.g. using VST as the target axiomatic semantics). Does anyone know previous work trying this approach (preservation of axiomatic semantics) to compiler verification?

Is that something you want to do for your project? Writing a C++ compiler is a daunting endeavour.

Gregory Malecha (Jul 06 2020 at 13:59):

Jean Pichon said:

Gregory Malecha said:

Peter Sewell said:

...and the underlying memory object model?

In our simple instantiation, our object memory model is quite close to what exists in CompCert, though the wp rules avoid exposing it at that level of detail. We've found that in most cases, we don't need to fall back on things like offsets and instead rely on a type "Loc" (which is in the development) of locations. So we might write somtehing like _local r "x" ., _field foo ., _field bar to compute x.foo.bar

Do you have a feeling and/or some experimental data of how the CompCert object layout model and the clang object layout model relate?

Currently, we know of the CompCert model and the model that @robbert wrote in his dissertation. He can probable give you a better description of the difference than I can. Our current simple_pred.v instantiates with a model that is close to CompCert, but I believe that our rules would be compatible (or close to compatible with) with the model in @robbert 's thesis though I can't guarantee that.

Jean Pichon (Jul 06 2020 at 13:59):

Gordon Stewart said:

Jean Pichon said:

Gordon Stewart said:

We've done some work on verification of device drivers as well, e.g., UARTs.

What did you prove about UART? Do you have a spec of the protocol?
And how to you model talking with devices?

We built an operational model of a standard UART (ARM's PL011) and verified a trace-style spec of the driver code against the operational model. We used a variant of Iris's STSs to connect the trace spec to the device model.

Nice! :-)

Gregory Malecha (Jul 06 2020 at 14:00):

Jean Pichon said:

Abhishek Anand said:

Jean Pichon said:

Regarding the axiomatic vs. operational model question, cross-validation between the two would be one way to bring confidence, but I appreciate that might not be on your critical path. Also, an executable operational model can be validated by running it; do you have a way to check pre- and post-condition pairs against executions (by evaluating
simple pre- and post-conditions)?

It may be possible to skip operational semantics entirely: it may be possible to write a compiler from C++ to CompcertC and prove that the compiler preserves the axiomatic semantics (e.g. using VST as the target axiomatic semantics). Does anyone know previous work trying this approach (preservation of axiomatic semantics) to compiler verification?

Is that something you want to do for your project? Writing a C++ compiler is a daunting endeavour.

Writing a compiler isn't on the short list, but we are interested in this sort of thing in the long run (or integrating with some existing verified compiler).

Gordon Stewart (Jul 06 2020 at 14:01):

Jean Pichon said:

Gordon Stewart said:

Jean Pichon said:

Gordon Stewart said:

We've done some work on verification of device drivers as well, e.g., UARTs.

What did you prove about UART? Do you have a spec of the protocol?
And how to you model talking with devices?

We built an operational model of a standard UART (ARM's PL011) and verified a trace-style spec of the driver code against the operational model. We used a variant of Iris's STSs to connect the trace spec to the device model.

Nice! :-)

This works, but building an operational model, even of a simple device, is a lot of work. I'd love pointers on ways to re-use existing models. I know that people in Cambridge have ported models of ARM chips to tools like Sail; is there any existing work that does something similar but for peripheral devices?

Gregory Malecha (Jul 06 2020 at 14:02):

Gordon Stewart said:

Jean Pichon said:

Gordon Stewart said:

Jean Pichon said:

Gordon Stewart said:

We've done some work on verification of device drivers as well, e.g., UARTs.

What did you prove about UART? Do you have a spec of the protocol?
And how to you model talking with devices?

We built an operational model of a standard UART (ARM's PL011) and verified a trace-style spec of the driver code against the operational model. We used a variant of Iris's STSs to connect the trace spec to the device model.

Nice! :-)

This works, but building an operational model, even of a simple device, is a lot of work. I'd love pointers on ways to re-use existing models. I know that people in Cambridge have ported models of ARM chips to tools like Sail; is there any existing work that does something similar but for peripheral devices?

@Peter Sewell

Jean Pichon (Jul 06 2020 at 14:03):

Gregory Malecha said:

Jean Pichon said:

Abhishek Anand said:

Jean Pichon said:

Regarding the axiomatic vs. operational model question, cross-validation between the two would be one way to bring confidence, but I appreciate that might not be on your critical path. Also, an executable operational model can be validated by running it; do you have a way to check pre- and post-condition pairs against executions (by evaluating
simple pre- and post-conditions)?

It may be possible to skip operational semantics entirely: it may be possible to write a compiler from C++ to CompcertC and prove that the compiler preserves the axiomatic semantics (e.g. using VST as the target axiomatic semantics). Does anyone know previous work trying this approach (preservation of axiomatic semantics) to compiler verification?

Is that something you want to do for your project? Writing a C++ compiler is a daunting endeavour.

Writing a compiler isn't on the short list, but we are interested in this sort of thing in the long run (or integrating with some existing verified compiler).

What do you use in the meantime? Clang?
And if so, have you thought of connecting with Vellvm?

Gregory Malecha (Jul 06 2020 at 14:05):

Jean Pichon said:

Gregory Malecha said:

Jean Pichon said:

Abhishek Anand said:

Jean Pichon said:

Regarding the axiomatic vs. operational model question, cross-validation between the two would be one way to bring confidence, but I appreciate that might not be on your critical path. Also, an executable operational model can be validated by running it; do you have a way to check pre- and post-condition pairs against executions (by evaluating
simple pre- and post-conditions)?

It may be possible to skip operational semantics entirely: it may be possible to write a compiler from C++ to CompcertC and prove that the compiler preserves the axiomatic semantics (e.g. using VST as the target axiomatic semantics). Does anyone know previous work trying this approach (preservation of axiomatic semantics) to compiler verification?

Is that something you want to do for your project? Writing a C++ compiler is a daunting endeavour.

Writing a compiler isn't on the short list, but we are interested in this sort of thing in the long run (or integrating with some existing verified compiler).

What do you use in the meantime? Clang?
And if so, have you thought of connecting with Vellvm?

We compile with both gcc and clang at this point. Because we're so close to the source it isn't too big of a problem to use a different compiler. We aren't relying on any clang internal transformations (though of course parsing / type checking bugs would impact things), though it is an extra bit of untrusted code.
Vellvm is probably the best way for us to get to a verified compiler at the moment, I've talked with @Yannick Zakowski about it, but we haven't gone much farther than informal conversations at this point.

Gregory Malecha (Jul 06 2020 at 14:08):

I appreciate all the questions. If you have any more, feel free to ask them in the github project or send me an email: gregory@bedrocksystems.com

Jean Pichon (Jul 06 2020 at 14:10):

Thanks! It's really interesting work, and it seems quite challenging, so best of lucks! :-)

Peter Sewell (Jul 06 2020 at 14:15):

I don't know of work on real behavioural models of devices. The ETH folk (David Cock and others?) have done some SoC modelling, but AFAIK only more abstractly.

Last updated: Apr 16 2024 at 23:01 UTC