I remember many years ago Conor McBride making the comment that a bidirectional type system can be used as an explicit, fully-elaborated proof term representation that stores less typing annotation than the "obvious" approach, and thus reduce memory consumption in proof assistants. Now we have recent work from @Meven Lennon-Bertrand that provides such a bidirectional system (Complete Bidirectional Typing for CIC). Question: have people considered using it as a "better" representation of kernel terms somehow?
(If you wonder why we can reduce term size: for example you can write instead of the fully-explicit , not storing the function input type $A$ in memory at all, as this subterm will be "checked" with the full type available.)
seems not very stable by reduction
ie when you want to reduce (fun x : A -> A => x) (fun y => t)
you have to do some work to recover y : A
Well I would suspect that the paper on bidirectional typing (I haven't read it yet) explains how to have subject reduction.
The paper still keeps the annotations on lambdas (and lambdas are never checked, always inferred) . I think @Pierre-Marie Pédrot told me that in practice they are extremely useful for performance of conversion checking. There are also (somewhat open) issues with eta-rules if you throw away the domains of lambdas...
Gabriel, you can gain a lot on redundant information: e.g. cons A a (nil A) : list A can be represented as (cons a nil) if it is in checking mode (in Coq that could also work for (irrelevant) universe arguments.
(but I guess that internal sharing / hashconsing reduces the low-hanging fruit in just avoiding duplication?)
One property you lose is that if a term is well typed then any of its sub terms are (or better, they still are, but you may not be able to re-typecheck them). This is not very important for the kernel, but tactics chop sub terms all the times.
Gabriel Scherer said:
(but I guess that internal sharing / hashconsing reduces the low-hanging fruit in just avoiding duplication?)
The kernel hash conses terms before storing them to disk. We save a little space and memory when you load them back, but we make no use of this algorithmically (we don't keep the invariant that =
iff ==
)
CC @Pierre-Marie Pédrot
What I hear so far is that it may be something to try, but the people answering so far haven't thought about it much yet. Also, it sounds like a lot of work / fairly invasive change.
I gave it a try once (to see if that would reduce space) but there were so many complications with fix/cofix that I gave up. Now that things are clearer (and we already have an implementation in MetaCoq), this might be more fun
Enrico Tassi said:
Gabriel Scherer said:
(but I guess that internal sharing / hashconsing reduces the low-hanging fruit in just avoiding duplication?)
The kernel hash conses terms before storing them to disk. We save a little space and memory when you load them back, but we make no use of this algorithmically (we don't keep the invariant that
=
iff==
)
Why not? Feels like this might make a big difference (but I have no idea).
One problem I see in trying to understand what could help is that we don't have the best tools yet to actually analyze Coq memory and time use.
It is quite often we find problems in this area the moment we look at it; some work we are doing on interfaces should help to explore Coq's data structures and hopefully have a bit better analysis, but on the other hand seems like a huge amount of work.
Indeed, I need to keep annotations on λ-abstractions for things to behave well. The problem is that if you have no annotations on your abstractions, there is no way you can type-check redexes, because without annotations abstractions cannot infer a type. From what I understood this is the choice of Agda, where you can only type normal forms (and applications where β-redexes are hidden by a definition of the function, which provides a type).
In his most recent papers Conor has a less naïve approach, where he clearly separates in the syntax already between terms which can infer a type (such as variables) and those which can only be checked (such as abstractions), and uses type annotations to mediate between both. IMHO it is quite elegant, as it reduces the need for annotations, while keeping good properties such as SR. Adapting this to CIC would be interesting I guess, but it departs a lot from the internals of Coq…
Last updated: Oct 03 2023 at 20:01 UTC