Stream: CertiCoq

Topic: CertiCoq garbage collection


view this post on Zulip Bas Spitters (Jul 22 2020 at 19:29):

The 2pp certicoq abstract states that the GC is simple. Is any more information available?

view this post on Zulip Karl Palmskog (Jul 22 2020 at 19:33):

this is related, I think: https://zoep.github.io/icfp2019.pdf

view this post on Zulip Armaël Guéneau (Jul 22 2020 at 19:37):

From what I can remember from a talk from Aquinas Hobor at gallium one or two years ago, their GC uses the same strategy as OCaml's GC for the minor heap: bump pointer allocation, then when it is full, do a linear pass and move live data to the next generation. Except that they have 12 generations (instead of 2 in the case of ocaml), all using the same simple strategy. The catch being: if you end up trying to garbage collect the last generation, then your program crashes.

view this post on Zulip Armaël Guéneau (Jul 22 2020 at 19:38):

https://github.com/PrincetonUniversity/certicoq/blob/master/theories/Runtime/gc.c#L351

view this post on Zulip Armaël Guéneau (Jul 22 2020 at 19:41):

If I understand correctly, because pointers can only point to older generations, I think that would mean that long running programs have a chance of eventually crashing? Whereas a more realistic GC would have a better allocator for generation number N.

view this post on Zulip Karl Palmskog (Jul 22 2020 at 19:43):

hmm, that sounds worrying, so GC crash becomes the "replacement" for divergence for regular programming languages? ("program either GC-crashes or terminates")

view this post on Zulip Karl Palmskog (Jul 22 2020 at 19:44):

I'm not even sure how GC correctness is phrased these days, I guess I should take a look at the CakeML GC paper again: http://www.cse.chalmers.se/~myreen/itp17.pdf

view this post on Zulip Bas Spitters (Jul 22 2020 at 19:55):

@Armaël Guéneau the CoqPL abstract states that "The certicoq garbage collector is simple, as Coq is purely functional (unlike haskell, ocaml)", which suggests that they are using a simpler GC than ocaml.

view this post on Zulip Karl Palmskog (Jul 22 2020 at 19:56):

but that was 3 years ago now, right? Approach may have changed

view this post on Zulip Bas Spitters (Jul 22 2020 at 19:57):

Yes, that's why I'm asking...

view this post on Zulip Armaël Guéneau (Jul 22 2020 at 19:57):

ok, then it might be referring to the fact that they don't have to deal with mutation (e.g. ocaml has a write barrier; the GC needs to be aware that the graph of objects has changed when mutating objects).

view this post on Zulip Armaël Guéneau (Jul 22 2020 at 19:57):

so I guess that simplifies the design of the GC indeed (independently from the generations business)

view this post on Zulip Bas Spitters (Jul 22 2020 at 20:01):

This paper seems to contain the 400line GC for certicoq:
https://dl.acm.org/doi/pdf/10.1145/3360597

view this post on Zulip Armaël Guéneau (Jul 22 2020 at 20:02):

The talk by Aquinas that I saw was indeed about the work presented in that paper.

view this post on Zulip Paolo Giarrusso (Jul 23 2020 at 13:16):

I haven't studied CertiCoq's GC, but what @Armaël Guéneau describes makes some sense. Using 2 generations for simplicity, without mutation, an "old" generation cannot point to a "newer" one, but mutating the old generation breaks this invariant. Write barriers instrument writes to track pointers from older generations; those pointers must be used as additional roots when doing GC on a new generation.

view this post on Zulip Paolo Giarrusso (Jul 23 2020 at 13:19):

The paper has few details — they point to an extended version, but the link gives 404: https://www.comp.nus.edu.sg/~hobor/Publications/2019/autoquack_extended_oopsla19.pdf — while the homepage is still live (and still hosts the OOPSLA paper: https://www.comp.nus.edu.sg/~hobor/Publications/2019/Localize.pdf)

view this post on Zulip Paolo Giarrusso (Jul 23 2020 at 13:20):

On Hobor's homepage I only found these slides for more details: https://www.comp.nus.edu.sg/~hobor/Publications/2019/gc_NIER_talk.pdf

view this post on Zulip Paolo Giarrusso (Jul 23 2020 at 13:21):

Let me comment that, if what @Armaël Guéneau describes is true, I'm not convinced that this GC strategy is as good as they claim:

The 12-generation GC, written in the spirit of the OCaml GC, is realistic and sophisticated, though by no means industrial-strength.

view this post on Zulip Bas Spitters (Jul 23 2020 at 13:34):

Interestingly the ICFP paper and the Oopsla paper do not cite each other. Are any of the certicoq authors on Zulip?
@Zoe Paraskevopoulou (Gitter import) @Abhishek Anand @Randy Pollack @Matthieu Sozeau ??

view this post on Zulip Matthieu Sozeau (Jul 24 2020 at 21:16):

I'm not very familiar with the GC implementation but I don't think it drastically changed from what Aquinas presented at Oopsla

view this post on Zulip Matthieu Sozeau (Jul 24 2020 at 21:17):

Andrew Appel would be the best placed to answer this question, I'll point him here :)

view this post on Zulip Andrew Appel (Jul 24 2020 at 21:40):

What exactly is the question here?
I should also remark: The CertiCoq garbage collector was implemented by me, and proved correct by Shengyi Wang, the lead author of the OOPLSA paper, Aquinas Hobor's PhD student, and now Shengyi is a postdoc at Princeton (though he's working on other things than the garbage collector). Indeed, the garbage collector has 12 generations, each twice the size of the previous, and I dare you to run out of space! But it's true, if we wrote a few more lines of code, we could handle "bottoming out" at the biggest generation more comprehensively. Also it's true that this collector doesn't need to handle mutable references, which simplifies the finding of roots between generations. Aquinas has told me that he's interested in having a student extend this collector (and proof) to handle mutable references.

view this post on Zulip Karl Palmskog (Jul 24 2020 at 21:55):

I think Bas's original question in this topic was on whether there was complete published description for the currently implemented GC in CertiCoq.

I'm reading Andrew's remark as that the OOPSLA paper (https://dl.acm.org/doi/pdf/10.1145/3360597) is the current best approximation.

view this post on Zulip Bas Spitters (Jul 24 2020 at 21:55):

@Andrew Appel Thanks! My question was mostly informative. Where is the GC documented? The Oopsla paper is a good source.
What is the relation with the ICFP paper (if any)?

view this post on Zulip Paolo Giarrusso (Jul 24 2020 at 22:39):

@Andrew Appel is the extended version of the OOPSLA paper available somewhere? The paper seems to link to https://www.comp.nus.edu.sg/~hobor/Publications/2019/autoquack_extended_oopsla19.pdf, but that gives 404

view this post on Zulip Andrew Appel (Jul 26 2020 at 01:17):

Paolo Giarrusso said:

Andrew Appel is the extended version of the OOPSLA paper available somewhere? The paper seems to link to https://www.comp.nus.edu.sg/~hobor/Publications/2019/autoquack_extended_oopsla19.pdf, but that gives 404

The most straightforward way to find out would be to ask Shengyi or Aquinas.
https://www.cs.princeton.edu/~shengyiw/
https://www.comp.nus.edu.sg/~hobor/


Last updated: Feb 06 2023 at 05:03 UTC