Not sure what is the best stream for this, but please consider submitting to:
https://popl24.sigplan.org/home/CPP-2024#Call-for-Papers
Abstract Submission Deadline: 12 September 2023 at 23:59 AoE (UTC-12h)
Please post an announcement to the Discourse forum. If you want to, you can post it via email (see https://github.com/coq/coq/wiki/Discourse).
if I'm reading it correctly, 12 out of 20 accepted papers are using Coq: https://popl24.sigplan.org/home/CPP-2024#event-overview
now we can only hope people make their artifacts reproducible using a Platform pick...
Maybe there should be guidelines in the call for papers. Currently, it says essentially nothing about what standard is expected of the supplementary material, even though artifacts are an integral part of CPP that the reviewers actively look at.
let's just say that I have tried to argue this with some people involved, and the response was quite far from enthusiastic (some background at https://github.com/coq/platform/issues/2)
to my knowledge, CPP reviewers are not even required to look at artifacts
Thanks. From what I could see this year, reviewers were actually looking at the artefacts. My understanding was that there was no artefact evaluation committee because the formalisation is part of the contribution under evaluation.
sure, most do look at artifacts, but I don't think this is explicitly spelled out either in the CFP or in instructions to PC members.
You are correct.
my argument has been that this should be spelled out, with additional recommendations for each proof assistant, such as the Platform for Coq. The counterargument that I've heard is that this leads to less flexibility for the PC (true, but this is also the point: to make artifact submission/reviewing more predictable)
It could at the very least be a suggestion.
Mandating the platform could indeed be a problem, but recommending it might help when people have no other constraints. But please don't try to pick a version.
I think what should be required is that the submission contains clear instructions, dependencies and assumptions.
Usual AEC calls spell that out indeed
See "kick the tires" here https://popl24.sigplan.org/track/POPL-2024-artifact-evaluation#Process
And https://artifact-eval.org/guidelines.html gives some more background, even if that website is old now (it's from when AECs were introduced to the PL/SIGPLAN community)
Can't speak on CPP specifically, but I've been on both sides of AECs.
("Community") guidelines like https://proofartifacts.github.io/guidelines/ also show how to document proof-paper links
@Karl Palmskog : do you think I should contact the organizers offering help with issues / questions?
I don't think it's very useful to contact this year's organizers. We'd need to know who will be PC chairs of CPP 2025
I think there's a near-zero chance of any central guidelines being applied to publications accepted at CPP 2024. However, @Michael Soegtrop you might have more luck reaching out to individual researchers with Coq papers accepted to CPP 2024, as suggested by Bas here.
Karl Palmskog said:
I don't think it's very useful to contact this year's organizers. We'd need to know who will be PC chairs of CPP 2025
Since CPP 2020, each PC co-chair of CPP has served for two consecutive years. So, I guess Sandrine Blazy is a PC chair of CPP 2025.
It's a bit unfortunate that most resources like https://proofartifacts.github.io/guidelines/ don't say much about reproducibility. Almost all the "Code Artifacts Packaging" options listed here are quite unsatisfactory for proof assistant code in my view: https://artifact-eval.org/guidelines.html (e.g., VMs are pretty terrible)
I'd conjecture that the majority of CPP artifacts from 5+ years ago are very difficult to reproduce, if they can be obtained at all
Let me ask Lennart Beringer from the steering committee, whom I know well, what he thinks is best. But I agree that addressing individuals, and in this way setting standards which are then easier to require in a few years, is likely most effective.
yeah, I think the only way forward in short term is getting CPP board/PC to agree to at least recommend the Platform, in order to achieve reproducibility. I'm sure they can recommend similar things for other proof assistants
I'd conjecture that the majority of CPP artifacts from 5+ years ago are very difficult to reproduce, if they can be obtained at all
IMHO one needs a weekly or monthly CI to avoid bit rot. Most authors are helpful if they know something broke, but many simply don't know. And I think it also makes it easier if they don't have to follow up on 5 years of Coq development. And a manageable broad CI requires a base like Coq Platform.
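For concreteness, a minimal sketch of such a periodic rebuild job, assuming the artifact lives in a git repository and builds with make (the repository URL and the opam switch name are hypothetical):

```sh
#!/bin/sh
# Hypothetical weekly CI job: rebuild the artifact from scratch in the
# opam switch that a given Coq Platform release created, so bit rot is
# detected early instead of years later.
set -e
PLATFORM_SWITCH="$1"   # e.g. the switch created by the Platform scripts
git clone --depth 1 https://github.com/example/my-artifact.git
cd my-artifact
eval "$(opam env --switch="$PLATFORM_SWITCH" --set-switch)"
make clean
make
```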
but by "reproducibility" I mean just checking the code in the Coq version they originally used
if you test that your code builds with a certain Platform version/pick, then one should be able to get it to build with that pick in the future, and then step by step update the code via subsequent Platform releases, if desired
the ideal is that we provide full containers of each Platform release/pick and instructions on how to reproduce Coq code using them, but that could come later
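Until such official Platform containers exist, the community-maintained docker-coq images already allow something close for plain Coq. A hedged sketch (the image tag and mount layout are just examples, and coqorg/coq images are not Platform containers):

```sh
# run the artifact's build inside a container with a pinned Coq version
docker run --rm -it -v "$PWD:/artifact" -w /artifact coqorg/coq:8.18.0 make
```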
You mean: @Sandrine Blazy ...
the ideal is that we provide full containers of each Platform release/pick and instructions on how to reproduce Coq code using them
Well, I think we have this already - one can build all previous picks with the latest Coq Platform.
but at some point, this might become infeasible, e.g., due to changes/deps in system packages
Libraries which are not backwards compatible, or where there is too much variation in what one gets on various systems, I replace with build-from-source opam packages. This has worked reliably so far. I currently only go back to 8.12, but I could probably go back to 8.6 without much difficulty.
Indeed, the 2-year rotation policy for CPP chairs is expected to continue. I believe nudging authors to use the Platform could indeed be helpful for maintaining formalizations over time. But previous discussions about mandating this more strictly concluded that cutting edge papers are often based on material not available in standard packages, or in fact describe material in new versions of packages. But it could be worthwhile to discuss with the next PC chairs (or the SC chair, @Catalin Hritcu ) whether authors whose developments do not fall under this category can be more strongly encouraged to use the Platform in the next CFP.
Lennart Beringer said:
that cutting edge papers are often based on material not available in standard packages, or in fact describe material in new versions of packages
I don't think this is a problem. The idea is to use the infrastructure Coq Platform provides to produce build instructions for a development that are as concise as possible and machine-readable/executable. In the simplest case this is a single opam file which works on top of a given Coq Platform release. In more complex cases this can be several opam packages together with a custom pick file which overrides versions of some standard Coq Platform packages. In either case I would say that using Coq Platform as a base allows for a short description and short development time.
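For illustration, a minimal sketch of the simple case (the package name, version bounds, and build system below are hypothetical):

```sh
# a single opam file describing the development, to be installed on
# top of an existing Coq Platform switch
cat > coq-myproof.opam <<'EOF'
opam-version: "2.0"
synopsis: "Proofs accompanying the paper (example)"
build: [ [ make "-j%{jobs}%" ] ]
install: [ [ make "install" ] ]
depends: [
  "coq" { >= "8.18" & < "8.19~" }
  "coq-mathcomp-ssreflect" { >= "2.0" }
]
EOF
opam install ./coq-myproof.opam   # inside the Platform's opam switch
```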
I think the "maintenance" framing of Platform use is not going to be sufficient or convincing. Realistically, only a small portion of paper artifacts will be maintained long term. But archiving and proper building instructions/tooling are crucial parts of the scientific side of ITP. The value of a paper in the ITP community is usually significantly diminished if the artifact is gone or can't be built at all as described.
Agreed. The first priority is to keep the original artefacts buildable for, say, 5 years. I would say this is realistic once it builds on Coq Platform with no, or at most very little, maintenance. One can also put this into CI.
@Karl Palmskog what's wrong with VMs?
to my knowledge, VMs are tied to a specific virtualization solution, such as VirtualBox, and include many things unrelated to Coq, usually with no way to distinguish the actual artifact files from VM boilerplate
even firing up a VM can be complicated (you need to know user/password)
One can export VMs into OVF format.
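For example, with VirtualBox (the VM name is made up):

```sh
VBoxManage export "cpp-artifact-vm" --output artifact.ova   # OVA = packaged OVF
VBoxManage import artifact.ova                              # re-import on another machine
```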
And typing a password is much easier than, dunno, installing OCaml 3 on a modern system, or probably installing OCaml 4.07 or 4.10 in a few years
And OCaml is pretty good there... GHC on Mac bitrots much more easily (not relevant to Coq but relevant to anything Haskell, like Agda)
but then it'd make much more sense to me to provide lightweight containers with the non-artifact Coq-related files. From personal experience, it's hell to try to extract the actual needed source files from VMs
You're not expected to. People should provide the source separately.
but they usually don't
Color me skeptical. But I agree AECs don't necessarily check for that, since that goes beyond the basic task.
example of 5GB VM with no source: https://zenodo.org/records/5493554
To clarify, I see what you mean, I know what AECs are for, and I think I see a difference.
Okay, I see the source zip is inside the VM
OTOH, this VM is in OVA format, so it should not be specific to virtualbox / VMware / ...
another problem with OVAs: what Coq people do is that they put their .vo files inside them, not necessarily with the possibility of rebuilding them or instructions on how to rebuild them
OTOH, it's far from clear whether that format is suitable for long-term archiving
Personally, I'd have failed an artifact without a way to rebuild .vo files from source.
Including prebuilt ones might be appreciated, especially for proofs with long builds.
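A minimal sketch of what "rebuildable from source" can look like for a Coq artifact (the file and logical names are hypothetical):

```sh
cat > _CoqProject <<'EOF'
-R theories MyProof
theories/Main.v
EOF
coq_makefile -f _CoqProject -o Makefile   # generate a Makefile from _CoqProject
make -j4                                  # rebuilds the listed .vo files from source
```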
a typical validation task is to modify sources, combine with other libraries, etc., run a bunch of analysis/inspection tools. I don't think OVAs are conducive to this
Sure. But I'd consider, say, proper opam packaging beyond what's needed to pass AEC.
I don't consider the opam packaging of the artifact itself to be important. But what we are pushing for here is to recommend that people provide some basic build instructions on top of a Platform version/pick
this could be make, or dune build, etc.
for almost any practical purpose like reviewing/validation, I'd consider this to be better than an OVA
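Concretely, the whole build instructions could be as short as the following, assuming a working Platform installation (the opam file name is hypothetical):

```sh
opam install --deps-only ./coq-myproof.opam   # dependencies beyond the Platform
dune build                                    # or: make
```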
The downside is when AEC members fail to reproduce the instructions because of problems with their systems.
at least in ITP and CPP, there's no way to fail the artifact separately from the paper
Paolo Giarrusso said:
The downside is when AEC members fail to reproduce the instructions because of problems with their systems.
sure, but to help with that one can provide, for example, Coq Platform OVAs and containers, Nix packages, ...
Very roughly, I'd say AECs check that authors didn't misrepresent their evidence, not whether they've figured they should write a _CoqProject file (did Software Foundations start mentioning that?)
to me at least, the Platform is orthogonal to _CoqProject files; using any other way to build is possible on top of a Platform installation
I think there's "step through files in Proof General"
Anyway, I agree there might be technical solutions that are as seamless as VMs in ensuring reviewers can rebuild the artifact.
I'd still expect artifacts under review to include both VMs and source archives. Which doesn't ensure anything about Zenodo, since those are (optionally) created after AEC badging.
yeah, people don't even have to allow anyone to see the artifact except the AEC to get that badge, right? Very open sciency stuff...
"the badge" is not well-defined. You should take a look at https://www.acm.org/publications/policies/artifact-review-and-badging-current.
By "passing AEC" I was referring to functional. I agree with you that reusable and available are also nice.
it's ironic that they [ACM] go full "author pays everything" to get the PDFs out there, but artifacts are not going to be covered by open access
and yes, many raise the same question, but few communities (if any) have decided to reject papers with proprietary artifacts outright.
the MapReduce paper didn't come with an artifact, but people reimplemented it as Hadoop. I haven't looked closely, but would we have been better off without that paper?
The AEC at least gets some outsiders to review the code as well.
the scope here is ITP artifacts, and I can't think of any paper where I've reused something without having access to the source. Usually these papers would be hell to recreate in the tool based on content in the pdf
Sure. The basic AEC badge is not meant to enable reuse.
In any case, all the things you say make for better artifacts. I think the only contention can be about the lower bar.
@Paolo Giarrusso : My own experience with VirtualBox and VMware VMs I archived for some purpose is that a good fraction of them don't work (easily) any more after a few years. I remember several cases where I had to set up a new VM and recompile everything from scratch (which was not trivial then). A common reason seems to be that old kernels get confused by newer graphics card (abstraction layers), CPUs, bus systems, you name it. The old VMs simply won't boot any more. The mileage should be better with Intel Simics - it is intended to fully abstract the HW configuration - but I don't think it is a very feasible solution for Coq users (it is rather complicated to use, and the additional abstraction makes it quite a bit slower, especially for UI things).
@Michael Soegtrop Papers' lifetime is meant to be longer than anything reasonable for the platform (more like 20 years than 5). Re HW, the naive fix seems to be to fix the HW to some emulated devices - this isn't done by disk images, but is done by OVAs. Why would that break?
It's not obvious to me that OVAs last for 20 years, but the x86 platform seems to have managed
I think it makes more sense that advanced solutions are figured out for VM'ing the Platform releases/picks (meaning one can replicate Coq things inside those VMs 20 years from now) - than that everyone VM's their own Coq stuff from scratch
That makes sense, especially as long as such a thing is easy to customize (and even if you fork some platform package, opam uninstall exists if needed)
@Paolo Giarrusso :
Papers' lifetime is meant to be longer than anything reasonable for the platform (more like 20 years than 5).
We do our best. Let's see where we end up.
Re HW, the naive fix seems to be to fix the HW to some emulated devices
Do you have a link where VMware says that this is so? I couldn't find any.
Intel Simics is the only system I know of which has already shown that it can properly abstract HW over, say, 2 decades, and I would say Simics is not for the average Coq user. But of course things might have improved. On the other hand, in the past Simics was better at this than other VM software, because it is easier to build a virtualised CPU emulator when the emulator and the CPU are made in the same company.
The main issue I see with Simics over 20 years is the question of whether it will still exist then.
this isn't done by disk images, but is done by OVAs. Why would that break?
I don't know the technology of OVAs and couldn't find technical docs. My personal experience is that most VMs can't be transported well over several generations of HW (bridge chips and the like).
If you export and reimport an OVA, my experience is that it should be using the same bridge and the same HW configuration. I've done this with virtualbox a while ago. Should I infer your experience is with VM disk images?
Karl Palmskog said:
I think it makes more sense that advanced solutions are figured out for VM'ing the Platform releases/picks (meaning one can replicate Coq things inside those VMs 20 years from now) - than that everyone VM's their own Coq stuff from scratch
Well, even today you can use Simics to emulate a 20 (or 40) year old PC and run whatever software ran on such a PC, so I am not sure it is worth the effort as long as such systems exist. We can still think about how to supply emulators as soon as it gets really hard to run older Coq on current machines.
Paolo Giarrusso said:
If you export and reimport an OVA, my experience is that it should be using the same bridge and the same HW configuration. I've done this with virtualbox a while ago. Should I infer your experience is with VM disk images?
20 years ago OVAs didn't exist. Whether the concept of OVAs will still exist in 20 years, and whether you will be able to do anything with them, is hard to tell - possibly not. But I am quite sure that in 20 years you will be able to find some 2023 PC emulator on which you can do whatever you could do in 2023, at about the same speed or faster. What you need for this is a 2023 Ubuntu and a 2023 Coq Platform.
Re "older Coq", older OCaml needed patching for new host OSes already, and old Coq won't run on OCaml 4.14, so somebody with Intel HW could try to install old Coqs today
https://www.dmtf.org/standards/ovf exists, but it's not obvious it actually specifies which devices must be supported.
At the same time, I don't see this point in the "known issues" documented by virtualbox https://www.virtualbox.org/manual/ch01.html#ovf-about.
Re "older Coq", older OCaml needed patching for new host OSes already,
yes, and Coq Platform CI had some impact on convincing the OCaml team to backport the patch to all OCaml versions required to build all versions of Coq in Coq Platform. We test this, and this does have an impact on whether it will work in the future. There might be a point where it gets really, really hard, but what to do then we should decide when we get to that point. It might also be that this point never comes (before we move to 128-bit CPUs, which might never happen, because a 64-bit address space is enough to address whatever RAM sizes we will have in the future).
@Paolo Giarrusso : regarding OVFs: it is hard to say something substantial about them, because one can't test it. I don't think you can run a version of VirtualBox which can create an OVF on an emulated or real PC with a kernel which, say, doesn't support PCIe or multicore. If you can demonstrate that you can migrate a VM from a pre-PCIe, pre-multicore PC to a PCIe + multicore PC, you have at least a piece of what is required to say that VMs allow you to archive software for 20 years. And my point is that I have severe doubts that you can demonstrate even this small piece.
@Paolo Giarrusso @Karl Palmskog : do you think it would be a worthwhile exercise to
for the long term, yes, it would be very nice to push the preservation side of the Platform. But I don't think it's currently a priority, when we seemingly aren't yet able to reach enough people to build momentum for Platform-for-reproducibility
I even know some projects that are "stuck" in 8.3, and 8.3 is a big pain to build
@Michael Soegtrop I'm not sure I understand the question, unless you're assuming support for passthrough host devices — I'm not. I can create today a VirtualBox VM with a PIIX3 chipset and matching IDE controller (which is from 1997 according to https://www.datasheetcatalog.com/datasheets_pdf/8/2/3/7/82371SB_(PIIX3).shtml). As long as VirtualBox doesn't drop emulators for all those devices, is there a problem?
(repeat this for all emulated devices in the VM, and make sure not to use any passthrough ones).
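For instance, VirtualBox lets one pin the emulated chipset explicitly (the VM name is made up); PIIX3 is the 1997-era chipset mentioned above:

```sh
VBoxManage modifyvm "archival-vm" --chipset piix3       # pin the emulated chipset
VBoxManage showvminfo "archival-vm" | grep -i chipset   # confirm the setting
```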
@Paolo Giarrusso : I can't say where the issue really is, but I can say that I have VirtualBox VMs I created in the 2000..2010 time frame without pass-through, and they won't boot. Maybe if I play long enough with the settings I can make them boot, but I couldn't get them to boot within half a day of trying. I guess that such backward-compatibility features are not sufficiently well tested to really work in practice. I would also guess that the effort for maintaining the translations is not linear in the number of chipsets and other HW supported.
It is very hard to test such things because one can't run virtualisation systems inside a VM (unless one does really slow full emulation), so it is hard to systematically create VMs on old machines to test them on new machines. And things which are hard to test usually don't work very well.
I haven't seen a reason the _host_ system makes a difference, and you can create an old VM on a new system.
in any case, I'm going to test some old OVF myself
my own https://inc-lc.github.io/AEC.html is from 2014 and I'll test it on my 2020 Intel MacBook. I'm happy to try something else and report as well.
@Paolo Giarrusso : I don't think the host system per se makes a difference, but the version of VirtualBox you use to create a VM makes a difference (if only by influencing default settings for HW support), and I don't think you can run a 10-year-old VirtualBox on a modern machine.
Karl Palmskog said:
if I'm reading it correctly, 12 out of 20 accepted papers are using Coq: https://popl24.sigplan.org/home/CPP-2024#event-overview
Full breakdown:
- 12 Coq papers
- 3 Lean papers
- 2 Isabelle/HOL papers
- 1 Agda paper
- 1 PVS paper
- 1 Rzk paper
Thanks for the summary @Karl Palmskog , can we use your analysis in @Coqlang's twitter ?
fine by me, but this was based on a quick reading of abstracts
@Michael Soegtrop just booted the 2014 machine, first try. The HW settings were part of the OVA:
(screenshot: the 2014 VM booted successfully)
Karl Palmskog said:
fine by me, but this was based on a quick reading of abstracts
done, for those wanting to interact on Twitter
@Paolo Giarrusso : possibly things have improved / matured. As I said, the trouble I had was with machines created in the 2000...2010 time frame. Still, it is hard to say how well an OVA works 20 years after its creation.
Sure, but let's not blame or suspect OVF for the problems you've seen — OVF was born around 2010 so it didn't even exist in most of that timeframe.
@Paolo Giarrusso : so what do you suggest? To supply ready to use Coq Platform OVF VMs?
I just meant "they're a tool that could be helpful", but I don't know if that fits with conference workflows. A Coq conference could just say "send a source artifact that works on this OVF", but a single artifact can be multi-language