Stream: Miscellaneous

Topic: Umbrella library for Coq


view this post on Zulip Michael Soegtrop (Aug 27 2020 at 12:34):

for Coq I totally I agree this would be infeasible, but we are probably the furthest from a monorepo assumption of all proof assistants

I am not a big fan of a mono repo approach - I think it is either too stiff or the overall quality has to suffer. Projects in different stages of development need different source control / review / CI approaches and forcing many projects in different stages under the same rules is not good.

As I said, I would go the same approach as Coq platform - approach authors of interesting work and ask them if they agree to long term maintain their work under some umbrella project, so that others can profit from it more. Of cause this means work for something one can't publish as researcher - but at least it should generate some citations.

The umbrella project should provide reasonable infrastructure - a directory as you started it but also CI and easy ways to get the work (opam).

A good name for the umbrella project would be important. How about "Coq umbrella library" - logo being a Coq with a pile of books under an umbrella :-)

view this post on Zulip Karl Palmskog (Aug 27 2020 at 12:53):

I agree that it would be nice to have more projects under one roof in many cases, but real-world Coq integration efforts in my experience are very difficult for both technical and social reasons. For example, it's not easy to configure CI to handle several packages. Often libraries were developed with research goals in mind, and integrating is not something that is novel or can be published (as you mentioned)

view this post on Zulip Karl Palmskog (Aug 27 2020 at 13:11):

I think we need to provide more incentives for these integrations to happen. For example, we could set up automation to periodically announce new releases of projects on the Twitter account and elsewhere

view this post on Zulip Karl Palmskog (Aug 27 2020 at 13:13):

and in particular, we need to fix the web frontend to the released opam archive: https://coq.inria.fr/opam/www/

view this post on Zulip Karl Palmskog (Aug 27 2020 at 13:15):

I have some ongoing research directions that will help this, but unfortunately, Coq blockchain verification is seemingly what brings home the bacon

view this post on Zulip Michael Soegtrop (Aug 27 2020 at 14:39):

I think we need to provide more incentives for these integrations to happen.

As industry worker I don't know what would work here. Are citations not also something researchers are after? How about some auto-cite feature which spits out the paper references in a convenient format for a Coq development using other packages? I guess this would not be that easy to implement, though - that is know which lemmas from which sources you used for some resulting lemma in a similar way as Coq can find the axioms a lemma uses. But I guess it wouldn't be entirely impossible either. Each module would need some sort of GUID which are inherited by all the lemmas defined in it and one could collect the GUIDs of the resulting proof term - at least as long as the lemmas are not unfolded.

Would this be an incentive for researchers to keep their work maintained?

view this post on Zulip Karl Palmskog (Aug 27 2020 at 14:42):

@Michael Soegtrop indeed automation will help, but a comparison with Isabelle's AFP might be useful. The AFP:

In essence, I think they have lined up the academic incentives quite well. Many Isabelle researchers list their AFP contributions above or on equal footing with their regular publications.

Another aspect is that inclusion into the AFP means that the submitted version is maintained "forever" by Isabelle maintainers.

view this post on Zulip Karl Palmskog (Aug 27 2020 at 14:47):

if one can set up some similar academic incentive game around the Coq platform, or the Coq umbrella library, this would probably be very helpful. Certainly, one can provide integration with Zenodo at least, which gives citable DOIs to artifacts. Ping @Théo Zimmermann that Coq 8.12.0 is not yet on Zenodo...

view this post on Zulip Théo Zimmermann (Aug 27 2020 at 14:48):

Yeah, sorry about that. I'll put it back up on my todo list

view this post on Zulip Bas Spitters (Aug 27 2020 at 14:53):

I was recently asked about the practice of listing AFP articles on ones CV. Of course, one should be proud of this, but I believe one should be careful to avoid the impression that one is listing articles twice.
Is it common in the Isabelle community to cite both a paper and the AFP entry?

view this post on Zulip Karl Palmskog (Aug 27 2020 at 14:54):

I think they cite both, but they are usually cited in different contexts. For example, where one might point to a GitHub repo in a Coq research paper, they cite the AFP entry.

view this post on Zulip Karl Palmskog (Aug 27 2020 at 14:56):

having Zenodo entries for everything (all Coq code accompanying some paper) would probably make mine and other Coq users' CVs look better in some intuitive sense, but not sure funders and potential employers would care in the end

view this post on Zulip Michael Soegtrop (Aug 27 2020 at 15:27):

Another aspect is that inclusion into the AFP means that the submitted version is maintained "forever" by Isabelle maintainers.

So the Isabelle maintainers do the work, not the authors? This doesn't look terribly scalable to me.

view this post on Zulip Karl Palmskog (Aug 27 2020 at 15:33):

the authors can update their code with new lemmas / definitions and do resubmissions, but the code as submitted is maintained. The closest analogue for Coq is Coq's CI. I believe Coq's CI has quite a few more LOC than AFP (maybe 1+ million more), so it scales reasonably well if all you do is fix breaking changes.

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:16):

Coq’s CI is bigger than AFP?

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:16):

I thought that “basically all of Isabell code is in AFP, but not all of Coq”

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:17):

(I’m not implying that it was an informed opinion, but I’m mangling something I saw here)

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:17):

so I wonder what that means

view this post on Zulip Karl Palmskog (Aug 27 2020 at 20:18):

https://www.isa-afp.org/statistics.html --- currently around 2,5 million LOC. Coq's CI had around 2,2 million by August 2019, as per Théo's thesis

view this post on Zulip Karl Palmskog (Aug 27 2020 at 20:19):

we found around 14 million LOC of Coq code on GitHub (as reported by cloc) when we did some naive mining maybe 2 years ago

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:19):

thanks, but I was wondering more, dunno — does Coq have more users, more projects, a bigger community? or are coq proofs longer because of less automation? or ... ?

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:20):

but sorry, this is OT and mine was just curiosity

view this post on Zulip Karl Palmskog (Aug 27 2020 at 20:20):

Coq has a bigger community than Isabelle for sure, and its projects are also more active. The AFP is mostly mainteinance-only.

view this post on Zulip Bas Spitters (Aug 27 2020 at 20:21):

Some coq proofs are very verbose. Russell OConnor once proposed to count the size of zipped files.

view this post on Zulip Karl Palmskog (Aug 27 2020 at 20:21):

this is not to say that this is in itself better, Isabelle devs have chosen a certain model they want to handle projects in

view this post on Zulip Karl Palmskog (Aug 27 2020 at 20:22):

the example I like to throw around (mostly for humor) is that according to a 2020 report on programming language use on GitHub by an independent firm, Coq is now more popular on GitHub than Standard ML

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:23):

Which reminds me, Derek Dreyer just called SML a dead language, to my surprise

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:23):

(on a SIGPLAN interview)

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:23):

OTOH, the people who'd disagree are not that many :-)

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:23):

Back to @Michael Soegtrop 's question

view this post on Zulip Karl Palmskog (Aug 27 2020 at 20:24):

well, it [SML] will still be maintained forever, most likely, there are very high-profile projects that will be around, and compilers are good

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:24):

(deleted)

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:24):

getting citations from users is _one_ strategy, but I guess it’s not so common

view this post on Zulip Karl Palmskog (Aug 27 2020 at 20:24):

in my view, CakeML is, or will be, the true successor to SML (or at least I hope so)

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:25):

academic projects are either throwaways _or_ require significant maintenance effort.

view this post on Zulip Karl Palmskog (Aug 27 2020 at 20:26):

there are inbetweeners for sure

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:26):

not that you should ask _me_ what you need for academic success, but there are few projects that get lots of users (after lots of effort), and for most projects, I guess, maintenance wouldn’t help that much.

view this post on Zulip Karl Palmskog (Aug 27 2020 at 20:27):

ah yes, you're a civilian now

view this post on Zulip Karl Palmskog (Aug 27 2020 at 20:27):

I would argue industry is full of throwaways as well

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:30):

Sure, I guess, but “ship a 100 software projects with 0 total users” can be closer to success for an academic than for industrial code (and I’m not trying to be offensive)

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:30):

I mean, none of my academic code was meant to have users

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:31):

(in fact, getting your stuff to be used seems an easier option with proof assistants than in mainstream PL)

view this post on Zulip Karl Palmskog (Aug 27 2020 at 20:33):

the correct way to think about research artifacts, in my view, is that they should be "prepared for" being used by other researchers trying to validate or extend your research

view this post on Zulip Karl Palmskog (Aug 27 2020 at 20:35):

for example, a tarball hosted on someone's website without a readme is probably the worst - it will in all likelihood disappear soon, and even if someone cached it they don't know how to build it or even if it builds at all

view this post on Zulip Bas Spitters (Aug 27 2020 at 20:36):

Open data, open science

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:37):

no argument from me — I’ve been on Artifact Evaluation Committees. But that just means you can repeat the exact same result; for coq-community, I guess you expect code to be maintained and updated?

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:39):

for many/most POPL papers it might not make sense to _update_ the proofs, but proper packaging is still needed. (At least tell me which coq version you used ^^)

view this post on Zulip Paolo Giarrusso (Aug 27 2020 at 20:39):

(and I wasn’t subtweeting anybody here)

view this post on Zulip Karl Palmskog (Aug 27 2020 at 20:39):

I think there is a taxonomy or continous scale to maintenance:

view this post on Zulip Karl Palmskog (Aug 27 2020 at 20:40):

coq-community has all of these, but more towards the first. The Coq platform and an umbrella library is arguably aimed at making the first two more feasible

view this post on Zulip Karl Palmskog (Aug 27 2020 at 20:42):

given that it's generally possible to port most plain Coq projects from 10 years ago to stable Coq, I think the PL/FM community should try to invest more resources into at least basic maintenance of research artifacts. Otherwise, people have many more opportunities to reinvent the wheel and to repackage old hat. And at the very least, the artifacts should be easily accessible. For example, I don't even know if (Coq) POPL artifacts from 3 years ago are still available via the web.

view this post on Zulip Karl Palmskog (Aug 27 2020 at 20:49):

I know for sure that plenty of artifacts over the years have been actively withheld from any kind of inspection by the community (except artifact reviewers). In my view, any value from reading an ITP-related paper is significantly lessened without the artifact

view this post on Zulip Bas Spitters (Aug 29 2020 at 09:42):

I see stdpp is not included in platform. @Ralf Jung @robbert is this intended?

view this post on Zulip Robbert Krebbers (Aug 29 2020 at 09:52):

https://github.com/MSoegtropIMC/coq-platform/issues/4

view this post on Zulip Bas Spitters (Aug 29 2020 at 10:09):

@robbert Great! Do you have an opinion on the umbrella initiative wrt stdlibpp ?

view this post on Zulip Robbert Krebbers (Aug 29 2020 at 10:12):

I find the proposal here a bit too preliminary to say anything useful. Or is there more information elsewhere?

view this post on Zulip Robbert Krebbers (Aug 29 2020 at 10:14):

(I am also not sure I understand this thread, it starts with a post that contains a quote, https://coq.zulipchat.com/#narrow/stream/237655-Miscellaneous/topic/Umbrella.20library.20for.20Coq/near/208210104, but I have no idea where that quote refers to, so there is probably some prior conversation that I am unaware of.)

view this post on Zulip Bas Spitters (Aug 29 2020 at 10:24):

Maybe @Michael Soegtrop can elaborate. My understanding is it is part of the ongoing discussion on the spectrum of common libraries from stdlib to independent opam libraries, or even random github repos.
I understand umbrella to have similar goals as extlib, stdlibpp, stdlib2, ...
I believe a coherent proposal is still missing.

view this post on Zulip Karl Palmskog (Aug 29 2020 at 10:34):

quote refers to here: https://coq.zulipchat.com/#narrow/stream/237655-Miscellaneous/topic/Formalizing.20undergraduate.20mathematics/near/208199545

view this post on Zulip Michael Soegtrop (Aug 29 2020 at 13:18):

My suggestion of an "umbrella library" refered to @Karl Palmskog 's wiki entry "https://github.com/coq-community/manifesto/wiki/List-of-disused-formalized-mathematics-in-Coq".

So it is about individual formalizations of specific theorems, say the proof of the AKS prime algorithm in the above link.

I think such developments are too small to handle them with a mechanism like the Coq platform but too precsious to let them bit rot.

How this is best handled I can't say - that's what we are discussing here. The minimum should be:

view this post on Zulip Karl Palmskog (Aug 29 2020 at 13:19):

thanks for clarifying. I would argue this repo is already a sort of mini-umbrella Coq library: https://github.com/jmadiot/coq100 (but without CI or packaging, etc.)

view this post on Zulip Michael Soegtrop (Aug 29 2020 at 13:26):

thanks for clarifying. I would argue this repo is already a sort of mini-umbrella-Coq library: https://github.com/jmadiot/coq100 (but without CI or packaging, etc.)

Definitely a starting point! I don't think we need a fully automated CI since it should anyway only run with each Coq release.

view this post on Zulip Karl Palmskog (Aug 29 2020 at 13:31):

ideally, the umbrella library maintainer would be a math-oriented researcher. I have no intuition at all on what math is considered good/bad/useful/trivial/worth preserving

view this post on Zulip Michael Soegtrop (Aug 29 2020 at 13:38):

There is a tachnical side to it and a curating side to it. I guess there should be some sort of committee to decide what goes into the Coq umbrella library (and is saved from bit rot) and what not, possibly with a quarterly discussion meeting.

view this post on Zulip Bas Spitters (Aug 29 2020 at 13:56):

One could separate the maintainer work from the assessment about what is interesting maths.
It should be easier to find a committee of knowledgeable mathematicians than to also find a maintainer at the same time.

view this post on Zulip Théo Zimmermann (Aug 29 2020 at 14:31):

Isn't this coq100 project a good candidate for inclusion in coq-community?

view this post on Zulip Karl Palmskog (Aug 29 2020 at 14:33):

yes, I opened an issue in the project itself: https://github.com/jmadiot/coq100/issues/4
(since the author did not show interest before here: https://github.com/coqtail/coqtail/issues/1 I didn't go directly to a manifesto issue)

view this post on Zulip Karl Palmskog (Aug 29 2020 at 14:35):

maybe we could reach out through some other channel if no response for a week?

view this post on Zulip Pierre-Marie Pédrot (Aug 29 2020 at 14:46):

As one of the coauthors of coqtail I should have reacted a tad more :/

view this post on Zulip Pierre-Marie Pédrot (Aug 29 2020 at 14:47):

I can ping JM on a side-channel on which he might be more active


Last updated: Aug 19 2022 at 20:03 UTC