for Coq I totally I agree this would be infeasible, but we are probably the furthest from a monorepo assumption of all proof assistants
I am not a big fan of a mono repo approach - I think it is either too stiff or the overall quality has to suffer. Projects in different stages of development need different source control / review / CI approaches and forcing many projects in different stages under the same rules is not good.
As I said, I would go the same approach as Coq platform - approach authors of interesting work and ask them if they agree to long term maintain their work under some umbrella project, so that others can profit from it more. Of cause this means work for something one can't publish as researcher - but at least it should generate some citations.
The umbrella project should provide reasonable infrastructure - a directory as you started it but also CI and easy ways to get the work (opam).
A good name for the umbrella project would be important. How about "Coq umbrella library" - logo being a Coq with a pile of books under an umbrella :-)
I agree that it would be nice to have more projects under one roof in many cases, but real-world Coq integration efforts in my experience are very difficult for both technical and social reasons. For example, it's not easy to configure CI to handle several packages. Often libraries were developed with research goals in mind, and integrating is not something that is novel or can be published (as you mentioned)
I think we need to provide more incentives for these integrations to happen. For example, we could set up automation to periodically announce new releases of projects on the Twitter account and elsewhere
and in particular, we need to fix the web frontend to the released
opam archive: https://coq.inria.fr/opam/www/
I have some ongoing research directions that will help this, but unfortunately, Coq blockchain verification is seemingly what brings home the bacon
I think we need to provide more incentives for these integrations to happen.
As industry worker I don't know what would work here. Are citations not also something researchers are after? How about some auto-cite feature which spits out the paper references in a convenient format for a Coq development using other packages? I guess this would not be that easy to implement, though - that is know which lemmas from which sources you used for some resulting lemma in a similar way as Coq can find the axioms a lemma uses. But I guess it wouldn't be entirely impossible either. Each module would need some sort of GUID which are inherited by all the lemmas defined in it and one could collect the GUIDs of the resulting proof term - at least as long as the lemmas are not unfolded.
Would this be an incentive for researchers to keep their work maintained?
@Michael Soegtrop indeed automation will help, but a comparison with Isabelle's AFP might be useful. The AFP:
In essence, I think they have lined up the academic incentives quite well. Many Isabelle researchers list their AFP contributions above or on equal footing with their regular publications.
Another aspect is that inclusion into the AFP means that the submitted version is maintained "forever" by Isabelle maintainers.
if one can set up some similar academic incentive game around the Coq platform, or the Coq umbrella library, this would probably be very helpful. Certainly, one can provide integration with Zenodo at least, which gives citable DOIs to artifacts. Ping @Théo Zimmermann that Coq 8.12.0 is not yet on Zenodo...
Yeah, sorry about that. I'll put it back up on my todo list
I was recently asked about the practice of listing AFP articles on ones CV. Of course, one should be proud of this, but I believe one should be careful to avoid the impression that one is listing articles twice.
Is it common in the Isabelle community to cite both a paper and the AFP entry?
I think they cite both, but they are usually cited in different contexts. For example, where one might point to a GitHub repo in a Coq research paper, they cite the AFP entry.
having Zenodo entries for everything (all Coq code accompanying some paper) would probably make mine and other Coq users' CVs look better in some intuitive sense, but not sure funders and potential employers would care in the end
Another aspect is that inclusion into the AFP means that the submitted version is maintained "forever" by Isabelle maintainers.
So the Isabelle maintainers do the work, not the authors? This doesn't look terribly scalable to me.
the authors can update their code with new lemmas / definitions and do resubmissions, but the code as submitted is maintained. The closest analogue for Coq is Coq's CI. I believe Coq's CI has quite a few more LOC than AFP (maybe 1+ million more), so it scales reasonably well if all you do is fix breaking changes.
Coq’s CI is bigger than AFP?
I thought that “basically all of Isabell code is in AFP, but not all of Coq”
(I’m not implying that it was an informed opinion, but I’m mangling something I saw here)
so I wonder what that means
https://www.isa-afp.org/statistics.html --- currently around 2,5 million LOC. Coq's CI had around 2,2 million by August 2019, as per Théo's thesis
we found around 14 million LOC of Coq code on GitHub (as reported by cloc
) when we did some naive mining maybe 2 years ago
thanks, but I was wondering more, dunno — does Coq have more users, more projects, a bigger community? or are coq proofs longer because of less automation? or ... ?
but sorry, this is OT and mine was just curiosity
Coq has a bigger community than Isabelle for sure, and its projects are also more active. The AFP is mostly mainteinance-only.
Some coq proofs are very verbose. Russell OConnor once proposed to count the size of zipped files.
this is not to say that this is in itself better, Isabelle devs have chosen a certain model they want to handle projects in
the example I like to throw around (mostly for humor) is that according to a 2020 report on programming language use on GitHub by an independent firm, Coq is now more popular on GitHub than Standard ML
Which reminds me, Derek Dreyer just called SML a dead language, to my surprise
(on a SIGPLAN interview)
OTOH, the people who'd disagree are not that many :-)
Back to @Michael Soegtrop 's question
well, it [SML] will still be maintained forever, most likely, there are very high-profile projects that will be around, and compilers are good
(deleted)
getting citations from users is _one_ strategy, but I guess it’s not so common
in my view, CakeML is, or will be, the true successor to SML (or at least I hope so)
academic projects are either throwaways _or_ require significant maintenance effort.
there are inbetweeners for sure
not that you should ask _me_ what you need for academic success, but there are few projects that get lots of users (after lots of effort), and for most projects, I guess, maintenance wouldn’t help that much.
ah yes, you're a civilian now
I would argue industry is full of throwaways as well
Sure, I guess, but “ship a 100 software projects with 0 total users” can be closer to success for an academic than for industrial code (and I’m not trying to be offensive)
I mean, none of my academic code was meant to have users
(in fact, getting your stuff to be used seems an easier option with proof assistants than in mainstream PL)
the correct way to think about research artifacts, in my view, is that they should be "prepared for" being used by other researchers trying to validate or extend your research
for example, a tarball hosted on someone's website without a readme is probably the worst - it will in all likelihood disappear soon, and even if someone cached it they don't know how to build it or even if it builds at all
Open data, open science
no argument from me — I’ve been on Artifact Evaluation Committees. But that just means you can repeat the exact same result; for coq-community, I guess you expect code to be maintained and updated?
for many/most POPL papers it might not make sense to _update_ the proofs, but proper packaging is still needed. (At least tell me which coq version you used ^^)
(and I wasn’t subtweeting anybody here)
I think there is a taxonomy or continous scale to maintenance:
coq-community has all of these, but more towards the first. The Coq platform and an umbrella library is arguably aimed at making the first two more feasible
given that it's generally possible to port most plain Coq projects from 10 years ago to stable Coq, I think the PL/FM community should try to invest more resources into at least basic maintenance of research artifacts. Otherwise, people have many more opportunities to reinvent the wheel and to repackage old hat. And at the very least, the artifacts should be easily accessible. For example, I don't even know if (Coq) POPL artifacts from 3 years ago are still available via the web.
I know for sure that plenty of artifacts over the years have been actively withheld from any kind of inspection by the community (except artifact reviewers). In my view, any value from reading an ITP-related paper is significantly lessened without the artifact
I see stdpp is not included in platform. @Ralf Jung @robbert is this intended?
https://github.com/MSoegtropIMC/coq-platform/issues/4
@robbert Great! Do you have an opinion on the umbrella initiative wrt stdlibpp ?
I find the proposal here a bit too preliminary to say anything useful. Or is there more information elsewhere?
(I am also not sure I understand this thread, it starts with a post that contains a quote, https://coq.zulipchat.com/#narrow/stream/237655-Miscellaneous/topic/Umbrella.20library.20for.20Coq/near/208210104, but I have no idea where that quote refers to, so there is probably some prior conversation that I am unaware of.)
Maybe @Michael Soegtrop can elaborate. My understanding is it is part of the ongoing discussion on the spectrum of common libraries from stdlib to independent opam libraries, or even random github repos.
I understand umbrella to have similar goals as extlib, stdlibpp, stdlib2, ...
I believe a coherent proposal is still missing.
quote refers to here: https://coq.zulipchat.com/#narrow/stream/237655-Miscellaneous/topic/Formalizing.20undergraduate.20mathematics/near/208199545
My suggestion of an "umbrella library" refered to @Karl Palmskog 's wiki entry "https://github.com/coq-community/manifesto/wiki/List-of-disused-formalized-mathematics-in-Coq".
So it is about individual formalizations of specific theorems, say the proof of the AKS prime algorithm in the above link.
I think such developments are too small to handle them with a mechanism like the Coq platform but too precsious to let them bit rot.
How this is best handled I can't say - that's what we are discussing here. The minimum should be:
build the "umbrella library" components with each new release of Coq (not in CI) and do something to fix issues - simple stuff immediately and more complicated things by informing the author. I explicitly don't suggest to put this into Coq's CI.
maybe review things every other years and see if the interface is still appropriate.
thanks for clarifying. I would argue this repo is already a sort of mini-umbrella Coq library: https://github.com/jmadiot/coq100 (but without CI or packaging, etc.)
thanks for clarifying. I would argue this repo is already a sort of mini-umbrella-Coq library: https://github.com/jmadiot/coq100 (but without CI or packaging, etc.)
Definitely a starting point! I don't think we need a fully automated CI since it should anyway only run with each Coq release.
ideally, the umbrella library maintainer would be a math-oriented researcher. I have no intuition at all on what math is considered good/bad/useful/trivial/worth preserving
There is a tachnical side to it and a curating side to it. I guess there should be some sort of committee to decide what goes into the Coq umbrella library (and is saved from bit rot) and what not, possibly with a quarterly discussion meeting.
One could separate the maintainer work from the assessment about what is interesting maths.
It should be easier to find a committee of knowledgeable mathematicians than to also find a maintainer at the same time.
Isn't this coq100
project a good candidate for inclusion in coq-community?
yes, I opened an issue in the project itself: https://github.com/jmadiot/coq100/issues/4
(since the author did not show interest before here: https://github.com/coqtail/coqtail/issues/1 I didn't go directly to a manifesto issue)
maybe we could reach out through some other channel if no response for a week?
As one of the coauthors of coqtail I should have reacted a tad more :/
I can ping JM on a side-channel on which he might be more active
Last updated: Oct 13 2024 at 01:02 UTC