Stream: Coq Platform devs & users

Topic: LOC size of 2021.09 and later


view this post on Zulip Karl Palmskog (Sep 30 2021 at 12:29):

for funding and proof engineering purposes, could we compute some measure of the size (most easily, in terms of LOC) of the platform releases? The easiest approach would probably be to use cloc, which has OCaml and Coq support and can be invoked in the root directory of an arbitrarily deep directory structure.

view this post on Zulip Karl Palmskog (Sep 30 2021 at 12:33):

motivation: Paulson and his collaborators often claim that the Isabelle AfP is "the largest repository of mechanised proofs" (e.g., here)

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 12:33):

Yes, computing the locs wouldn't be hard. E.g. the script which creates the smoke test kit retrieves the sources of all primary packages via opam. I could use the same script and count locs there.

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 12:34):

Give me a few minutes ...

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 12:46):

Here you go

source-statistics$ cloc *
    9081 text files.
    7453 unique files.
    2447 files ignored.

github.com/AlDanial/cloc v 1.90  T=13.18 s (505.7 files/s, 195677.7 lines/s)
-----------------------------------------------------------------------------------
Language                         files          blank        comment           code
-----------------------------------------------------------------------------------
Coq                               4602         233957         158330        1738065
OCaml                              876          24994          31061         111873
C                                  351          15523          18845          87204
TeX                                 81           6197           1946          44689
Bourne Shell                        77           2525           2589          17908
Markdown                           168           4184              0          16952
C/C++ Header                       115           3057           7425           9240
SVG                                  2              1              1           8539
make                               108           1617            973           5103
HTML                                10            122            105           4166
C++                                 10            550            900           3117
YAML                                54            460            208           2867
Assembly                            60            362           2562           1641
CSS                                 12            313            100           1247
Python                              17            231            145           1180
Lisp                                11            181            112           1018
Nix                                 31            127            154            820
Bourne Again Shell                  24            126            231            633
JavaScript                          12            100            232            521
XML                                  1              3              0            296
Haskell                              3             48              5            267
Verilog-SystemVerilog               11             28              0            237
m4                                   1             36              4            233
Lua                                  2             25             23            224
Perl                                 3             47             32            214
JSON                                 2              0              0            176
awk                                  3              5              1             93
Dockerfile                           5              9              0             68
INI                                  1              0              0             38
sed                                  4              1             25             38
Bazel                                1             18              0             35
DOS Batch                            5             10              0             24
diff                                 1              3             10             11
C Shell                              1              0              0              4
TOML                                 1              0              3              3
reStructuredText                     1             32            127              2
-----------------------------------------------------------------------------------
SUM:                              6667         294892         226149        2058746
-----------------------------------------------------------------------------------

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 12:47):

I will commit a script which just creates a folder with all sources (in subfolders for each package).

The above is for 2021.09 8.13

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 12:48):

This includes all packages which have been explicitly installed via opam (--installed-roots).

view this post on Zulip Karl Palmskog (Sep 30 2021 at 12:48):

OK, so bottom line:

view this post on Zulip Gaëtan Gilbert (Sep 30 2021 at 12:49):

shouldn't there be more rst if it includes coq?

view this post on Zulip Karl Palmskog (Sep 30 2021 at 12:49):

does it include Coq itself? I thought not.

view this post on Zulip Guillaume Melquiond (Sep 30 2021 at 12:50):

With 876 OCaml files, I suppose it does.

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 12:51):

Ehm, sorry, Coq itself is not included - I filtered for coq-. So it also doesn't include things like gappa or elpi.

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 12:52):

A moment ...

view this post on Zulip Karl Palmskog (Sep 30 2021 at 12:52):

in comparison, Isabelle AfP is currently ~3.1M LOC: https://www.isa-afp.org/statistics.html

view this post on Zulip Théo Zimmermann (Sep 30 2021 at 12:54):

Do you know how they compute their numbers of LOC (e.g. do they eliminate comments and blank lines)?

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 12:56):

Should I keep CoqIDE, elpi, gappa, menhir?

view this post on Zulip Karl Palmskog (Sep 30 2021 at 12:57):

Théo Zimmermann said:

Do you know how they compute their numbers of LOC (e.g. do they eliminate comments and blank lines)?

don't know unfortunately, but pretty sure they eliminate blank lines at least, https://link.springer.com/chapter/10.1007/978-3-319-20615-8_1

“code” refers to definitions and proofs (including comments but not empty lines)

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 12:59):

OK, here is an update which adds coq, coqide, gappa, elpi and menhir:

coq-platform-2021.09$ cloc source-statistics/
   19061 text files.
   12413 unique files.
    8261 files ignored.

github.com/AlDanial/cloc v 1.90  T=16.34 s (664.2 files/s, 201435.4 lines/s)
-----------------------------------------------------------------------------------
Language                         files          blank        comment           code
-----------------------------------------------------------------------------------
Coq                               7117         278890         185598        1962558
OCaml                             1939          66361          72064         340015
C                                  358          15741          19047          89299
TeX                                107           7342           2302          50496
Markdown                           208           6345              0          24638
reStructuredText                   109          14088          17064          21499
Bourne Shell                       235           3408           3091          21047
C++                                 37           1509           1540          11485
C/C++ Header                       139           3382           7757          10562
SVG                                  2              1              1           8539
make                               119           1764           1085           5918
HTML                                17            193            105           4405
Python                              44           1203           1483           4043
YAML                                59            594            279           3825
CSS                                 24            549            256           2606
Assembly                            60            362           2562           1641
Lisp                                13            240            156           1242
Nix                                 61            196            186           1180
m4                                   4            270             23           1168
diff                                 3              9             27            984
XML                                  5             29              7            876
Standard ML                         67            285              0            818
NAnt script                          1            215              0            712
Bourne Again Shell                  25            132            233            681
JavaScript                          15            118            264            607
CSV                                  1              0              0            501
Verilog-SystemVerilog               19             34              0            269
Haskell                              3             48              5            267
ANTLR Grammar                       33             59             16            225
Lua                                  2             25             23            224
Perl                                 3             47             32            214
JSON                                 2              0              0            176
Dockerfile                           6             25             22            104
awk                                  3              5              1             93
DOS Batch                            6             37             12             90
INI                                  1              0              0             38
sed                                  4              1             25             38
Bazel                                1             18              0             35
C Shell                              1              0              0              4
TOML                                 1              0              3              3
Windows Resource File                1              0              0              1
-----------------------------------------------------------------------------------
SUM:                             10855         403525         315269        2573126
-----------------------------------------------------------------------------------

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 12:59):

If you want something else, please give my your package name regexp :-)

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 13:04):

I guess all in all it should be comparable - this is not all of Coq (e.g. MetaCoq is not yet in).

view this post on Zulip Karl Palmskog (Sep 30 2021 at 13:05):

well, it at least shows that released for 8.13 is very likely larger than AfP right now (2M LOC platform, then a lot of other stuff)

view this post on Zulip Enrico Tassi (Sep 30 2021 at 13:07):

So, you really want me to open a PR to add .elpi support to cloc...

view this post on Zulip Enrico Tassi (Sep 30 2021 at 13:09):

Perl... maybe not today

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 13:18):

@Karl Palmskog : it wouldn't be hard to run cloc over everything in coq-released, it just would take a while.

Indeed I believe that Coq Platform is only a rather small part of it. But then for fairness I think one should only count stuff which compiles with 8.13.

view this post on Zulip Karl Palmskog (Sep 30 2021 at 13:45):

if we wanted to measure "all" of released for 8.13, an alternative to diving into complex opam search parameters would be to use this as a guide: https://coq-bench.github.io/clean/Linux-x86_64-4.12.0-2.0.8/released/

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 14:52):

@Karl Palmskog : this doesn't look accurate. E.g. coq-aac-tactics version 8.13.0 is compatible with Coq 8.13.2, but the table says "NC" - if I interpret it right.

view this post on Zulip Guillaume Melquiond (Sep 30 2021 at 14:56):

That is because it is marked as incompatible with OCaml 4.12.0.

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 14:59):

@Guillaume Melquiond thanks! This looks better: (https://coq-bench.github.io/clean/Linux-x86_64-4.10.2-2.0.6/released/)

@Karl Palmskog: Can one have the contents as e.g. XML?

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 15:00):

There seems to be a lot of stuff which stopped at 8.10 ...

view this post on Zulip Enrico Tassi (Sep 30 2021 at 15:14):

CC @Guillaume Claret

view this post on Zulip Guillaume Claret (Sep 30 2021 at 15:35):

So I have this JS snippet to get the list of working packages for a given Coq version:

console.log(JSON.stringify(
  [...document.querySelectorAll("td.success a")]
    .map(a => a.href.slice(a.baseURI.length).split("/"))
    .filter(([coqVersion,,]) => coqVersion === "8.13.2")
    .map(([,name, version]) =>
      (name + "." + version.slice(0, version.length - ".html".length)).replaceAll("%2B", "~")
    ),
  null, 2
))

view this post on Zulip Guillaume Claret (Sep 30 2021 at 15:36):

where 8.13.2 should be replaced by the Coq version number

view this post on Zulip Guillaume Claret (Sep 30 2021 at 15:36):

it should be run into the JS console of the page https://coq-bench.github.io/clean/Linux-x86_64-4.10.2-2.0.6/released/ for example

view this post on Zulip Guillaume Claret (Sep 30 2021 at 15:37):

I have got the output:

[
  "aac-tactics.8.13.0",
  "addition-chains.0.4",
  "almost-full.8.13.0",
  "approx-models.1.0",
  "autosubst.1.7",
  "bbv.1.2",
  "bignums.8.13.0",
  "bits.1.1.0",
  "buchberger.8.13.0",
  "ceres.0.4.0",
  "chapar.8.13.0",
  "chick-blog.1.0.1",
  "coinduction.1.0",
  "color.1.8.1",
  "comp-dec-modal.1.0",
  "compcert.3.8",
  "compcert.3.9",
  "compcert-32.3.8",
  "compcert-32.3.9",
  "coq2html.1.0",
  "coq2html.1.1",
  "coq2html.1.2",
  "coqffi.1.0.0~beta5",
  "coqffi.1.0.0~beta6",
  "coqffi.1.0.0~beta7",
  "coqoban.8.13.0",
  "coqprime.1.0.6",
  "coquelicot.3.1.0",
  "coquelicot.3.2.0",
  "cunit.1.0.0",
  "deriving.0.1.0",
  "dijkstra.0.1.0",
  "dpdgraph.0.6.9",
  "elpi.1.8.1",
  "elpi.1.9.0",
  "elpi.1.9.1",
  "elpi.1.9.2",
  "elpi.1.9.3",
  "elpi.1.9.4",
  "elpi.1.9.5",
  "elpi.1.9.6",
  "elpi.1.9.7",
  "elpi.1.10.0",
  "elpi.1.10.1",
  "elpi.1.10.2",
  "elpi.1.10.3",
  "elpi.1.11.0",
  "equations.1.2.3~8.13",
  "equations.1.2.4~8.13",
  "equations.1.3~beta2~8.13",
  "equations.1.3~beta~8.13",
  "error-handlers.1.0.0",
  "error-handlers.1.1.0",
  "error-handlers.1.1.1",
  "error-handlers.1.2.0",
  "ext-lib.0.11.3",
  "extructures.0.2.2",
  "extructures.0.3.0",
  "flocq.3.2.1",
  "flocq.3.3.0",
  "flocq.3.3.1",
  "flocq.3.4.0",
  "flocq.3.4.1",
  "flocq.3.4.2",
  "fourcolor.1.2.3",
  "freespec-core.0.3",
  "freespec-exec.0.3",
  "freespec-ffi.0.3",
  "function-ninjas.1.0.0",
  "gaia.1.12",
  "gappa.1.4.5",
  "gappa.1.4.6",
  "geometric-algebra.0.8.12",
  "goedel.8.12.0",
  "goedel.8.13.0",
  "graph-theory.0.8",
  "graph-theory.0.9",
  "hammer.1.3.1~8.13",
  "hammer-tactics.1.3.1~8.13",
  "hierarchy-builder.1.0.0",
  "hierarchy-builder.1.1.0",
  "hierarchy-builder-shim.1.1.0",
  "high-school-geometry.8.13.0",
  "hott.8.13",
  "http.0.1.0",
  "huffman.8.13.0",
  "hydra-battles.0.4",
  "infotheo.0.3",
  "infotheo.0.3.1",
  "infotheo.0.3.2",
  "infotheo.0.3.3",
  "interval.4.1.0",
  "interval.4.1.1",
  "interval.4.2.0",
  "interval.4.3.0",
  "io.2.0.0",
  "io.2.1.0",
  "io.3.0.0",
  "io.3.1.0",
  "io.4.0.0",
  "io-hello-world.1.2.0",
  "io-list.1.0.0",
  "io-list.1.1.0",
  "io-system.2.1.0",
  "io-system.2.3.0",
  "io-system.2.4.1",
  "io-system-ocaml.2.3.1",
  "iris.3.4.0",
  "iris-heap-lang.3.4.0",
  "iris-string-ident.0.1.0",
  "itauto.8.13",
  "iterable.1.0.0",
  "itree.3.2.0",
  "itree.4.0.0",
  "itree-io.0.1.0",
  "jmlcoq.8.13.0",
  "jsast.2.0.0",
  "libhyps.1.0.1",
  "libhyps.1.0.2",
  "libhyps.2.0.1",
  "libhyps.2.0.2",
  "list-plus.1.0.0",
  "list-plus.1.1.0",
  "list-string.2.0.0",
  "list-string.2.1.0",
  "list-string.2.1.1",
  "list-string.2.1.2",
  "math-classes.8.13.0",
  "mathcomp-abel.1.0.0",
  "mathcomp-algebra.1.12.0",
  "mathcomp-analysis.0.3.5",
  "mathcomp-analysis.0.3.6",
  "mathcomp-analysis.0.3.7",
  "mathcomp-analysis.0.3.8",
  "mathcomp-analysis.0.3.9",
  "mathcomp-analysis.0.3.10",
  "mathcomp-bigenough.1.0.0",
  "mathcomp-character.1.12.0",
  "mathcomp-dioid.0.1",
  "mathcomp-field.1.12.0",
  "mathcomp-fingroup.1.12.0",
  "mathcomp-finmap.1.5.1",
  "mathcomp-multinomials.1.5.4",
  "mathcomp-odd-order.1.12.0",
  "mathcomp-real-closed.1.1.2",
  "mathcomp-solvable.1.12.0",
  "mathcomp-ssreflect.1.12.0",
  "mathcomp-zify.1.0.0~1.12~8.13",
  "menhirlib.20200525",
  "menhirlib.20200612",
  "menhirlib.20200619",
  "menhirlib.20200624",
  "menhirlib.20201122",
  "menhirlib.20201201",
  "menhirlib.20201214",
  "menhirlib.20201216",
  "menhirlib.20210310",
  "menhirlib.20210419",
  "metacoq-pcuic.1.0~beta2~8.13",
  "metacoq-template.1.0~beta2~8.13",
  "metacoq-translations.1.0~beta2~8.13",
  "min-imports.1.0.0",
  "min-imports.1.0.1",
  "min-imports.1.0.2",
  "moment.1.1.0",
  "moment.1.2.0",
  "monae.0.3",
  "monae.0.3.1",
  "monae.0.3.2",
  "monae.0.3.3",
  "monae.0.3.4",
  "mtac2.1.4~8.13",
  "ollibs.2.0.1",
  "ordinal.0.5.0",
  "paco.4.0.4",
  "paco.4.1.0",
  "paco.4.1.1",
  "paramcoq.1.1.2~coq8.13",
  "parsec.0.1.0",
  "pi-agm.1.2.6",
  "pocklington.8.12.0",
  "poltac.0.8.12",
  "qarith-stern-brocot.8.13.0",
  "quickchick.1.5.0",
  "record-update.0.3.0",
  "reduction-effects.0.1.2",
  "reglang.1.1.2",
  "relation-algebra.1.7.5",
  "robot.0.1",
  "rsa.8.8.0",
  "semantics.8.11.1",
  "simple-io.0.1",
  "simple-io.0.2",
  "simple-io.1.0.0",
  "simple-io.1.5.0",
  "smpl.8.13",
  "stalmarck.8.13.0",
  "stalmarck-tactic.8.13.0",
  "stdpp.1.5.0",
  "switch.1.0.5",
  "tactician.1.0~beta1~8.13",
  "tactician-dummy.1.0~beta1",
  "tactician-stdlib.1.0~beta1~8.13",
  "tlc.20210316",
  "topology.9.0.0",
  "tree-calculus.1.0.0",
  "type-infer.0.1.0",
  "unicoq.1.5~8.13",
  "unimath.20210807",
  "vst.2.7",
  "vst.2.7.1",
  "vst.2.8",
  "vst-32.2.7.1",
  "vst-32.2.8",
  "waterproof.1.0.0",
  "yalla.2.0.2",
  "zorns-lemma.9.0.0"
]

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 15:47):

@Guillaume Claret : thanks - works nicely! I am not sure, though, if long term I wouldn't prefer a wget of the HTML page + some XSLT.

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 15:52):

@Karl Palmskog : there are still some largish developments missing especially from US universities, say CertiCoq, Kami, Vellum, CertiKOS

view this post on Zulip Karl Palmskog (Sep 30 2021 at 15:55):

yes, tons of large projects are missing for sure (although I would not count CertikOS since it's private and proprietary). I think Awesome Coq has the top public ones.

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 16:07):

I didn't follow CertiKOS since I left automotive - didn't Zhong Shao plan to open source it at some point? I thought that was a prerequisite for the deepspec funding.

view this post on Zulip Karl Palmskog (Sep 30 2021 at 16:09):

well, if someone can point to a publicly inspectable CertikOS version, I will withdraw my "private and proprietary" comment above

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 16:27):

If you google it, you find e.g. http://flint.cs.yale.edu/certikos/publications/device/ or https://github.com/npe9/certikos. But this doesn't look endorsed if you ask me. The first link is to the flint group webpage, though.

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 16:29):

The license is not open source, though (Yale non commercial, I guess similar to CompCert).

view this post on Zulip Karl Palmskog (Sep 30 2021 at 16:34):

the npe9 link is seemingly for Coq 8.4, so that stretches "public" quite a bit.

view this post on Zulip Michael Soegtrop (Sep 30 2021 at 16:36):

That's what I meant with "looks not endorsed".


Last updated: Jan 30 2023 at 10:03 UTC