@Pierre-Marie Pédrot @Jason Gross : I have an odd issue with Coq-rewriter and Coq 8.16.1: on an Intel Mac it builds fine with 32 MB stack (
ulimit -s 32768). On an Apple silicon Mac it does not build with almost twice as much stack. Unfortunately macOS has a hard limit of 65520kB stack size (seriously?). I expect that I am using native ARM Ocaml (it is definitely much faster than the Intel Mac, so I don't think it messed it up in a way which makes me work with Rosetta). Any thoughts?
Btw.: I wanted to also supply an Apple Silicon Coq Platform installer - but I might have to drop rewrite and fiat-crypto for this reason.
P.S.: I CCed Pierre-Marie above because I thought there is also a difference between 8.16.0 and 8.16.1 in that (I remember some discussions about this), but this does not seem to be the case.
Ew, really? I'm happy to add a
SKIP_ENORMOUS_STACK variable to the Makefile that disables the build of the things that need that much stack. If I'm recalling correctly, it's only solinas_reduction that needs this much stack? Is that right, or are there other targets that fail?
I think we'd wrap https://github.com/mit-plv/fiat-crypto/blob/d72df784a772ea179243dde8a3ac04c59e3ce70d/Makefile.examples#L123 and the addition of
solinas_reduction to https://github.com/mit-plv/fiat-crypto/blob/d72df784a772ea179243dde8a3ac04c59e3ce70d/Makefile.config#L53 in something like
ifneq ($(SKIP_ENORMOUS_STACK),1) ...
Another alternative is that we could possibly set up the ARM build to use Haskell rather than ocamlopt to build extracted code? Idk if that's any better?
And if you can let me know what's failing, it might be worth opening an issue on the OCaml bug tracker that compiling this file takes too much stack and ocamlopt should be more efficient, or at least have a mode for compiling with less stack?
@Jason Gross It is this file which fails at 64M stack size:
File "./src/Rewriter/Demo.v", line 20, characters 0-21: Error: Stack overflow.
but afair it was a different one with 32M - maybe also a randomness in parallel build - I am rerunning it at 32M.
Line 20 in Demo.v is
Search, you should report an issue on the Coq bug tracker requesting that
Search be more tail recursive if need be
In the meantime you can just patch out that line?
Hmm OK - a
Search (?x + 0 = ?x). really looks rather innocent.
Should I bisect the stack size needed on Intel for this?
We have stack overflows problems on searches like that one using JScoq since many years.
The easy way out is to limit the search to a module, so that the number of results is smaller, eg
Search ... inside module_name.
Fixing the issue in Coq would surely be better, just wanted to share a quick work around.
@Enrico Tassi : can you imagine why ARM would take 2x the amount of stack than Intel?
@Jason Gross : yes sure I can patch this out. Let's see how far I get with Fiat crypto then ...
Michael Soegtrop said:
Enrico Tassi : can you imagine why ARM would take 2x the amount of stack than Intel?
nope, CC @Gabriel Scherer which surely knows better than me if this is a known issue. Which OCaml version are we talking about BTW?
@Jason Gross : With the patch I get:
File "./src/Rewriter/Language/IdentifiersLibraryProofs.v", line 285, characters 12-29: Error: Stack overflow.
That is https://github.com/mit-plv/rewriter/blob/d644f09c90513b0837c2f9f54fd0a19315ee31dd/src/Rewriter/Language/IdentifiersLibraryProofs.v#LL285C13-L285C29 which is
None of the tactics here are even recursive (they're all defined right above that tactic), so I don't see what could be taking up stack space. Could you get a stack trace?
But it's starting to sound like OCaml / Coq on ARM aren't ready for serious development. These aren't even the parts of rewriter and Fiat Crypto that are known to stress various bits of the Coq toolchain
@Jason Gross : I guess it makes more sense to not ship rewriter and fiat-crypto with the ARM DMG (which I would anyway call experimental). Of course I will include it in the Intel DMG. This looks like we can't solve this between 12 and lunch.
Possibly I should try a newer OCaml ...
@Jason Gross : actually good news: for testing I put the stack size back to 32M. When I use 64M, coq rewriter and coq fiat-crypto run through :-)
So I will include them in the Apple silicon DMG
nope, CC @Gabriel Scherer which surely knows better than me if this is a known issue.
Stack usage is a system-dependent property, for example some platform may expect call-frame alignment guarantees that artificially inflate stack usage on small non-tail recursive functions. Reporting 2x stack usage differences in corner cases is not terribly surprising.
Note that OCaml 5 does not use the system stack for OCaml calls anymore, so this category of issue will magically disappear.
(OCaml 5 is still somewhat experimental, of course.)
(Minor note: having everything in a single Zulip thread made it a bit difficult for me to follow this sub-conversation, I would welcome a new topic if there is more about this specific issue.)
24 messages were moved here from #Coq Platform devs & users > 2022.09+beta1 release by Karl Palmskog.
FWIW, a :+1: from me on "ARM Macs need more stack", even if it's usually fine; there were some documented improvements between 4.10.2 and 4.14, and it was worse early on.
@Paolo Giarrusso : are you aware of improvements between 4.13.1 and 4.14.x?
Tail-call with up to 64 arguments are now guaranteed to be optimized for all architectures.
(all I've seen was listed for https://ocaml.org/releases/4.14.0)
they also list https://github.com/ocaml/ocaml/pull/10549 but that one was also cherry-picked back to 4.13
Thanks! I consider switching to 4.14. Are there known issues around Coq?
@Paolo Giarrusso : since fiat-crypto compiles fine with 64 MB on arm, I refrained from making such a rather large change that late. But I will do it first thing after the release.
Last updated: Jun 03 2023 at 04:30 UTC