What kind of hardware are the benches running on? I just realized the reported timings are about 70% faster than what I can achieve locally.. and I thought my laptop was pretty fast
xeon E-2246G
it's a massive waste of cores as we run everything with -j 1 too
Thanks!
It seems I should be expecting a 20% difference on single thread workload. not 70%. Another mystery to investigate..
which timings are you looking at? eg the whole package or some particular commands?
On master that line takes ~24s for me.
Actually, it might be closer to 22s. The results are quite noisy. Still, not close enough to ~13s on the bench runner.
What's the source of the 20% figure, in what context is that, and does it account for caches and RAM speed?
And does the bench server run at base clock (3.6GHz) or some boost clock (up to 4.8GHz)?
*https://www.intel.com/content/www/us/en/products/sku/191043/intel-xeon-e2246g-processor-12m-cache-3-60-ghz/specifications.html
It's from synthetic benchmarks. My laptop has a i7-9750H. I can't find anything super reliable but the usual suspects report a 20% overall difference, some saying that's single-threaded and some saying that's multi-threaded workloads.
The CPUs are from the same year, same node, both are coffee lake.. the only difference is the higher boost clock of 4.8 versus 4.5
And the power limit, obviously.. but I doubt they'd make it tight enough to kill single thread performance on the laptop part.
@Paolo Giarrusso I am curious, how long does that line take on your apple hardware? You can just copy the contents of the link above and put Time
in front of the line 32.
about 16s on my laptop (i7-12700H) with master compiled with dune dev profile
but 13s with 8.17 compiled with release profile, and I don't think the change is because of the coq version
Oh, of course. I suppose the bench uses the release version?
yes
Okay, that explains everything. With the release profile I get roughly 14.5s which is well within the 20% estimate.
It might also explain why I often cannot replicate slowdowns from the bench results locally. If everything is already slow, a small slowdown could look even smaller in terms of percentages.
8.17 with the release profile is even faster at 13.7s.. unfortunately this also shows that I've been using an unoptimized build as my daily driver. My main coqtop binary takes 20s. :face_palm:
Maybe not.. it's just my custom OCAMLRUNPARAM
settings I usually use. Okay, I guess that solves all the remaining questions.
Last updated: Oct 13 2024 at 01:02 UTC