`@coqbot bench` · Coq devs & plugin devs

Stream: Coq devs & plugin devs

Topic: `@coqbot bench`

Ali Caglayan (Mar 21 2022 at 18:28):

As of writing, you should be able to do `@coqbot bench`. Please try it out and let me know how you find it.

Ali Caglayan (Mar 21 2022 at 18:32):

It just starts the bench for the moment. On my todo list are:

- Reporting the bench results
- Putting a little bench check tab at the bottom next to the other checks

Ali Caglayan (Mar 22 2022 at 00:56):

When you start a bench, the job should appear in the checks tab at the bottom. The link there will take you straight to the bench tab.

Ali Caglayan (Mar 23 2022 at 09:42):

As of yesterday you should be able to do:

- `@coqbot bench` to start a bench.
- A check status will appear at the bottom of the PR.
- If you click the "Details" link it will take you to a page reporting the progress of the bench in the checks tab.
- There is a link at the bottom that takes you to the GitLab job (I could probably add an easier link there too).
- Importantly, when the bench is finished, the tables will be reported on that page.

Hopefully soon we can get coqbot to post a comment too, but I need to work out a few more things before this is possible.

Ali Caglayan (Mar 23 2022 at 10:15):

Here is what this looks like at the moment: https://github.com/coq/coq/pull/14748/checks?check_run_id=5659421664

Pierre-Marie Pédrot (Mar 23 2022 at 11:59):

Is it normal that github displays the bench as waiting as long as it has not been launched?

Pierre-Marie Pédrot (Mar 23 2022 at 11:59):

it's annoying for most PRs since we don't want to run a bench in general

Pierre-Marie Pédrot (Mar 23 2022 at 12:00):

(this also means we can't have a green CI anymore without running the bench)

Ali Caglayan (Mar 23 2022 at 12:40):

No that shouldn't happen, I will look into it

Ali Caglayan (Mar 23 2022 at 12:41):

@Pierre-Marie Pédrot Have you got an example that you saw?

Ali Caglayan (Mar 23 2022 at 12:42):

I see that GitHub displays the bench as waiting, but it doesn't change the overall tick mark for me

Pierre-Marie Pédrot (Mar 23 2022 at 12:43):

https://github.com/coq/coq/pull/15849

Pierre-Marie Pédrot (Mar 23 2022 at 12:43):

it's yellow in the list of checks so it can't be green

Pierre-Marie Pédrot (Mar 23 2022 at 12:43):

(the green tick is about rebasability, not the CI)

Ali Caglayan (Mar 23 2022 at 12:45):

I can completely remove it from the checks tab or I can give it a neutral status

Ali Caglayan (Mar 23 2022 at 12:46):

I think it would still be useful to keep in the checks tab

Pierre-Marie Pédrot (Mar 23 2022 at 12:46):

neutral is fine I think

Pierre-Marie Pédrot (Mar 23 2022 at 12:46):

(as long as it doesn't appear as waiting somehow)

Ali Caglayan (Mar 23 2022 at 12:51):

Alright, I've changed it to be neutral, that should appear in around 5 min

Théo Zimmermann (Mar 23 2022 at 12:53):

As I've commented on one of your PRs (but you seem to have missed), you could also try using the action_required status (that we've never used so far, so I don't know how it looks).

Théo Zimmermann (Mar 23 2022 at 12:54):

The main issue that I can see with neutral is that so far it has been used only for failures in allow failure mode, and if we start putting it on every PR, people may stop noticing such failures.

Ali Caglayan (Mar 23 2022 at 13:30):

Hmm action_required was way more aggressive than I thought.

Ali Caglayan (Mar 23 2022 at 13:32):

It straight up fails the check

Ali Caglayan (Mar 23 2022 at 13:32):

https://github.com/coq/coq/pull/15657/checks?check_run_id=5661027689

Ali Caglayan (Mar 23 2022 at 13:32):

I believe neutral is better in this case
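
For reference, a minimal sketch of what a completed check run with a neutral conclusion could look like, assuming the GitHub REST endpoint `POST /repos/{owner}/{repo}/check-runs`. The helper name, the title and summary strings, and the placeholder commit SHA are hypothetical, and coqbot itself may build this differently (for instance through the GraphQL API). The sketch only builds the payload and sends nothing.

```python
import json

# Conclusions accepted by the GitHub Checks API for a completed check run.
VALID_CONCLUSIONS = {
    "action_required", "cancelled", "failure", "neutral",
    "success", "skipped", "stale", "timed_out",
}


def completed_check_run(name, head_sha, conclusion, title, summary):
    """Build the JSON body for a completed check run (hypothetical helper)."""
    if conclusion not in VALID_CONCLUSIONS:
        raise ValueError(f"unknown conclusion: {conclusion!r}")
    return {
        "name": name,
        "head_sha": head_sha,
        "status": "completed",
        "conclusion": conclusion,
        "output": {"title": title, "summary": summary},
    }


# Placeholder SHA and wording, for illustration only.
payload = completed_check_run(
    name="bench",
    head_sha="0" * 40,
    conclusion="neutral",
    title="Bench not started",
    summary="Comment `@coqbot bench` on the PR to start a bench.",
)
print(json.dumps(payload, indent=2))
```

Swapping `"neutral"` for `"action_required"` in the same payload is the variant that showed up as a failed check above.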

Gaëtan Gilbert (Mar 23 2022 at 13:36):

can't we have no check when the bench isn't started

Ali Caglayan (Mar 23 2022 at 13:42):

We can, but is it better to have less info?

Gaëtan Gilbert (Mar 23 2022 at 13:42):

yes

Gaëtan Gilbert (Mar 23 2022 at 13:42):

in this case "bench isn't started" isn't valuable enough to use a check field imo

Théo Zimmermann (Mar 23 2022 at 13:43):

Clearly, for experienced users that know about the command, it's better not to show anything.

Théo Zimmermann (Mar 23 2022 at 13:43):

To make the feature discoverable, we should make sure it is documented anywhere the bench is mentioned.

Gaëtan Gilbert (Mar 23 2022 at 13:44):

can we not make coqbot into some spammy bot? minimization spam is already pretty bad

Théo Zimmermann (Mar 23 2022 at 13:44):

And newcomers also learn by imitation (by seeing other people use the command).

Ali Caglayan (Mar 28 2022 at 20:31):

@coqbot bench should now post a summary of the bench in the PR like this: https://github.com/coq/coq/pull/14748#issuecomment-1081062517

Ali Caglayan (Mar 28 2022 at 20:32):

Let me know if you have any suggestions about how the output should be formatted.

Théo Zimmermann (Mar 29 2022 at 09:10):

Since the bench summary is also displayed in the Checks tab, I would instead suggest not using any folding in the comment: display the main table there, since it's what people usually look at first, and put a link from the comment to the check run summary. That would also provide a trace, so the links to these check runs would never be lost. I can give advice on how to achieve this.

Ali Caglayan (Mar 31 2022 at 16:55):

For the top speed-ups and slow-downs I chose 25 for both. It can be tweaked here:
https://github.com/coq/coq/blob/1f6328bba6596e2cfd35cee6a786feb795cc1adf/dev/bench/render_line_results.ml#L100
Is everybody happy with this? Anybody want more or less?

Pierre Roux (Mar 31 2022 at 17:21):

Sounds good (although considering it's folded by default in the result, it could maybe be a bit more without being too annoying)

Pierre-Marie Pédrot (Apr 01 2022 at 06:51):

@Ali Caglayan I think that we should consider speed-ups in terms of relative time diff, not absolute diff.

Ali Caglayan (Apr 01 2022 at 08:16):

@Pierre-Marie Pédrot How do we calculate that?

Gaëtan Gilbert (Apr 01 2022 at 08:17):

it's the %DIFF column

Ali Caglayan (Apr 01 2022 at 08:20):

Right, but that had a lot of garbage in it when I checked

Ali Caglayan (Apr 01 2022 at 08:20):

I can submit a PR that prints both to demonstrate if you like

Ali Caglayan (Apr 01 2022 at 08:21):

Anybody have a commit with some real speed ups or slow downs however?

Ali Caglayan (Apr 01 2022 at 08:22):

The main issue is that noise is not relative, so an extra 0.01 second can become a lot of pdiff for one line but not so much for another. Hence why I settled on absolute diff.
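
A small sketch of the effect, assuming pdiff is computed as `(new - old) / old * 100`. The timings are made-up numbers, not taken from a real bench.

```python
def pdiff(old: float, new: float) -> float:
    """Relative difference of `new` against `old`, in percent (assumed definition)."""
    return (new - old) / old * 100.0


# Hypothetical per-line timings in seconds: the same 0.01 s of noise
# lands on a fast line and on a slow line.
fast_old, fast_new = 0.02, 0.03
slow_old, slow_new = 5.00, 5.01

fast_pdiff = pdiff(fast_old, fast_new)
slow_pdiff = pdiff(slow_old, slow_new)

print(f"fast line: {fast_new - fast_old:+.2f} s, {fast_pdiff:+.1f}%")
print(f"slow line: {slow_new - slow_old:+.2f} s, {slow_pdiff:+.1f}%")
```

Both lines move by 0.01 s, but the fast one shows a pdiff of about 50% while the slow one shows about 0.2%, so a ranking by pdiff alone would be dominated by fast, noisy lines.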

Pierre-Marie Pédrot (Apr 01 2022 at 08:33):

I agree that for small durations it's just noise, but maybe we should consider relative diff above some threshold

Pierre-Marie Pédrot (Apr 01 2022 at 08:34):

but maybe it's not that useful anyways

Ali Caglayan (Apr 01 2022 at 10:40):

OK here is a PR: https://github.com/coq/coq/pull/15889

Ali Caglayan (Apr 01 2022 at 10:41):

When the bench finishes it will display pdiff tables in the log and also generate pdiff artifacts.

Ali Caglayan (Apr 01 2022 at 10:43):

I've manually triggered it again just to bench bignums

Ali Caglayan (Apr 01 2022 at 11:45):

@Pierre-Marie Pédrot Here are the pdiff tables: https://gitlab.com/coq/coq/-/jobs/2280435968

Ali Caglayan (Apr 01 2022 at 11:45):

I suppose I can try to filter by a minimum absolute difference.

Ali Caglayan (Apr 01 2022 at 11:46):

What would be a good value? 0.05?

Pierre-Marie Pédrot (Apr 01 2022 at 13:40):

maybe a bit higher, like 0.1
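
A rough sketch of that idea: drop lines whose absolute difference is below the threshold, then rank what remains by relative diff. The `LineTiming` type, the function name and the sample timings are hypothetical; the actual bench code is OCaml in `dev/bench/render_line_results.ml` and may do this differently.

```python
from typing import List, NamedTuple, Tuple


class LineTiming(NamedTuple):
    """Old and new timing for one source line, in seconds (hypothetical type)."""
    line: int
    old: float
    new: float

    @property
    def diff(self) -> float:
        return self.new - self.old

    @property
    def pdiff(self) -> float:
        return self.diff / self.old * 100.0


def top_relative_changes(
    timings: List[LineTiming], min_abs_diff: float = 0.1, count: int = 25
) -> Tuple[List[LineTiming], List[LineTiming]]:
    """Return (slow-downs, speed-ups) ranked by pdiff, ignoring small absolute diffs."""
    significant = [
        t for t in timings if t.old > 0 and abs(t.diff) >= min_abs_diff
    ]
    slow = sorted(
        (t for t in significant if t.diff > 0), key=lambda t: t.pdiff, reverse=True
    )[:count]
    fast = sorted((t for t in significant if t.diff < 0), key=lambda t: t.pdiff)[:count]
    return slow, fast


# Made-up timings: line 10 changes by only 0.01 s and is filtered out as noise.
timings = [
    LineTiming(10, 0.02, 0.03),
    LineTiming(20, 1.0, 1.5),
    LineTiming(30, 4.0, 3.0),
    LineTiming(40, 2.0, 2.4),
]
slow, fast = top_relative_changes(timings, min_abs_diff=0.1, count=25)
print("slow-downs:", [t.line for t in slow])
print("speed-ups:", [t.line for t in fast])
```

Here `count=25` mirrors the 25 chosen earlier for the tables, and `min_abs_diff=0.1` is the value suggested just above.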

Ali Caglayan (Apr 01 2022 at 13:51):

Alright, I've filtered those out, let's see how we do: https://gitlab.com/coq/coq/-/jobs/2281363213

Ali Caglayan (Apr 01 2022 at 13:51):

I've also made the artifacts in _build/timings .txt files so we should be able to view them without downloading

Ali Caglayan (Apr 01 2022 at 13:51):

(hopefully)

Ali Caglayan (Apr 01 2022 at 13:57):

Of course now coqbot doesn't know that they are txt files so it will not have a good time reporting this bench

Ali Caglayan (Apr 01 2022 at 13:59):

Let's try again, I screwed up which values I was filtering https://gitlab.com/coq/coq/-/jobs/2281425991

Pierre-Marie Pédrot (Apr 01 2022 at 14:22):

what's this?

Error while fetching checks for #15890 for running bench job: Got more than one checkSuite.

Pierre-Marie Pédrot (Apr 01 2022 at 14:22):

https://github.com/coq/coq/pull/15890

Gaëtan Gilbert (Apr 01 2022 at 14:23):

it can take a bit of time to push to gitlab / get the pipeline started, this error happens if you talk to the bot before that's done

Théo Zimmermann (Apr 01 2022 at 14:23):

We should handle this gracefully, for instance by retrying with a delay.
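
A minimal sketch of retrying with a delay, written in Python for illustration only. `PipelineNotReady`, `retry_with_delay` and `fetch_check_suite` are hypothetical names, not coqbot functions, and the attempt count and delay are arbitrary.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class PipelineNotReady(Exception):
    """Hypothetical error raised while the pipeline has not started yet."""


def retry_with_delay(
    action: Callable[[], T],
    attempts: int = 5,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action` until it succeeds, waiting `delay_seconds` between tries."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except PipelineNotReady:
            if attempt == attempts:
                raise  # out of attempts: surface the error to the caller
            sleep(delay_seconds)
    raise ValueError("attempts must be at least 1")


# Simulated lookup that fails twice before the pipeline exists.
calls: List[int] = []


def fetch_check_suite() -> str:
    calls.append(1)
    if len(calls) < 3:
        raise PipelineNotReady("pipeline not started yet")
    return "check-suite-id"


# Record the waits instead of really sleeping, so the demo runs instantly.
waits: List[float] = []
result = retry_with_delay(fetch_check_suite, sleep=waits.append)
print(result, len(calls), waits)
```

Passing `sleep` as a parameter keeps the example fast to run; a real bot would leave the default and actually wait between attempts.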

Last updated: Apr 19 2024 at 20:01 UTC