Stream: Coq devs & plugin devs

Topic: `@coqbot bench`


view this post on Zulip Ali Caglayan (Mar 21 2022 at 18:28):

As of writing, you should be able to do @coqbot bench, please try it out and let me know how you found it.

view this post on Zulip Ali Caglayan (Mar 21 2022 at 18:32):

It just starts the bench for the moment. On my todo list are:

view this post on Zulip Ali Caglayan (Mar 22 2022 at 00:56):

When you start a bench, the job should appear in the checks tab at the bottom. The link there will take you straight to the bench tab.

view this post on Zulip Ali Caglayan (Mar 23 2022 at 09:42):

As of yesterday you should be able to do:

Hopefully soon we can get coqbot to post a comment too, but I need to work out a few more things before this is possible.

view this post on Zulip Ali Caglayan (Mar 23 2022 at 10:15):

Here is what this looks like at the moment: https://github.com/coq/coq/pull/14748/checks?check_run_id=5659421664

view this post on Zulip Pierre-Marie Pédrot (Mar 23 2022 at 11:59):

Is it normal that github displays the bench as waiting as long as it has not been launched?

view this post on Zulip Pierre-Marie Pédrot (Mar 23 2022 at 11:59):

it's annoying for most PRs since we don't want to run a bench in general

view this post on Zulip Pierre-Marie Pédrot (Mar 23 2022 at 12:00):

(this also mean we can't have a green CI anymore without running the bench)

view this post on Zulip Ali Caglayan (Mar 23 2022 at 12:40):

No that shouldn't happen, I will look into it

view this post on Zulip Ali Caglayan (Mar 23 2022 at 12:41):

@Pierre-Marie Pédrot Have you got an example that you saw?

view this post on Zulip Ali Caglayan (Mar 23 2022 at 12:42):

I see that GitHub displays the bench as waiting, but it doesn't change the overall tick mark for me

view this post on Zulip Pierre-Marie Pédrot (Mar 23 2022 at 12:43):

https://github.com/coq/coq/pull/15849

view this post on Zulip Pierre-Marie Pédrot (Mar 23 2022 at 12:43):

it's yellow in the list of checks so it can't be green

view this post on Zulip Pierre-Marie Pédrot (Mar 23 2022 at 12:43):

(the green tick is about rebasability, not the CI)

view this post on Zulip Ali Caglayan (Mar 23 2022 at 12:45):

I can completely remove it from the checks tab or I can give it a neutral status

view this post on Zulip Ali Caglayan (Mar 23 2022 at 12:46):

I think it would still be useful to keep in the checks tab

view this post on Zulip Pierre-Marie Pédrot (Mar 23 2022 at 12:46):

neutral is fine I think

view this post on Zulip Pierre-Marie Pédrot (Mar 23 2022 at 12:46):

(as long as it doesn't appear as waiting somehow)

view this post on Zulip Ali Caglayan (Mar 23 2022 at 12:51):

Alright, I've changed it to be neutral, that should appear in around 5 min

view this post on Zulip Théo Zimmermann (Mar 23 2022 at 12:53):

As I've commented on one of your PRs (but you seem to have missed), you could also try using the action_required status (that we've never used so far, so I don't know how it looks).

view this post on Zulip Théo Zimmermann (Mar 23 2022 at 12:54):

The main issue that I can see with neutral is that so far it has been used only for failures in allow failure mode, and if we start putting it on every PRs, people may stop noticing such failures.

view this post on Zulip Ali Caglayan (Mar 23 2022 at 13:30):

Hmm action_required was way more aggressive than I thought.

view this post on Zulip Ali Caglayan (Mar 23 2022 at 13:32):

It straight up fails the check

view this post on Zulip Ali Caglayan (Mar 23 2022 at 13:32):

https://github.com/coq/coq/pull/15657/checks?check_run_id=5661027689

view this post on Zulip Ali Caglayan (Mar 23 2022 at 13:32):

I believe neutral is better in this case

view this post on Zulip Gaëtan Gilbert (Mar 23 2022 at 13:36):

can't we have no check when the bench isn't started

view this post on Zulip Ali Caglayan (Mar 23 2022 at 13:42):

We can, but is it better to have less info?

view this post on Zulip Gaëtan Gilbert (Mar 23 2022 at 13:42):

yes

view this post on Zulip Gaëtan Gilbert (Mar 23 2022 at 13:42):

in this case "bench isn't started" isn't valuable enough to use a check field imo

view this post on Zulip Théo Zimmermann (Mar 23 2022 at 13:43):

Clearly, for experienced users that know about the command, it's better not to show anything.

view this post on Zulip Théo Zimmermann (Mar 23 2022 at 13:43):

To make the feature discoverable, we should make sure it is documented anywhere the bench is mentioned.

view this post on Zulip Gaëtan Gilbert (Mar 23 2022 at 13:44):

can we not make coqbot into some spammy bot? minimization spam is already pretty bad

view this post on Zulip Théo Zimmermann (Mar 23 2022 at 13:44):

And newcomers also learn by imitation (by seeing other people use the command).

view this post on Zulip Ali Caglayan (Mar 28 2022 at 20:31):

@coqbot bench should now post a summary of the bench in the PR like this: https://github.com/coq/coq/pull/14748#issuecomment-1081062517

view this post on Zulip Ali Caglayan (Mar 28 2022 at 20:32):

Let me know if you have any suggestions about how the output should be formatted.

view this post on Zulip Théo Zimmermann (Mar 29 2022 at 09:10):

Since the bench summary is also displayed in the Checks tab, I would instead suggest to not use any folding in the comment: to display the main table there since it's what people usually look at first, and to put a link from the comment to the check run summary. It would also provide a trace that would mean never losing the links to these check runs. I can give advice on how to achieve this.

view this post on Zulip Ali Caglayan (Mar 31 2022 at 16:55):

For the top speed ups/ slow downs I chose 25 for both. It can be tweaked here:
https://github.com/coq/coq/blob/1f6328bba6596e2cfd35cee6a786feb795cc1adf/dev/bench/render_line_results.ml#L100
Is everybody happy with this? Anybody want more or less?

view this post on Zulip Pierre Roux (Mar 31 2022 at 17:21):

Sounds good (although considring its folded by default in the result, it could maybe be a bit more without being too annoying)

view this post on Zulip Pierre-Marie Pédrot (Apr 01 2022 at 06:51):

@Ali Caglayan I think that we should consider speed-ups in terms of relative time diff, not absolute diff.

view this post on Zulip Ali Caglayan (Apr 01 2022 at 08:16):

@Pierre-Marie Pédrot How do we calculate that?

view this post on Zulip Gaëtan Gilbert (Apr 01 2022 at 08:17):

it's the %DIFF column

view this post on Zulip Ali Caglayan (Apr 01 2022 at 08:20):

Right, but that can have a lot of garbage when I checked it

view this post on Zulip Ali Caglayan (Apr 01 2022 at 08:20):

I can submit a PR that prints both to demonstrate if you like

view this post on Zulip Ali Caglayan (Apr 01 2022 at 08:21):

Anybody have a commit with some real speed ups or slow downs however?

view this post on Zulip Ali Caglayan (Apr 01 2022 at 08:22):

The main issue is that noise is not relative, so an extra 0.01 second can become a lot of pdiff for one line but not so much for another. Hence why I settled on absolute diff.

view this post on Zulip Pierre-Marie Pédrot (Apr 01 2022 at 08:33):

I agree that for small durations it's just noise, but maybe we should consider relative diff above some threshold

view this post on Zulip Pierre-Marie Pédrot (Apr 01 2022 at 08:34):

but maybe it's not that useful anyways

view this post on Zulip Ali Caglayan (Apr 01 2022 at 10:40):

OK here is a PR: https://github.com/coq/coq/pull/15889

view this post on Zulip Ali Caglayan (Apr 01 2022 at 10:41):

When the bench finishes it will display pdiff tables in the log and also generate pdiff artifacts.

view this post on Zulip Ali Caglayan (Apr 01 2022 at 10:43):

I've manually triggered it again just to bench bignums

view this post on Zulip Ali Caglayan (Apr 01 2022 at 11:45):

@Pierre-Marie Pédrot Here are the pdiff tables: https://gitlab.com/coq/coq/-/jobs/2280435968

view this post on Zulip Ali Caglayan (Apr 01 2022 at 11:45):

I suppose I can try to filter by a minimum absolute difference.

view this post on Zulip Ali Caglayan (Apr 01 2022 at 11:46):

What would be a good value? 0.05?

view this post on Zulip Pierre-Marie Pédrot (Apr 01 2022 at 13:40):

maybe a bit higher, like 0.1

view this post on Zulip Ali Caglayan (Apr 01 2022 at 13:51):

Alright, I've filtered those out lets see how we do: https://gitlab.com/coq/coq/-/jobs/2281363213

view this post on Zulip Ali Caglayan (Apr 01 2022 at 13:51):

I've also made the artifacts in _build/timings .txt files so we should be able to view them without downloading

view this post on Zulip Ali Caglayan (Apr 01 2022 at 13:51):

(hopefully)

view this post on Zulip Ali Caglayan (Apr 01 2022 at 13:57):

Of course now coqbot doesn't know that they are txt files so it will not have a good time reporting this bench

view this post on Zulip Ali Caglayan (Apr 01 2022 at 13:59):

Let's try again, I screwed up which values I was filtering https://gitlab.com/coq/coq/-/jobs/2281425991

view this post on Zulip Pierre-Marie Pédrot (Apr 01 2022 at 14:22):

what's this?

Error while fetching checks for #15890 for running bench job: Got more than one checkSuite.

view this post on Zulip Pierre-Marie Pédrot (Apr 01 2022 at 14:22):

https://github.com/coq/coq/pull/15890

view this post on Zulip Gaëtan Gilbert (Apr 01 2022 at 14:23):

it can take a bit of time to push to gitlab / get the pipeline started, this error happens if you talk to the bot before that's done

view this post on Zulip Théo Zimmermann (Apr 01 2022 at 14:23):

We should handle this gracefully, for instance by retrying with a delay.


Last updated: Feb 05 2023 at 21:03 UTC