Stream: Coq devs & plugin devs

Topic: issue reporting statistics


Jason Gross (Jul 23 2022 at 00:45):

I'm working on my presentation for the coq bug minimizer, and figured some of you all might be amused by this plot I made image.png

Karl Palmskog (Jul 24 2022 at 11:11):

hm, looks like bug reports were going down after a peak in 2018, but then 2020 broke the trend. I guess there can be many explanations for that

Théo Zimmermann (Jul 24 2022 at 20:33):

If you want to have statistically interesting results, you should remove Jason's reports from the data. Otherwise, you have too much that depends on the circumstances of a single individual. In my "Impact of switching bug trackers" paper, Jason's data was removed (as well as mine, but for different reasons).

Karl Palmskog (Jul 24 2022 at 20:44):

but what is really a "typical" Coq issue reporter? It seems most reports are generated by a small number of people, with individual circumstances playing a big part

Jason Gross (Jul 24 2022 at 20:48):

Here's bug reports by person:
image.png

Jason Gross (Jul 24 2022 at 20:48):

And here's what you get if you exclude me:
image.png

Jason Gross (Jul 24 2022 at 20:50):

Median # bug reports is 1
Mean # bug reports is 9.2 if you include me, 7.2 if you don't

Jason Gross (Jul 24 2022 at 20:51):

If you exclude everyone with just 1 bug report, the median is 4; mean is 19.8 if you include me, 15.3 if you don't
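For anyone who wants to redo these numbers outside Excel, here is a minimal Python sketch of the same kind of summary: median and mean reports per reporter, optionally dropping one reporter or everyone below a threshold. The function name `report_stats` and the toy data are made up for illustration; this is not the script from the thread.

```python
from collections import Counter
from statistics import mean, median


def report_stats(reporters, exclude=(), min_reports=1):
    """Return (median, mean) of the number of reports per reporter.

    reporters:   iterable with one reporter name per bug report.
    exclude:     names to drop before counting (e.g. one prolific reporter).
    min_reports: keep only reporters with at least this many reports.
    """
    counts = Counter(name for name in reporters if name not in exclude)
    per_person = [n for n in counts.values() if n >= min_reports]
    return median(per_person), mean(per_person)


# Made-up toy data (hypothetical names), NOT the real Coq numbers.
toy = ["alice"] * 12 + ["bob"] * 4 + ["carol"] * 2 + ["dave", "erin", "frank"]

print(report_stats(toy))                     # everyone
print(report_stats(toy, exclude={"alice"}))  # without the top reporter
print(report_stats(toy, min_reports=2))      # ignoring one-off reporters
```

As in the real data, the median barely moves when the top reporter is removed, while the mean shifts noticeably.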

Karl Palmskog (Jul 24 2022 at 20:52):

OK, so I guess one might want some cluster/percentile analysis. How many reports do top-X (X = 10?) reporters constitute out of the whole?

Gaëtan Gilbert (Jul 24 2022 at 20:56):

where did all the non-jim people who were on the first image go?

Gaëtan Gilbert (Jul 24 2022 at 20:58):

oh the x labels don't include all the points

Gaëtan Gilbert (Jul 24 2022 at 20:58):

that's confusing

Jason Gross (Jul 24 2022 at 20:59):

Yeah, I think Excel is bad at bar plots with 984 bins...

Karl Palmskog (Jul 24 2022 at 21:04):

ah, I guess I missed the opportunity to appear in that chart by mostly working on the ecosystem side...

But more seriously, I think the best approach these days, instead of jumping to create an issue, is to start talking about the problem on Zulip or similar, and then report at the request of devs (who can often tell if it's novel or not)

Karl Palmskog (Jul 24 2022 at 21:05):

at least for regular Coq users

Jason Gross (Jul 24 2022 at 21:06):

Karl Palmskog said: "OK, so I guess one might want some cluster/percentile analysis. How many reports do top-X (X = 10?) reporters constitute out of the whole?"

image.png
image.png
image.png

Karl Palmskog (Jul 24 2022 at 21:09):

ah, so if I'm reading it right, top 10 reporters reported about 40% of all issues

Karl Palmskog (Jul 24 2022 at 21:09):

intuitively, this doesn't seem so bad, I guess in small projects it's likely to be > 90% by top 10
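The "top-X share" being read off the plot here can also be computed directly. A rough Python sketch, assuming the data is just a list with one reporter name per report; `top_share` and the toy names are hypothetical, not taken from the linked scripts:

```python
from collections import Counter


def top_share(reporters, top_n=10):
    """Fraction of all reports filed by the top_n most prolific reporters."""
    counts = Counter(reporters)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no reports at all, so no meaningful share
    top = sum(n for _, n in counts.most_common(top_n))
    return top / total


# Made-up toy data (hypothetical names), NOT the real Coq numbers.
toy = ["alice"] * 12 + ["bob"] * 4 + ["carol"] * 2 + ["dave", "erin", "frank"]

for x in (1, 2, 10):
    print(f"top {x}: {top_share(toy, x):.0%} of all reports")
```

Sweeping `top_n` from 1 up to the number of reporters gives the cumulative curve; a value around 0.4 at `top_n=10` would match the reading above.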

Jason Gross (Jul 24 2022 at 21:18):

In case anyone else is interested in playing with the data in Excel: Coq-Bug-Report-Plots.xlsx
And the scripts I used to scrape the data are in https://github.com/JasonGross/coq-bug-minimizer-paper/tree/main/presentation

Jason Gross (Jul 25 2022 at 15:19):

If you want to see it broken out by years:
image.png
Coq-Bug-Report-Plots.xlsx
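
A per-year breakdown like this one can be sketched in a few lines of Python. The input shape is an assumption: `(reporter, ISO date string)` pairs, which is not necessarily how the scraped data in the repository is laid out. `reports_per_year` and the toy records are made up for illustration.

```python
from collections import Counter
from datetime import date


def reports_per_year(records, exclude=()):
    """Count reports per calendar year.

    records: iterable of (reporter, "YYYY-MM-DD") pairs (assumed format).
    exclude: reporter names to leave out of the counts.
    """
    per_year = Counter()
    for reporter, created in records:
        if reporter in exclude:
            continue
        per_year[date.fromisoformat(created).year] += 1
    return dict(sorted(per_year.items()))


# Made-up toy records (hypothetical names and dates), NOT the real Coq data.
toy_records = [
    ("alice", "2018-03-01"),
    ("alice", "2018-07-15"),
    ("bob", "2018-11-30"),
    ("alice", "2019-02-02"),
    ("carol", "2020-05-20"),
    ("alice", "2020-06-01"),
    ("bob", "2020-09-09"),
]

print(reports_per_year(toy_records))
print(reports_per_year(toy_records, exclude={"alice"}))
```

Running it with and without `exclude` shows how much of a yearly trend depends on a single reporter.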


Last updated: Feb 05 2023 at 21:03 UTC