issue reporting statistics · Coq devs & plugin devs

Stream: Coq devs & plugin devs

Topic: issue reporting statistics

Jason Gross (Jul 23 2022 at 00:45):

I'm working on my presentation for the Coq bug minimizer, and figured some of you all might be amused by this plot I made: image.png

Karl Palmskog (Jul 24 2022 at 11:11):

hm, looks like bug reports were going down after a peak in 2018, but then 2020 broke the trend. I guess there can be many explanations for that

Théo Zimmermann (Jul 24 2022 at 20:33):

If you want to have statistically interesting results, you should remove Jason's reports from the data. Otherwise, you have too much that depends on the circumstances of a single individual. In my "Impact of switching bug trackers" paper, Jason's data was removed (as well as mine, but for different reasons).

Karl Palmskog (Jul 24 2022 at 20:44):

but what is really a "typical" Coq issue reporter? It seems most are generated by a small number of people with individual circumstances playing a big part

Jason Gross (Jul 24 2022 at 20:48):

Here's bug reports by person:
image.png

Jason Gross (Jul 24 2022 at 20:48):

And here's what you get if you exclude me:
image.png

Jason Gross (Jul 24 2022 at 20:50):

Median # bug reports is 1
Mean # bug reports is 9.2 if you include me, 7.2 if you don't

Jason Gross (Jul 24 2022 at 20:51):

If you exclude everyone with just 1 bug report, the median is 4; mean is 19.8 if you include me, 15.3 if you don't
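A minimal sketch of how figures like these could be computed, assuming the per-reporter counts are available as a dictionary. The function name `summarize`, its parameters, and the toy data are made up for illustration; they are not taken from the scraping scripts or the spreadsheet.

```python
from statistics import mean, median


def summarize(counts, exclude=(), min_reports=1):
    """Median and mean number of reports per reporter.

    counts      -- dict mapping reporter name to number of reports
    exclude     -- reporters to leave out (e.g. a single heavy reporter)
    min_reports -- keep only reporters with at least this many reports
    """
    kept = [
        n for who, n in counts.items()
        if who not in exclude and n >= min_reports
    ]
    return {"reporters": len(kept), "median": median(kept), "mean": mean(kept)}


# Hypothetical toy data, NOT the real Coq numbers.
counts = {"heavy": 40, "a": 5, "b": 3, "c": 1, "d": 1, "e": 1, "f": 1}

print(summarize(counts))                     # everyone
print(summarize(counts, exclude={"heavy"}))  # without the heaviest reporter
print(summarize(counts, min_reports=2))      # drop one-report reporters
```

Passing `min_reports=2` corresponds to "exclude everyone with just 1 bug report"; combining it with `exclude` gives the fourth variant.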

Karl Palmskog (Jul 24 2022 at 20:52):

OK, so I guess one might want some cluster/percentile analysis. How many reports do top-X (X = 10?) reporters constitute out of the whole

Gaëtan Gilbert (Jul 24 2022 at 20:56):

where did all the non-jim people who were on the first image go?

Gaëtan Gilbert (Jul 24 2022 at 20:58):

oh the x labels don't include all the points

Gaëtan Gilbert (Jul 24 2022 at 20:58):

that's confusing

Jason Gross (Jul 24 2022 at 20:59):

Yeah, I think Excel is bad at bar plots with 984 bins...

Karl Palmskog (Jul 24 2022 at 21:04):

ah, I guess I missed the opportunity for appearing in that chart by mostly working on ecosystem side...

But more seriously, I think the best approach these days instead of jumping to create an issue is to start talking about the problem on Zulip or similar, and then report at request of devs (who can often tell if it's novel or not)

Karl Palmskog (Jul 24 2022 at 21:05):

at least for regular Coq users

Jason Gross (Jul 24 2022 at 21:06):

Karl Palmskog said: "OK, so I guess one might want some cluster/percentile analysis. How many reports do top-X (X = 10?) reporters constitute out of the whole"

image.png
image.png
image.png

Karl Palmskog (Jul 24 2022 at 21:09):

ah, so if I'm reading it right, top 10 reporters reported about 40% of all issues

Karl Palmskog (Jul 24 2022 at 21:09):

intuitively, this doesn't seem so bad, I guess in small projects it's likely to be > 90% by top 10
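For anyone who wants to redo the top-X calculation outside a spreadsheet, here is a minimal Python sketch. It assumes the per-reporter counts are in a dictionary; `top_k_share` and the toy numbers are hypothetical and only illustrate the arithmetic, not the actual Coq data.

```python
def top_k_share(counts, k=10):
    """Fraction of all reports filed by the k most active reporters.

    counts -- dict mapping reporter name to number of reports
    """
    totals = sorted(counts.values(), reverse=True)
    total = sum(totals)
    if total == 0:
        return 0.0
    return sum(totals[:k]) / total


# Hypothetical toy data: three heavy reporters plus 40 one-off reporters.
counts = {"r1": 30, "r2": 20, "r3": 10}
counts.update({f"one_off_{i}": 1 for i in range(40)})

for k in (1, 3, 10):
    print(f"top {k}: {top_k_share(counts, k):.0%} of all reports")
```

Sweeping `k` from 1 up to the number of reporters gives the cumulative curve from which a figure such as "top 10 reported about 40%" can be read off.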

Jason Gross (Jul 24 2022 at 21:18):

In case anyone else is interested in playing with the data in Excel: Coq-Bug-Report-Plots.xlsx
And the scripts I used to scrape the data are in https://github.com/JasonGross/coq-bug-minimizer-paper/tree/main/presentation

Jason Gross (Jul 25 2022 at 15:19):

If you want to see it broken out by years:
image.png
Coq-Bug-Report-Plots.xlsx

Last updated: Apr 17 2024 at 02:02 UTC