how are you getting those numbers? EDF tells me "1 kWh = 0.057 kg CO₂", i.e. 1000 times less than your estimate
and 1 kWh / month seems like a lot less than one household
(1 kWh / month = 1000 Wh / 730 h ≈ 1.4 W if I'm not mistaken)
Yes, even coal is around 1000 g of CO2 per kWh, so the 4 tons above are probably more like 5 kg. I agree with Théo that the hardware is probably most of the footprint. To get a very rough order of magnitude, if we consider 4 servers at about 2 tons of CO2 each, replaced every five years, that's about 2 tons of CO2 per year. Far from negligible, but probably small in comparison to the team's plane travel to conferences.
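To make the arithmetic explicit (a rough sketch; the ~2 t CO2e embodied per server and the 5-year replacement cycle are the assumptions above, not measured figures):

```python
# Back-of-envelope server hardware footprint, using the assumed figures above.
servers = 4
embodied_kg_co2e = 2000   # assumed: ~2 t CO2e embodied per server
lifetime_years = 5        # assumed: replaced every five years

print(servers * embodied_kg_co2e / lifetime_years)  # -> 1600 kg/year, i.e. ~2 t/year
```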
800k CI min ≈ 13.33k CI hours
One laptop is probably more powerful than a VM and (I would assume) less efficient.
I didn't find easily accessible info to estimate the electricity consumption of a VM, so let's assume that we are using a laptop at its maximum power, i.e. about 200 W, hence about 200 Wh in one hour, and thus 2,667 kWh for 800k CI minutes. This is probably a large overestimate, but still somewhat consistent with the 100 kWh (maybe 10x more) that Guillaume estimated.
The average French household uses 400-500 kWh of electricity per month, but it should be noted that this does not represent their total energy consumption (actually, most of the emissions of French households are probably outside the electricity realm). In any case, even with the 2,667 kWh/month figure, this would amount to the electricity consumption of about 5-7 households, not 100 (and less than one household with the 100 kWh figure).
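Putting the whole chain together (a sketch; the 200 W laptop proxy and the 400-500 kWh/month household figure are the assumptions above):

```python
# CI electricity estimate from the assumptions discussed above.
ci_minutes = 800_000
ci_hours = ci_minutes / 60      # ~13,333 h
power_kw = 0.2                  # assumed: a laptop at full power, 200 W

kwh = ci_hours * power_kw       # ~2,667 kWh
household_kwh_month = 450       # midpoint of the 400-500 kWh/month figure
print(f"{kwh:.0f} kWh/month ≈ {kwh / household_kwh_month:.1f} households")
```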
There is also the question of how carbon-neutral the electricity production is. In France, it is pretty carbon-neutral, as emphasized by the figure provided by Gaëtan. When we were using GitLab shared runners, we were on Google's compute. Google and the other GAFAM aim at 100% renewable electricity for their datacenters; maybe Google is already there.
However, the electricity consumption of the servers themselves is really just part of the picture. There is obviously also the energy to cool them down, but this can also run on "clean" electricity. But every time I have looked at computing impact figures, what stands out is that hardware production is the biggest share, and it is not accounted for in this calculation.
14 messages were moved here from #Coq devs & plugin devs > runners by Théo Zimmermann.
Pierre Roux said:
Yes, even coal is around 1000 g of CO2 per kWh, so the 4 tons above are probably more like 5 kg. I agree with Théo that the hardware is probably most of the footprint. To get a very rough order of magnitude, if we consider 4 servers at about 2 tons of CO2 each, replaced every five years, that's about 2 tons of CO2 per year. Far from negligible, but probably small in comparison to the team's plane travel to conferences.
Thanks for providing the estimate of the hardware impact. The issue, though, is that 4 servers is probably not enough, even in the current light/full CI setup that we have today, and we certainly used more (at least during peaks) when we were on GitLab shared runners.
Gaëtan Gilbert said:
how are you getting those numbers? EDF tells me "1 kWh = 0.057 kg CO₂", i.e. 1000 times less than your estimate
and 1 kWh / month seems like a lot less than one household
Keep in mind that a lot of countries (including the ones running AWS or Google servers) are not using nuclear power but coal or gas. For example, RWE (Germany) is closer to 1 kg CO2 per kWh.
Unfortunately, the only reason why we could get figures such as the 800k CI minutes is that GitLab decided to start charging for them. And we do not have any overview of what our usage is with our custom runners.
About carbon intensity of electricity per country: https://ourworldindata.org/grapher/carbon-intensity-electricity
And if you want to look at real-time data (arguably less useful): https://app.electricitymaps.com/
Filecoin Green has been working on such energy metrics: https://green.filecoin.io/
I'm not sure whether they have a direct answer, but they've certainly been doing similar things.
This might be of interest: https://infra.ocaml.org/2023/05/30/emissions-monitoring.html
Let's also not forget that there are a lot of opportunities to make our CI setup more incremental; in some cases that could have a large impact (cf. https://github.com/coq/coq/issues/16201 for example)
I am pretty unclear about how much we should expect to save with such schemes
Indeed, we haven't sorted out the data to see how much we can save on average
basically, for any PR yielding no changes to coqc
you cache the whole build
then you could have an optimistic mode
so in theory you can save a lot, but it requires first understanding where to optimize the incremental build
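A minimal sketch of what such a cache key could look like, assuming the coqc-relevant paths can be enumerated (the path list and function name below are purely illustrative, not Coq's actual CI logic):

```python
# Hypothetical sketch: key the CI cache on the git tree hashes of the paths
# that can affect the coqc binary, so a PR touching nothing in them (e.g.
# docs-only changes) reuses the whole cached build.
import hashlib
import subprocess

def build_cache_key(paths=("kernel", "lib", "dune-project")):  # illustrative paths
    tree_hashes = subprocess.check_output(
        ["git", "rev-parse", *[f"HEAD:{p}" for p in paths]], text=True
    )
    return hashlib.sha256(tree_hashes.encode()).hexdigest()[:16]
```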
Emilio Jesús Gallego Arias said:
basically, for any PR yielding no changes to coqc
you cache the whole build
that won't happen often though
I don't know how often it does happen; the cache can be tuned in different ways, so you can ignore certain coqc changes in draft runs.
The fast mode doesn't provide an asymptotic speedup, but in terms of fixed costs it can save a lot.
This would help for changes to only docs and stdlib; do you have stats on how many of those there are? It'd also let you reuse CI build artifacts locally under the right conditions
For the latter, you'd need a local copy or mount of the cache (several GBs to rsync) and to run coqc inside a container; my colleagues tested that with Proof General and VsCoq
FWIW, this would also be achievable with a smart Nix-based CI.
I would expect cooling and water consumption to also have significant impact.
@Théo Zimmermann I thought Nix can only reuse entire packages, or did that change? OTOH, that'd be enough when coqc doesn't change, and IIUC today Nix has more tooling for distributed caching
@Paolo Giarrusso Indeed, I meant caching at the package granularity. And we would need to be smart so that the source of each Coq package (coq-core / coq-stdlib) is distinguished and a change in coq-stdlib does not trigger a coq-core rebuild.
Maxime Dénès said:
I would expect cooling and water consumption to also have significant impact.
I don't know about the water consumption. For cooling, assuming the cooling system has a coefficient of performance of 3, you'd have to add about 30% to the energy consumption of the computers themselves. Probably the most impact is again in the hardware: what's the manufacturing impact of the cooling system, and how often is it replaced?
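In numbers (a sketch under the COP-of-3 assumption above, applied to the earlier CI electricity estimate):

```python
# Cooling overhead under the assumed coefficient of performance (COP).
it_kwh = 2667       # the monthly CI electricity estimate from above
cop = 3             # assumed: 3 kWh of heat removed per kWh of cooling electricity
cooling_kwh = it_kwh / cop
print(f"+{cooling_kwh / it_kwh:.0%}")  # -> +33%, i.e. roughly the 30% above
```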
My 2p... For computers, manufacturing indeed seems to be, in general, of the same order of magnitude as usage regarding CO2e footprint. For instance, I don't know how much this site can be trusted, but https://boavizta.org/en/blog/empreinte-de-la-fabrication-d-un-serveur says that for the French electricity mix, the average split between manufacturing and use is estimated at 45%/55% (the former being itself typically estimated at 400-700 kg CO2e per desktop).
It is not easy to find clear figures about electricity consumption in idle vs. full-activity mode: the two do not seem to differ that much, so the number of servers seems indeed to be a more important criterion than the exact electricity consumption.
When electricity from renewables is used, that does not mean that it has no CO2e impact at the global level: demand for "green" electricity reduces the amount of green electricity available to substitute for fossil-based electricity for clients who have not explicitly contracted for green electricity (a few months ago, this surrealistic ad from Riot Platforms, a bitcoin miner, was quite popular).
A Paris-Nice round-trip flight is estimated to emit ~340 kg CO2e in one shot, that is about a sixth of the maximum 2 tons of CO2e that the earth is currently able to absorb yearly per person. A Paris-Boston round trip is estimated at ~2 tons, that is, in one shot, the annual CO2e budget of the person who is travelling. That is, from the Coq community's point of view, avoiding planes as much as possible for meetings is the NUMBER ONE PRIORITY.
As computable from https://infra.ocaml.org/2023/05/30/emissions-monitoring.html, the order of magnitude is the following: one Paris-Nice round trip amounts to about 2 years (and a Paris-Boston to about 10 years) of the yearly emissions (190 kg) of one of the servers used by OCaml.
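The ratios behind those orders of magnitude, using the figures quoted above:

```python
# Flight emissions vs. the yearly footprint of one OCaml infra server.
paris_nice_kg = 340     # round trip, figure quoted above
paris_boston_kg = 2000  # round trip, figure quoted above
server_year_kg = 190    # yearly emissions of one server, from the OCaml post

print(paris_nice_kg / server_year_kg)    # ~1.8 -> about 2 server-years
print(paris_boston_kg / server_year_kg)  # ~10.5 -> about 10 server-years
```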
The idle ≈ full part surprised me; 5x usage differences are possible even for servers (https://www.reddit.com/r/homelab/comments/oam7bs/comment/h3id1sg/?utm_source=reddit&utm_medium=web2x&context=3), but most people seem to report well over 100 W idle usage (https://www.reddit.com/r/homelab/comments/xr4i3m/power_consumption_for_2nd3rd_gen_xeon_scalable/)
5x usage differences are possible even for servers
Thanks. I find it hard to find clear statistics in general. Your links are interesting.
my workstation (16-core AMD Ryzen 9) uses about 60-70 W when idle, according to my UPS
for 100 W idle, you might be talking about some serious server-grade CPUs, 64 cores and the like
I'm not an expert and I've also found little, so I have no clear conclusion beyond asking your sysadmins, since there's a 10x range between idle powers (other links went as low as 20 W). CPUs didn't even seem to be the major problem: disk arrays (5 W idle per disk means 50 W for ten disks, unless you spin down the disks, which might be a bad idea), peripherals not designed for idling preventing deeper sleep states, etc.
for compute servers, it seems strange to use anything but SSDs. Either that or you have some local file server.
The question of the energy cost of CI was raised again yesterday at a meeting involving the climate group and developers at Inria Paris. For instance, there seem to be plans (from Kim there) to monitor CI consumption.
On our side, I appreciate the manual triggering of the full CI.
Now that (almost) everything runs on the Inria GitLab, do we have a clearer idea of the amount of hardware used by the CI?
It should help with making an analysis in any case!
And BTW, if for some reason we think that there would be benefits to having more of our CI on GitLab Inria (e.g., for environmental reasons), then we could migrate the GitHub Actions for Windows and macOS testing as well (the custom runner infra does provide Win and macOS runners).
@Pierre Roux I think the answer is yes, or at least, yes in theory, and that's what they want to investigate.
@Théo Zimmermann: Maxime and Thierry are already in contact, IIUC, so I'd say it is on their side to say.
Quick talk with Thierry this morning. Another way to grasp the order of magnitude corresponding to 800k CI min / month is, IIUC, that a full CI run is equivalent to 32 cores running for ~4 h each, that is more than 5 days of computation on a single core, which looks quite impressive.
Maybe worth thinking twice about it.
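Spelled out (using the 32 cores × ~4 h figure above):

```python
# One full CI run expressed in core time.
cores = 32
hours_each = 4
core_hours = cores * hours_each   # 128 core-hours
print(core_hours / 24)            # ~5.3 -> more than 5 days on a single core
```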
Hugo, power requirements are far from scaling linearly with core count, in the sense that higher core density per die usually implies better efficiency
right?
But indeed, if we put some resources into it, we could reduce our CI time by quite a lot IMHO
Given how many cores modern servers have, I would really like to see:
So I guess it is up to the team to decide how high a priority it is to reduce CI minutes while preserving the checks; I can see many ideas here.
Also, 800k CI minutes seems like a lot; that's our peak, but indeed, where can we get more complete stats?
I dunno if we average that