Stream: Coq devs & plugin devs

Topic: Extended CI Runners


view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 19:26):

This could be of our interest, please feel free to open an issue to discuss more, in particular the large runners could speed up our CI by quite a bit:

We are pleased to announce that shared docker runners are now available
for gitlab.inria.fr!

Thanks to everyone who helped testing the service during the (very
long!) beta period. You can right now enable the shared runners in the
CI/CD settings of your Gitlab projects.

See the shared runners documentation for more details.
https://ci.inria.fr/doc/page/gitlab/

The three following docker runners are available:

The runners will handle jobs declared in the .gitlab-ci.yml file of the
project repository, at the condition that the jobs have at least one tag
among ci.inria.fr, linux and either small, medium or large (these three
last tags are exclusive: they cannot be used together for the same job
since no runner will match more than one of them).

We would like to send a big thank you to Anthony Baire for all his
efforts in developping this service, and we would like to thank also the
CI infrastructure team of the Inria DSI who helped in deploying the
platform.

Thank you very much for using ci.inria.fr, and happy hacking!

view this post on Zulip Gaëtan Gilbert (Mar 10 2022 at 19:36):

we're not on inria's gitlab though

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 19:53):

@Gaëtan Gilbert I was thinking we could also mirror there and try?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 19:54):

There's already a mirror actually

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 19:54):

https://gitlab.inria.fr/coq

view this post on Zulip Gaëtan Gilbert (Mar 10 2022 at 19:59):

seems like it would be easier to add some VMs in https://ci.inria.fr/project/coq/slaves/new
xxLarge says 4x2 GHz + 12 GB RAM

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:00):

That's almost the same than the runners right?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:00):

gitlab stuff is free of maintenance I understand

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:00):

but we are mainly cpu bound, so if we got 4 cpu in both

view this post on Zulip Gaëtan Gilbert (Mar 10 2022 at 20:01):

parallel jobs can eat a lot of memory too

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:01):

Problem with the ones you point out is that they need maintenance right?

view this post on Zulip Gaëtan Gilbert (Mar 10 2022 at 20:01):

nothing is free of maintenance

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:01):

gitlab runners are, we just need to push by coqbot

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:01):

what more maintenance they need?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:02):

the instances you point out need to be spawned manually

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:02):

we are using 2 x 4Gb

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:02):

so 4 x 8Gb should work

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:02):

Gaëtan Gilbert said:

nothing is free of maintenance

yes, but some things are harder to maintain that others

view this post on Zulip Gaëtan Gilbert (Mar 10 2022 at 20:02):

te mirroring system needs maintenance
I don't want to have CI in more places than needed

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:03):

yes it does, but we already have to maintain it

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:03):

don't we?

view this post on Zulip Gaëtan Gilbert (Mar 10 2022 at 20:03):

can coqbot push to 2 gitlabs?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:03):

what's the difference between pushing to gitlab.com vs inria.gitlab.com

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:03):

sorry gitlab.inria.fr

view this post on Zulip Gaëtan Gilbert (Mar 10 2022 at 20:03):

or do you want to move completely to inria gitlab?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:03):

you are right if gitlab.inria.fr does work faster

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:03):

we should not have 2

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:04):

I don't see a point having 2

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:04):

@Théo Zimmermann is the one that knows if pushing to 2 places is easy or hard

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:04):

but a possible 2x speedup could be nice

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:04):

but maybe not worth it

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:04):

personally I'm not sure, just copying the mail

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:04):

to see what you folks think

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:05):

actually I was thinking that we could indeed then disable rocqueabalu

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:05):

so gitlab.inria.fr would be _less_ maintencance

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:05):

because our runners have certainly made us trouble

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:05):

and for sure they need us keeping them updated

view this post on Zulip Gaëtan Gilbert (Mar 10 2022 at 20:05):

isn't it already disabled?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:06):

we still use the other

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:06):

don't we

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:06):

I have no idea, so that's why I'm interested in the Inria offering, so I don't have to worry about the custom runners anymore

view this post on Zulip Gaëtan Gilbert (Mar 10 2022 at 20:06):

https://gitlab.com/coq/coq/-/settings/ci_cd#js-runners-settings only shows the bench runners and the windows runners (and we don't have windows jobs on gitlab anymore)

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:07):

we have these ones ci-coq-04.sophia.inria.fr

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:07):

I don't know what these are

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:07):

yes we should cleanup

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:08):

that's the bench runners?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:08):

ci-coq-04.sophia.inria.fr ?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:08):

oh so we are fully running on gitlab's runners, that's a lot of cpu we are taking XD

view this post on Zulip Gaëtan Gilbert (Mar 10 2022 at 20:09):

yup

view this post on Zulip Maxime Dénès (Mar 10 2022 at 20:50):

If we need more runners, we can also consider spawning VMs on demand with a cloud provider

view this post on Zulip Maxime Dénès (Mar 10 2022 at 20:51):

it has become easier for public institutions like Inria to order from cloud providers

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:52):

I think our best case would be for that to happen transparently

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:52):

@Maxime Dénès

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:52):

so we can just push to gitlab.inria.fr

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:52):

and they have enough runners for us, so we don't care

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:52):

and if they need more, it is handled centrally at the gitlab.inria.fr level

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:53):

Something that gitlab.inria.fr could do very well for us is to have a good, fast, and large cache

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:53):

so our CI jobs can access to it

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 20:53):

once we can move to a build system that implements caching

view this post on Zulip Gaëtan Gilbert (Mar 10 2022 at 20:55):

cache what?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:14):

build results

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:15):

dune cache for example

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:15):

that could be a way to help with PRs that for example only change the stdlib

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:15):

or the documentation

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:15):

what would be cache hit/miss ratio would be a cool thing to check for a paper

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:16):

imagine we put the whole CI under dune [including .vo files]

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:38):

I'm a bit lost TBH

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:39):

@Maxime Dénès this thread is about moving from gitlab.com runners to gitlab.inria.fr runners

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:40):

it could make sense for:

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:40):

also, last point is "does gitlab.inria.fr offer more advantages"

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:40):

and I was thinking that maybe Inria runners could indeed have much faster artifact access

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:41):

which could be useful for example to share a build cache

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:41):

if we use a system like dune for all CI

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:41):

we don't need anymore to pass the build artifacts from one job to another

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:41):

but we can just mount dune's cache

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:41):

generically, but that's a quite speculative thing

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:42):

@Maxime Dénès does it make sense?

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:45):

I don't see why these runners would have faster access to artifacts

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:45):

Don't get me wrong, I use them for projects hosted on the gitlab Inria, and they are very useful

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:46):

But for Coq, I would have thought if we need more computing power, it is easier to add runners to our current gitlab instance rather than switching

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:46):

Don't forget that it is a community edition instance, so you typically don't get more things than gitlab.com

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:47):

@Maxime Dénès how it is easier to add runners to the current gitlab instance

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:47):

than to use the default ones at gitlab.inria ?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:47):

aren't the runners at gitlab.inria already there?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:48):

vs the first case we need to configure and take care of them?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:48):

Maxime Dénès said:

I don't see why these runners would have faster access to artifacts

I thought they would be in the same data center, but that was a hypothesis

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:49):

Maxime Dénès said:

Don't forget that it is a community edition instance, so you typically don't get more things than gitlab.com

all we use is for CI so I guess that'd be OK, but I'm not expert

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:49):

we also get some shared runners on gitlab.com, don't we?

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:50):

I guess I missed the starting point: what do we need that we don't have? :)

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:50):

if we need more computational power than the default gitlab.com runners, it is pretty easy to setup a docker machine runner than can auto scale

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:51):

Maxime Dénès said:

I guess I missed the starting point: what do we need that we don't have? :)

Runners on gitlab.inria are 2x faster

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:52):

Maxime Dénès said:

if we need more computational power than the default gitlab.com runners, it is pretty easy to setup a docker machine runner than can auto scale

yes, but the starting point was more "can we get CI 2x faster just by updating a variable in coqbot"?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:52):

Yes we can set a docker machine, but that's quite a bit of work I think, and has a big bus factor, doesn't it?

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:53):

a big bus factor is good :)

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:53):

but ok, I had missed the Inria ones were twice as fast

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:53):

how did you compare?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:54):

Indeed a big bus factor is good, I guess I mean a small one :D

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:54):

Maxime Dénès said:

how did you compare?

4 cores vs 2 in the gitlab offering

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:54):

see the post from Thierry we quoted

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:54):

cores? you mean vCPUs ?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:54):

large (4 CPUs, 8096 MB RAM), tags: ci.inria.fr, linux, large

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:54):

yes that's the specs

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:54):

gitlab's ones are 1 or 2 cores I think

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:54):

well, I guess CPU is not the real thing :)

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:55):

yeah that's what Thierry wrote, he means vcpu of course

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:55):

comparing numbers of vCPUs has to be taken with a grain of salt

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:55):

so maybe the first would be to run a small experiment on a workload, I guess

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:55):

so what the potential gain is

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:56):

(also the frequency isn't made explicit in the post)

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:56):

indeed, but most gilab seem to be one core these days

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:56):

yeah but at least we should compare the frequencies

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:57):

Maxime Dénès said:

so maybe the first would be to run a small experiment on a workload, I guess

I agree, but given that most gitlab instances are giving only one vcpu it seems these days, we should see faster

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:57):

lol

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:57):

Similar frequencies on paper

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:57):

but you are right that's hard to know

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:57):

as these are shared machines

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:57):

ah, that's the bit of info I didn't see

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:57):

well I took a random gitlab runner, they vary quite a bit

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:58):

but indeed we don't have code to autodetect the number of cpus and use the right flags for make

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:58):

for dune yes

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:58):

I see

view this post on Zulip Maxime Dénès (Mar 10 2022 at 21:59):

maybe we could just bench one build of Coq

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:59):

but yeah given that gitlab is free tier runners, I think it is not a too crazy hypothesis that inria ones are faster

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:59):

yes we can just push

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:59):

we don't even need the bot to do it, just us doing the push manually to gitlab.inria.fr should work

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 21:59):

I didn't realize until now

view this post on Zulip Maxime Dénès (Mar 10 2022 at 22:00):

yes, that's what I meant, a manual try

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 22:00):

I don't have access to https://gitlab.inria.fr/coq

view this post on Zulip Maxime Dénès (Mar 10 2022 at 22:00):

oh, I forgot it existed

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 22:00):

Maxime Dénès said:

yes, that's what I meant, a manual try

yes that's a good idea, and very easy to do

view this post on Zulip Maxime Dénès (Mar 10 2022 at 22:00):

let's see

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 22:00):

and we can compare CI times

view this post on Zulip Maxime Dénès (Mar 10 2022 at 22:01):

Hugo, Matej, Pierre L and Matthieu have rights

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 22:01):

Ok let's write to Hugo and Matthieu in private to get rights , and then let's fix the permissions too

view this post on Zulip Maxime Dénès (Mar 10 2022 at 22:01):

but you can push in any temporary fork, in fact

view this post on Zulip Maxime Dénès (Mar 10 2022 at 22:02):

for the test

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 22:02):

Indeed I forgot I can now

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 22:02):

I'm still busy with grant writing so feel free to try

view this post on Zulip Maxime Dénès (Mar 10 2022 at 22:03):

I have a significant backlog too, unfortunately, but at least that's the clear next step

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 22:04):

I wonder if the docker registry there could also solve the problems docker people were having

view this post on Zulip Maxime Dénès (Mar 10 2022 at 22:06):

you mean the dockerhub rate limiting?

view this post on Zulip Maxime Dénès (Mar 10 2022 at 22:07):

the standard gitlab solution is to use the dependency proxy, on the consumer side (the project that needs to fetch the image)

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 22:07):

https://gitlab.inria.fr/egallego/coq/-/pipelines/449421

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 22:07):

that was very easy so I had to do it

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 22:08):

Maxime Dénès said:

you mean the dockerhub rate limiting?

I don't know, I just know they had some problems with the limits indeed

view this post on Zulip Maxime Dénès (Mar 10 2022 at 22:08):

There's no runners with the right tags, it seems

view this post on Zulip Maxime Dénès (Mar 10 2022 at 22:08):

Either they are not enabled, or they don't have the "docker" tag

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 22:10):

New try, let's see if docker works tho

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 22:11):

https://gitlab.inria.fr/egallego/coq/-/jobs/1837812

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 22:11):

https://gitlab.inria.fr/egallego/coq/-/pipelines/449422

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 22:11):

Umm, now I'm at a loss, I disabled the tags for the docker job

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 22:11):

maybe that's implicit?

view this post on Zulip Emilio Jesús Gallego Arias (Mar 10 2022 at 22:12):

Well, this can be fixed, but it won't be me today :/

view this post on Zulip Emilio Jesús Gallego Arias (Mar 11 2022 at 00:06):

https://gitlab.inria.fr/egallego/coq/-/pipelines/449487

view this post on Zulip Emilio Jesús Gallego Arias (Mar 11 2022 at 00:06):

running now was a silly config error

view this post on Zulip Emilio Jesús Gallego Arias (Mar 11 2022 at 00:20):

After understanding a bit more the architecture, it seems to me that it would possible indeed to share a build cache here

view this post on Zulip Emilio Jesús Gallego Arias (Mar 11 2022 at 00:20):

with fast network access

view this post on Zulip Emilio Jesús Gallego Arias (Mar 11 2022 at 00:53):

https://gitlab.inria.fr/egallego/coq/-/pipelines/449502

view this post on Zulip Enrico Tassi (Mar 11 2022 at 07:38):

I really don't see why the cache would not be almost always invalid. eg all .vo would depend on the coq binary which is different in every PR.

view this post on Zulip Emilio Jesús Gallego Arias (Mar 11 2022 at 09:48):

Enrico Tassi said:

I really don't see why the cache would not be almost always invalid. eg all .vo would depend on the coq binary which is different in every PR.

Many PRs don't touch coqc

view this post on Zulip Emilio Jesús Gallego Arias (Mar 11 2022 at 09:52):

Emilio Jesús Gallego Arias said:

https://gitlab.inria.fr/egallego/coq/-/pipelines/449502

Results seems much faster, but a few of the CI don't pass the right make -j flags, also we got a too large artifact error


Last updated: Feb 02 2023 at 13:03 UTC