This could be of interest to us; please feel free to open an issue to discuss more. In particular, the large runners could speed up our CI by quite a bit:
We are pleased to announce that shared docker runners are now available for gitlab.inria.fr!
Thanks to everyone who helped test the service during the (very long!) beta period. You can enable the shared runners right now in the CI/CD settings of your Gitlab projects.
See the shared runners documentation for more details: https://ci.inria.fr/doc/page/gitlab/
The three following docker runners are available:
- small (1 CPU, 2048 MB RAM), tags: ci.inria.fr, linux, small
- medium (2 CPUs, 4096 MB RAM), tags: ci.inria.fr, linux, medium
- large (4 CPUs, 8096 MB RAM), tags: ci.inria.fr, linux, large
The runners will handle jobs declared in the .gitlab-ci.yml file of the project repository, on the condition that the jobs have at least one tag among ci.inria.fr, linux, and either small, medium, or large (these last three tags are exclusive: they cannot be used together for the same job, since no runner will match more than one of them).
We would like to send a big thank you to Anthony Baire for all his efforts in developing this service, and we would also like to thank the CI infrastructure team of the Inria DSI who helped in deploying the platform.
Thank you very much for using ci.inria.fr, and happy hacking!
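The tag rule described above can be sketched as a minimal .gitlab-ci.yml; the job name and script are made up for illustration:

```yaml
# Hypothetical job targeting the Inria "large" shared runner.
build:
  image: debian:stable
  tags:
    - ci.inria.fr   # marks the job for the gitlab.inria.fr shared runners
    - linux
    - large         # exclusive with "small" and "medium"
  script:
    - make -j4
```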
we're not on inria's gitlab though
@Gaëtan Gilbert I was thinking we could also mirror there and try?
There's already a mirror actually
seems like it would be easier to add some VMs in https://ci.inria.fr/project/coq/slaves/new
xxLarge says 4x2 GHz + 12 GB RAM
That's almost the same as the runners, right?
gitlab stuff is free of maintenance I understand
but we are mainly cpu bound, so if we got 4 cpu in both
parallel jobs can eat a lot of memory too
The problem with the ones you point out is that they need maintenance, right?
nothing is free of maintenance
the gitlab runners are, we just need coqbot to push
what more maintenance do they need?
the instances you point out need to be spawned manually
we are using 2 x 4 GB
so 4 x 8 GB should work
Gaëtan Gilbert said:
nothing is free of maintenance
yes, but some things are harder to maintain than others
the mirroring system needs maintenance
I don't want to have CI in more places than needed
yes it does, but we already have to maintain it
don't we?
can coqbot push to 2 gitlabs?
what's the difference between pushing to gitlab.com vs inria.gitlab.com
sorry gitlab.inria.fr
or do you want to move completely to inria gitlab?
you are right, if gitlab.inria.fr does work faster
we should not have 2
I don't see a point having 2
@Théo Zimmermann is the one that knows if pushing to 2 places is easy or hard
but a possible 2x speedup could be nice
but maybe not worth it
personally I'm not sure, just copying the mail
to see what you folks think
actually I was thinking that we could indeed then disable rocqueabalu
so gitlab.inria.fr would be _less_ maintenance
because our runners have certainly given us trouble
and for sure they need us keeping them updated
isn't it already disabled?
we still use the other
don't we
I have no idea, so that's why I'm interested in the Inria offering, so I don't have to worry about the custom runners anymore
https://gitlab.com/coq/coq/-/settings/ci_cd#js-runners-settings only shows the bench runners and the windows runners (and we don't have windows jobs on gitlab anymore)
we have these ones ci-coq-04.sophia.inria.fr
I don't know what these are
yes we should cleanup
that's the bench runners?
oh so we are fully running on gitlab's runners, that's a lot of cpu we are taking XD
yup
If we need more runners, we can also consider spawning VMs on demand with a cloud provider
it has become easier for public institutions like Inria to order from cloud providers
I think our best case would be for that to happen transparently
@Maxime Dénès
so we can just push to gitlab.inria.fr
and they have enough runners for us, so we don't care
and if they need more, it is handled centrally at the gitlab.inria.fr
level
Something that gitlab.inria.fr could do very well for us is to have a good, fast, and large cache
so our CI jobs can access it
once we can move to a build system that implements caching
cache what?
build results
dune cache for example
that could be a way to help with PRs that for example only change the stdlib
or the documentation
the cache hit/miss ratio would be a cool thing to check for a paper
imagine we put the whole CI under dune [including .vo files]
I'm a bit lost TBH
@Maxime Dénès this thread is about moving from gitlab.com
runners to gitlab.inria.fr
runners
it could make sense for:
also, the last point is "does gitlab.inria.fr offer more advantages"
and I was thinking that maybe Inria runners could indeed have much faster artifact access
which could be useful for example to share a build cache
if we use a system like dune for all CI
we no longer need to pass the build artifacts from one job to another
but we can just mount dune's cache
generically, but that's a quite speculative thing
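A sketch of what mounting a shared dune cache could look like in a CI job; DUNE_CACHE and DUNE_CACHE_ROOT are dune's own cache environment variables, but the shared /cache mount path is an assumption about the runner setup:

```yaml
# Hypothetical job reusing a dune cache shared across jobs.
variables:
  DUNE_CACHE: enabled
  DUNE_CACHE_ROOT: /cache/dune   # assumes the runner mounts a shared /cache volume
build:
  script:
    - dune build @default
```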
@Maxime Dénès does it make sense?
I don't see why these runners would have faster access to artifacts
Don't get me wrong, I use them for projects hosted on the gitlab Inria, and they are very useful
But for Coq, I would have thought if we need more computing power, it is easier to add runners to our current gitlab instance rather than switching
Don't forget that it is a community edition instance, so you typically don't get more things than gitlab.com
@Maxime Dénès how is it easier to add runners to the current gitlab instance
than to use the default ones at gitlab.inria?
aren't the runners at gitlab.inria already there?
vs. the first case, where we need to configure and take care of them?
Maxime Dénès said:
I don't see why these runners would have faster access to artifacts
I thought they would be in the same data center, but that was a hypothesis
Maxime Dénès said:
Don't forget that it is a community edition instance, so you typically don't get more things than gitlab.com
all we use is for CI so I guess that'd be OK, but I'm not expert
we also get some shared runners on gitlab.com, don't we?
I guess I missed the starting point: what do we need that we don't have? :)
if we need more computational power than the default gitlab.com runners, it is pretty easy to set up a docker machine runner that can auto-scale
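The auto-scaling setup mentioned here uses gitlab-runner's docker+machine executor; a rough config.toml sketch, with placeholder values and an assumed OpenStack driver rather than a working deployment:

```toml
# Sketch of a gitlab-runner config using the docker+machine executor.
[[runners]]
  name = "autoscale-runner"
  url = "https://gitlab.com/"
  token = "REDACTED"
  executor = "docker+machine"
  limit = 4                       # at most 4 machines at a time
  [runners.docker]
    image = "debian:stable"
  [runners.machine]
    IdleCount = 0                 # spawn machines only on demand
    MachineDriver = "openstack"   # assumption: an OpenStack cloud backend
    MachineName = "ci-%s"         # %s is required and replaced by a unique id
```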
Maxime Dénès said:
I guess I missed the starting point: what do we need that we don't have? :)
Runners on gitlab.inria are 2x faster
Maxime Dénès said:
if we need more computational power than the default gitlab.com runners, it is pretty easy to set up a docker machine runner that can auto-scale
yes, but the starting point was more "can we get CI 2x faster just by updating a variable in coqbot"?
Yes we can set up a docker machine, but that's quite a bit of work I think, and has a big bus factor, doesn't it?
a big bus factor is good :)
but ok, I had missed the Inria ones were twice as fast
how did you compare?
Indeed a big bus factor is good, I guess I mean a small one :D
Maxime Dénès said:
how did you compare?
4 cores vs 2 in the gitlab offering
see the post from Thierry we quoted
cores? you mean vCPUs ?
large (4 CPUs, 8096 MB RAM), tags: ci.inria.fr, linux, large
yes that's the specs
gitlab's ones are 1 or 2 cores I think
well, I guess CPU is not the real thing :)
yeah that's what Thierry wrote, he means vcpu of course
comparing numbers of vCPUs has to be taken with a grain of salt
so maybe the first would be to run a small experiment on a workload, I guess
so what the potential gain is
(also the frequency isn't made explicit in the post)
indeed, but most gitlab runners seem to be one core these days
yeah but at least we should compare the frequencies
Maxime Dénès said:
so maybe the first would be to run a small experiment on a workload, I guess
I agree, but given that most gitlab instances seem to give only one vCPU these days, we should see faster builds
lol
Similar frequencies on paper
but you are right that's hard to know
as these are shared machines
ah, that's the bit of info I didn't see
well I took a random gitlab runner, they vary quite a bit
but indeed we don't have code to autodetect the number of cpus and use the right flags for make
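The missing autodetection is essentially a one-liner in the job script, assuming GNU coreutils' nproc is available in the image:

```shell
# Detect the runner's CPU count and pass it to make.
NJOBS=$(nproc)
echo "building with make -j${NJOBS}"
# make -j"${NJOBS}"   # the actual build step, omitted here
```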
for dune yes
I see
maybe we could just bench one build of Coq
but yeah, given that gitlab's are free-tier runners, I think it is not a too crazy hypothesis that the inria ones are faster
yes we can just push
we don't even need the bot to do it, just us doing the push manually to gitlab.inria.fr should work
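The manual try only needs a second remote. A minimal sketch of the mechanics, using local repositories as stand-ins for gitlab.com and gitlab.inria.fr (pushing to the real remotes needs credentials):

```shell
# Mirror the current branch to a second remote by hand.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/work"
git -C "$tmp/work" -c user.name=ci -c user.email=ci@example.org \
  commit -q --allow-empty -m "init"
git init -q --bare "$tmp/inria"          # stand-in for the gitlab.inria.fr mirror
git -C "$tmp/work" remote add inria "$tmp/inria"
git -C "$tmp/work" push -q inria HEAD:refs/heads/test-ci
git -C "$tmp/inria" rev-parse --verify test-ci >/dev/null && echo "pushed"
```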
I didn't realize until now
yes, that's what I meant, a manual try
I don't have access to https://gitlab.inria.fr/coq
oh, I forgot it existed
Maxime Dénès said:
yes, that's what I meant, a manual try
yes that's a good idea, and very easy to do
let's see
and we can compare CI times
Hugo, Matej, Pierre L and Matthieu have rights
Ok let's write to Hugo and Matthieu in private to get rights, and then let's fix the permissions too
but you can push in any temporary fork, in fact
for the test
Indeed I forgot I can now
I'm still busy with grant writing so feel free to try
I have a significant backlog too, unfortunately, but at least that's the clear next step
I wonder if the docker registry there could also solve the problems docker people were having
you mean the dockerhub rate limiting?
the standard gitlab solution is to use the dependency proxy, on the consumer side (the project that needs to fetch the image)
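In a consumer project's .gitlab-ci.yml, the dependency-proxy approach looks roughly like this; CI_DEPENDENCY_PROXY_GROUP_IMAGE_PREFIX is a predefined GitLab CI variable, and the image name is just an example:

```yaml
# Pull the image through the group's dependency proxy instead of Docker Hub.
build:
  image: ${CI_DEPENDENCY_PROXY_GROUP_IMAGE_PREFIX}/debian:stable
  script:
    - echo "image fetched via the dependency proxy"
```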
https://gitlab.inria.fr/egallego/coq/-/pipelines/449421
that was very easy so I had to do it
Maxime Dénès said:
you mean the dockerhub rate limiting?
I don't know, I just know they had some problems with the limits indeed
There's no runners with the right tags, it seems
Either they are not enabled, or they don't have the "docker" tag
New try, let's see if docker works tho
https://gitlab.inria.fr/egallego/coq/-/jobs/1837812
https://gitlab.inria.fr/egallego/coq/-/pipelines/449422
Umm, now I'm at a loss, I disabled the tags for the docker job
maybe that's implicit?
Well, this can be fixed, but it won't be me today :/
https://gitlab.inria.fr/egallego/coq/-/pipelines/449487
running now, it was a silly config error
After understanding the architecture a bit more, it seems to me that it would indeed be possible to share a build cache here
with fast network access
https://gitlab.inria.fr/egallego/coq/-/pipelines/449502
I really don't see why the cache would not be almost always invalid. eg all .vo would depend on the coq binary which is different in every PR.
Enrico Tassi said:
I really don't see why the cache would not be almost always invalid. eg all .vo would depend on the coq binary which is different in every PR.
Many PRs don't touch coqc
Emilio Jesús Gallego Arias said:
Results seem much faster, but a few of the CI jobs don't pass the right make -j flags; also we got a too-large artifact error
Last updated: Dec 05 2023 at 05:01 UTC