@Enrico Tassi : the snap CI has been failing reproducibly for 2 days. The problem is that I don't really see an error: the platform build runs through to the end and then it says that snap failed. Can you make sense of this (see e.g. https://github.com/coq/platform/actions/runs/4011369172/jobs/6888873076)?
What about adding --debug to the Snap build command as they suggest?
That is good in interactive mode, since it opens a shell inside the container.
Locally, I never managed to set up LXD, so I use another (soon to be deprecated) container thing.
Anyway, I'll investigate it.
I can also try locally - I have the same Ubuntu version and it should also use LXD (I guess).
I am just busy this week, so I asked if you have a clue ...
I opened a PR which saves the snapcraft logs in case of failure. Let's see if they are of any use.
@Enrico Tassi : it is a bit of a mystery to me. With 4GB (default settings) I reproducibly can't compile Coq 8.16.1 in the snapcraft VM. With more memory it works. The thing I don't understand is what changed on GitHub between the working and the failing runs. I compared the opam install lists and they are identical. Coq doesn't even compile when I run in sequential mode (install each opam package with a separate opam install command). Any thoughts?
I'm sorry, but I have no idea.
@Enrico Tassi : the only difference I can see between passing and failing builds is the line "lxd (5.0/stable) 5.0.2-838e1b2 from Canonical refreshed", which appears only in the failing builds. The snap store says that LXD was last updated on January 19th, but our CI stopped working on January 24th. Can there be such a delay?
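For comparing a passing and a failing CI log like this, a minimal sketch in Python could look as follows. It assumes the logs were downloaded as plain text and that each line starts with a GitHub-style ISO timestamp; `normalize` and `log_differences` are hypothetical helper names, not part of any CI tooling.

```python
import difflib
import re

# Assumption: each log line starts with a timestamp such as
# "2023-01-25T11:00:01.1234567Z ". It is stripped so that only content differs.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z\s+")


def normalize(lines):
    """Drop the leading timestamp and the trailing newline from every line."""
    return [TIMESTAMP.sub("", line.rstrip("\n")) for line in lines]


def log_differences(passing_lines, failing_lines):
    """Return (only_in_passing, only_in_failing) after normalization."""
    a, b = normalize(passing_lines), normalize(failing_lines)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    only_passing, only_failing = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            only_passing.extend(a[i1:i2])
            only_failing.extend(b[j1:j2])
    return only_passing, only_failing
```

Feeding it the lines of the two log files (e.g. via `open(path).readlines()`) leaves only the lines that really differ, which is how a stray "refreshed" line stands out among thousands of identical ones.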
Ah, I found this in the snap store release table:
5.0/stable 5.0.2-838e1b2 25 January 2023
And the last working one was Jan 24th, the first failing one was Jan 25th.
Possibly we should try a later version (there is a 5.10 track) or the previous release of the 5.0 track.
Not sure how I would do this, though?
Unfortunately it looks hardwired:
https://github.com/snapcore/action-build/blob/3457752ec9b1c79a8290b5167fce2d14df0997c1/src/tools.ts#L75-L89
@Enrico Tassi : Actually I can change the lxd version by installing the latest version of a different channel beforehand. The refresh then says it is already up to date. I now tried 5.10 / latest instead of 5.0. The problem is that it has the same issue.
It looks pretty severe btw. It seems to forget which user it is. The GitHub runner user has root rights, so opam usually gives a "WARNING running as root is not recommended" on every command. With the broken LXD versions it gives this warning for a while, but then it stops giving it, which means the user identity magically changed in the middle of the script.
I'll see if I can get some interest from the LXD maintainers for this, but I guess it will take a while to fix.
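To pin down where in the log the root warning stops, a small Python sketch like the one below may help. It assumes the build log is available as a list of text lines; `last_warning` is a hypothetical helper, and the marker string is taken from the warning as quoted in this thread (matched case-insensitively, since the exact capitalization in opam's output is not verified here).

```python
def last_warning(lines, marker="running as root is not recommended"):
    """Return (index_of_last_hit, number_of_hits); (None, 0) if never seen."""
    needle = marker.lower()
    last, hits = None, 0
    for index, line in enumerate(lines):
        if needle in line.lower():
            last, hits = index, hits + 1
    return last, hits
```

Everything after the returned index ran without the warning, so the lines right after it are the place to look for the moment the user changed.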
Do you know if I can install an older version, which is no longer advertised on the snap store?
I don't think so. But now that you have pinpointed the problem you can report it upstream, or google for it. It is odd that we are the only ones affected.
The problem with debugging this for the LXD team will be that opam runs for a few hours before LXD falls apart. But at least it is reproducible.
What I am trying to understand before reporting to LXD is whether the error 120 comes from opam, and if so what it means. Opam seems to have custom error codes in the 12X range. I filed an issue, but there is no response yet.
Last updated: Dec 07 2023 at 09:01 UTC