Use run-lcg-view github action and switch to more recent LCG releases in CI #182

Merged
merged 6 commits into from
Mar 30, 2021

tmadlener
Collaborator

@tmadlener tmadlener commented Feb 25, 2021

BEGINRELEASENOTES

  • Use run-lcg-view github action and switch to more recent LCG releases to run CI.
  • Update README to include status of CI

ENDRELEASENOTES

After this the builds are:

| platform | LCG releases (+compilers) | comments |
|----------|---------------------------|----------|
| centos7 | 99 (gcc8, clang10), dev3 (clang10), dev4 (gcc10, clang10) | ENABLE_SIO=ON |
| centos8 | 99 (gcc10) | uses ENABLE_SIO=ON |
| key4hep | N/A | builds on top of the latest key4hep release |
| ubuntu | 99, dev3, dev4 (gcc9) | Ubuntu 20.04 and ENABLE_SIO=ON |
| mac | 99, dev4 (clang120), 98python3 (clang110) | keeping 98 for different clang version (no sio there) |
| python | 99 (gcc10) | only runs python linter |

The most important change is that I have removed all builds that use python2, so that all of our CI is now using python3 exclusively. I have also removed the dedicated workflow file for the sio tests and instead integrated that into the other workflows.
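
As a rough sketch of what such a workflow can look like, the fragment below runs one build per LCG release inside the corresponding view and enables SIO in the same job instead of a dedicated one. The workflow name, matrix entries, action version tags, and build commands are illustrative assumptions, not the files changed in this PR.

```yaml
# Hypothetical workflow sketch; names, versions and build steps are assumptions.
name: linux
on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false   # let the other matrix entries finish if one fails
      matrix:
        # Release/platform strings follow the LCG view naming; adjust as needed.
        LCG: ["LCG_99/x86_64-centos7-gcc8-opt",
              "LCG_99/x86_64-centos7-clang10-opt"]
    steps:
      - uses: actions/checkout@v2
      # Mounts CVMFS on the runner so the LCG views are reachable.
      - uses: cvmfs-contrib/github-action-cvmfs@v2
      # Runs the given commands inside the selected LCG view.
      - uses: aidasoft/run-lcg-view@v1
        with:
          release-platform: ${{ matrix.LCG }}
          run: |
            mkdir build
            cd build
            cmake -DENABLE_SIO=ON ..
            make -j2
            ctest --output-on-failure
```

With a matrix like this, the generated job names typically contain the LCG release string, so any required status checks that refer to those names have to be updated whenever the releases change.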

Now that we have coverity working again, maybe we can also advertise our CI a bit more prominently ;)
Should we add all badges, i.e. the sio build and the python checks as well?

@petricm

petricm commented Feb 25, 2021

If you are contemplating the CI, it would also be good to redefine the CI jobs with the run-lcg-view Action.

@tmadlener tmadlener changed the title Add CI badges to read me Use run-lcg-view github action and switch to more recent LCG releases in CI Feb 25, 2021
@petricm

petricm commented Feb 25, 2021

Please be patient with the CI failures. We have a ticket open with the IIS Web Hosting Service: there are some issues with the EOS mount on the machines that host the webpage from which the cvmfs binaries are fetched.

Edit: s/month/mount/

.github/workflows/mac.yml (review thread: outdated, resolved)
@tmadlener
Collaborator Author

Please be patient with the CI failures. We have a ticket open with the IIS Web Hosting Service: there are some issues with the EOS mount on the machines that host the webpage from which the cvmfs binaries are fetched.

Alright. Thanks for the heads up.
A general question: would it help if we reduce the number of "simultaneous" fetches by reducing the number of jobs that run? Currently we have quite a lot of jobs, and I think we could reduce them quite a bit without losing much test coverage.

@petricm

petricm commented Feb 25, 2021

would it help if we reduce the number of "simultaneous" fetches by reducing the number of jobs that run?

No, we are very far from saturating the server; it's not that kind of problem.

.github/workflows/mac.yml (review thread: outdated, resolved)
@tmadlener
Collaborator Author

I just realized that the required checks have to be adapted after merging this, since they have the LCG release in their name. Currently this PR is waiting for jobs to finish that no longer exist once it is applied.

@petricm

petricm commented Feb 26, 2021

I think you can still merge test.yml and ubuntu.yml

@andresailer
Member

but if you keep them separate you can rerun them separately

@petricm

petricm commented Feb 26, 2021

but if you keep them separate you can rerun them separately

It is really annoying that they still have not implemented single job restarts.

@tmadlener
Collaborator Author

So, I think it is best to keep it like this until single job restarts are a possibility, to at least have some sort of granularity.

@andresailer
Member

From the meeting discussion: Use only LCG_99, maybe also dev3 (root master) and dev4 (usually latest release patch branch)

@petricm

petricm commented Mar 3, 2021

The CVMFS issues should be fixed, the cause was:

The problem has been understood (OOM memory leak in the eosxd), and solution provided by my colleague Alex, https://gitlab.cern.ch/cloud/eosd/-/merge_requests/6, that should help on the self healing when this happens.

And as far as I understand the EOS head nodes should have received the patch today at 10.

@petricm

petricm commented Mar 3, 2021

I was too quick: they fixed another problem with EOS that also causes 403 errors, but not this one.

I confirmed that the fix is still not applied on the web infrastructure; the merge request needs to be accepted by our colleagues. We are pinging them to have this accepted as soon as possible.

So far we have better documented the troubleshooting and detection of 403 errors on webeos, to at least be able to react more quickly.

But the goal is to get this self-healing patch accepted and applied as soon as possible (we are also studying the option to install a local patched version of eosxd in case it takes much longer to get it accepted centrally).

@tmadlener
Collaborator Author

@petricm is there any news on the cvmfs issue? If it is fixed, I think this could be merged and the settings adjusted accordingly, so that the other PRs can also run against an updated CI again.

@petricm

petricm commented Mar 15, 2021

It is merged, but they did not say that it was deployed. You can merge this regardless: podio already used the cvmfs action anyway, so with or without this PR the CI is equally affected.

@vvolkl
Contributor

vvolkl commented Mar 18, 2021

I see the cvmfs errors in the CI of the key4hep packages as well, with about a 1/50 chance. So it is good to have it fixed, but this is fine to merge I think, as a failed job is easily restarted.

Thanks to Marko, this action can now also use the /cvmfs/sw.hsf.org installation. I'll set this up in a future PR.

@petricm

petricm commented Mar 23, 2021

The fix for EOS was deployed.

@tmadlener
Collaborator Author

Thanks to Marko, this action can now also use the /cvmfs/sw.hsf.org installation. I'll set this up in a future PR.

After our discussion during yesterday's meeting I have now also added an additional workflow that builds on top of the latest key4hep release. I think with this we really have everything that we would like to have in this PR.

@gaede gaede merged commit 7440c9e into AIDASoft:master Mar 30, 2021
@tmadlener tmadlener deleted the ci-badges branch August 13, 2021 15:00
5 participants