ci: Test pathogen repo CI builds with the final image #148

Merged: tsibley merged 1 commit into master from trs/test-pathogen-repo-ci on May 8, 2023

@tsibley tsibley commented May 5, 2023

A useful check for if new images will break our pathogen builds.

I included all pathogen repos that already use our pathogen-repo-ci reusable workflow. It should be minimal effort to maintain this list over time—I expect it to only grow—but perhaps in the future we will want to abstract it out into a shared list of known pathogen repos.

I don't like that we have to copy the build-args for a few of the repos here since it'll be easy for this copy to diverge from the repo's authoritative build-args, but it's necessary for now. Over time as we work towards increased automation of pathogen builds, I think we can get rid of this build-args copy by further standardizing how each repo configures itself for automation. For example, instead of specifying build-args in a repo's CI workflow, the args for CI could be stored in a broader workflow metadata file (e.g. nextstrain-workflow.yaml) read by pathogen-repo-ci, or defined by some other convention.

An alternative to directly running pathogen-repo-ci against each repo here would be instead triggering the CI workflows themselves within each repo. The downside to that is it would divorce the outcomes of those workflows from this one and render them not visible from PRs in this repo. It would also require updates to each repo to support triggering and passing in of additional parameters (i.e. for the image). And finally those CI workflows sometimes run other jobs, like linting and other integration tests (e.g. with Cram), that aren't always necessary to run with a new image.

Resolves #147.

Testing

  • Checks pass


tsibley commented May 5, 2023

Ah, it's as I feared and jobs which call reusable workflows can't use the continue-on-error job key. This key was not included in the list of accepted fields, but I thought it was worth trying anyway as it might be an oversight.

I'll figure out how to rework this to have ~equivalent behaviour without that key, but first I'm going to repush without it to see if the rest of the changes work.

@tsibley tsibley force-pushed the trs/test-pathogen-repo-ci branch from 0122876 to 54fde05 Compare May 5, 2023 18:43

tsibley commented May 5, 2023

Oh, hmm. To pull from ghcr.io the job needs to be authenticated. That'll require threading some changes thru the shared pathogen-repo-ci workflow. Alternatively, we can switch to the docker.io registry, but then the new test jobs need to move after the push-{build,branch} jobs, and that's potentially annoying for workflow DAG (conditional execution) reasons. And in any case, the pulls from docker.io should ideally still use authentication so we don't hit the low rate limits.
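
A rough sketch of what threading authentication through the shared workflow could look like, based on the commit messages referenced below. The use of `docker/login-action`, the job name, and the checkout step are assumptions for illustration. Only the `permissions:` entries, the `DOCKER_TOKEN_PUBLIC_READ_ONLY` secret, and the `nextstrainbot` user come from this thread.

```yaml
# Hypothetical excerpt of a reusable workflow such as pathogen-repo-ci.yaml.
on:
  workflow_call:

# Declaring any permission sets the unlisted ones to "none",
# so every permission the jobs need has to be listed explicitly.
permissions:
  contents: read   # needed by actions/checkout
  packages: read   # read-only pulls from ghcr.io

jobs:
  build:           # job name is illustrative
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      # Authenticated pulls from docker.io avoid the low anonymous rate limits.
      - uses: docker/login-action@v2
        with:
          registry: docker.io
          username: nextstrainbot
          password: ${{ secrets.DOCKER_TOKEN_PUBLIC_READ_ONLY }}

      # The automatic GITHUB_TOKEN with "packages: read" can pull from ghcr.io.
      - uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
```

For the org-level secret to be visible inside the reusable workflow, the calling job would also need `secrets: inherit`.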

tsibley added a commit to nextstrain/.github that referenced this pull request May 5, 2023
For docker.io, this lifts low rate limits on image pulls.  For ghcr.io,
this allows the use of docker-base images we transiently stage there
before publishing to docker.io.

DOCKER_TOKEN_PUBLIC_READ_ONLY is an org-level secret available to all
our public GitHub repos.  On Docker Hub, it's granted "public read-only"
access as nextstrainbot.

Related-to: <nextstrain/docker-base#148>
tsibley added a commit to nextstrain/.github that referenced this pull request May 5, 2023
For docker.io, this lifts low rate limits on image pulls.  For ghcr.io,
this allows the use of docker-base images we transiently stage there
before publishing to docker.io.

A new "permissions:" block with "packages: read" restricts the ghcr.io
access to read-only.  This addition requires explicitly enumerating the
rest of the required permissions too, which is only "contents: read".

Related-to: <nextstrain/docker-base#148>
tsibley added a commit to nextstrain/.github that referenced this pull request May 5, 2023
This lifts low rate limits on image pulls.  However, calling workflows
must explicitly opt in with "secrets: inherit" in order for this
reusable workflow to be able to see the org-level secret containing the
token.

Related-to: <nextstrain/docker-base#148>
tsibley added a commit to nextstrain/.github that referenced this pull request May 5, 2023
This allows the use of docker-base images we transiently stage at
ghcr.io before publishing to docker.io.  A new "permissions:" block with
"packages: read" restricts the ghcr.io access to read-only.  This
addition requires explicitly enumerating the rest of the required
permissions too, which is only "contents: read" for actions/checkout.

Related-to: <nextstrain/docker-base#148>
@tsibley tsibley force-pushed the trs/test-pathogen-repo-ci branch from 54fde05 to 4471bfc Compare May 5, 2023 21:32

tsibley commented May 5, 2023

It works!

image

Before merge:

@tsibley tsibley marked this pull request as ready for review May 5, 2023 23:19
@tsibley tsibley requested a review from a team May 5, 2023 23:19
test-pathogen-repo-ci:
  needs: build
  strategy:
    # XXX TODO: Test on multiple platforms via the matrix too, as above?
Review comment from a member:

Yeah that sounds reasonable!

tsibley (author) replied:

To do this we'd first need to add a platform input to the pathogen-repo-ci.yaml shared workflow and update that workflow to do the right thing with it, namely: set DOCKER_DEFAULT_PLATFORM and run docker/setup-qemu-action@v2 if the desired platform is not the native platform.

I'm going to leave that for a separate PR, and I may not do it myself right away. Someone else should feel free to beat me to it. :-)
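
A minimal sketch of the platform handling described above, not an actual implementation. The input name `platform`, its default, the job name, and the assumption that `linux/amd64` is the runner's native platform are all hypothetical.

```yaml
# Hypothetical excerpt of the shared workflow with a platform input.
on:
  workflow_call:
    inputs:
      platform:
        description: Docker platform to run the image as (assumed input name)
        type: string
        required: false
        default: linux/amd64

jobs:
  build:                       # job name is illustrative
    runs-on: ubuntu-latest
    env:
      # Makes docker pull/run default to the requested platform.
      DOCKER_DEFAULT_PLATFORM: ${{ inputs.platform }}
    steps:
      # Emulation is only needed when the requested platform is not the
      # runner's native one (assumed here to be linux/amd64).
      - if: inputs.platform != 'linux/amd64'
        uses: docker/setup-qemu-action@v2
        with:
          platforms: ${{ inputs.platform }}

      - uses: actions/checkout@v3
      # The existing pathogen build steps would follow here unchanged.
```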

- { pathogen: lassa }
- { pathogen: monkeypox }
- { pathogen: mumps }
- { pathogen: ncov, build-args: all_regions -j 2 --profile nextstrain_profiles/nextstrain-ci }
Review comment from a member:

Really nice config with the build-args, brilliant!

@corneliusroemer (Member) left a review comment:

This is amazing, thanks so much!

@corneliusroemer (Member) commented:

Would it make sense to also validate new conda-base environments like this? This could potentially be simple as all we need to do is tell the cli to install and use --conda instead of --docker.


tsibley commented May 8, 2023

> Would it make sense to also validate new conda-base environments like this? This could potentially be simple as all we need to do is tell the cli to install and use --conda instead of --docker.

Yep. I'd probably do this with two additional supporting changes in the pathogen-repo-ci.yaml shared workflow:

  1. Set up the Conda runtime and run the pathogen build in it in addition to the current use of the Docker runtime. This enables testing in both runtimes for normal pathogen CI (i.e. triggered by changes to the pathogen repos) and also allows the conda-base CI to have a very similar setup as docker-base does with this PR.

  2. Add workflow inputs to explicitly disable/enable the runtimes tested. The idea is to default both Docker and Conda runtimes to enabled so normal pathogen CI tests in both, while the uses in docker-base CI and conda-base CI would disable every runtime but themselves so as not to waste cycles / potentially raise false alarms.
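
One way the two changes above could be sketched, purely as an illustration. The boolean input names `docker` and `conda` and the job names are hypothetical, and the setup of Nextstrain CLI and each runtime is omitted.

```yaml
# Hypothetical excerpt of the shared workflow with per-runtime toggles.
on:
  workflow_call:
    inputs:
      docker:                  # assumed input name
        type: boolean
        default: true          # both runtimes enabled by default
      conda:                   # assumed input name
        type: boolean
        default: true

jobs:
  build-docker:
    if: inputs.docker
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # Nextstrain CLI and Docker runtime setup steps are omitted in this sketch.
      - run: nextstrain build --docker .

  build-conda:
    if: inputs.conda
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # Nextstrain CLI and Conda runtime setup steps are omitted in this sketch.
      - run: nextstrain build --conda .
```

Under these assumptions, docker-base CI would call the workflow with `conda: false` and conda-base CI with `docker: false`, while pathogen repos would keep the defaults and test both runtimes.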

@tsibley tsibley force-pushed the trs/test-pathogen-repo-ci branch from f351e4c to 12000a2 Compare May 8, 2023 21:17
@tsibley tsibley merged commit 0cca5cd into master May 8, 2023
@tsibley tsibley deleted the trs/test-pathogen-repo-ci branch May 8, 2023 21:18

tsibley commented May 8, 2023

This passed CI for the PR but then failed CI after the merge to master. The reason why comes down to the difference in NEXTSTRAIN_DOCKER_IMAGE values between the two runs:

NEXTSTRAIN_DOCKER_IMAGE=ghcr.io/nextstrain/base:branch-trs-test-pathogen-repo-ci
NEXTSTRAIN_DOCKER_IMAGE=ghcr.io/nextstrain/base:build-20230508T211841Z

The build-* tag triggers a different code path within Nextstrain CLI's Docker runner for nextstrain update:

https://github.com/nextstrain/cli/blob/babc215092465758bf3df88ccf3eaae54add47c3/nextstrain/cli/runner/docker.py#L434-L440

and it ends up trying to enumerate the build-* tags of the ghcr.io image by making requests to the docker.io registry. This results in a 400 Bad Request from the latter. Oops.

tsibley added a commit to nextstrain/.github that referenced this pull request May 9, 2023
Allows calling workflows to succeed even if the job in this called
workflow fails.  This is a workaround for the calling workflow being
unable to specify continue-on-error itself.¹

Related-to: <nextstrain/docker-base#148>

¹ <https://docs.github.com/en/actions/using-workflows/reusing-workflows#supported-keywords-for-jobs-that-call-a-reusable-workflow>
tsibley added a commit that referenced this pull request May 9, 2023
This makes them advisory-only, as they're currently broken.¹

We may also opt to leave them advisory-only longer term, as they might
be noisy and not fit for stopping an image release.

Idea for the pass thru this uses from @joverlee521.²

Related-to: <#148>

¹ <#148 (comment)>
² <#148 (comment)>
@tsibley tsibley mentioned this pull request May 9, 2023
1 task

tsibley commented May 9, 2023

Nextstrain CLI's handling here should be improved, but I'll also address the issue separately in this repo.


tsibley commented May 9, 2023

#928 should fix the broken CI by ignoring errors in the test-pathogen-repo-ci jobs. I can then address the failing jobs separately.
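
A sketch of the pass-through workaround described in the commit messages above: because a job that calls a reusable workflow cannot set `continue-on-error` itself, the called workflow accepts it as an input and applies it to its own job. The input name, job names, and the workflow path and ref in the caller are assumptions for illustration.

```yaml
# Called (reusable) workflow, hypothetical excerpt.
on:
  workflow_call:
    inputs:
      continue-on-error:       # assumed input name
        type: boolean
        default: false

jobs:
  build:                       # job name is illustrative
    runs-on: ubuntu-latest
    # A regular job may use continue-on-error, unlike the calling job.
    continue-on-error: ${{ inputs.continue-on-error }}
    steps:
      - uses: actions/checkout@v3
```

The caller would then opt in to advisory-only behaviour through the input:

```yaml
# Calling workflow, hypothetical excerpt.
jobs:
  test-pathogen-repo-ci:
    needs: build
    # Path and ref below are assumed, not taken from this thread.
    uses: nextstrain/.github/.github/workflows/pathogen-repo-ci.yaml@master
    with:
      continue-on-error: true
```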

tsibley added a commit that referenced this pull request May 9, 2023
… ghcr.io

This should fix the currently-broken test-pathogen-repo-ci jobs when our
CI runs on master² (and they should continue to work on branches too).
It requires a little rearranging of jobs in the workflow, with a little
additional and unfortunate complexity due to conditionals.

This works around a Nextstrain CLI bug with registries other than
docker.io¹ during `nextstrain update docker` (and `nextstrain setup
docker`), which is run as part of our setup-nextstrain-cli action used
by this workflow.  Ideally we'll fix that bug and then be able to revert
this, especially since we might actually want to condition the pushing
to docker.io on the outcome of these test jobs in the future.

¹ <nextstrain/cli#279>
² <#148 (comment)>
@victorlin (Member) commented:

> #928 should fix the broken CI by ignoring errors in the test-pathogen-repo-ci jobs

I assume you meant nextstrain/.github#40 + #150 ?


tsibley commented May 10, 2023

Indeed. … I don't know where #928 came from!


tsibley commented May 11, 2023

I wrote:

> Yep. I'd probably [implement the same testing for the Conda runtime] with two additional supporting changes…

Both supporting changes done in nextstrain/.github#42.

Corresponding updates for docker-base in #154.

Testing for conda-base in nextstrain/conda-base#27.

tsibley added a commit to nextstrain/conda-base that referenced this pull request May 11, 2023
[ Commit message based on that of 12000a20 in nextstrain/docker-base.¹
  Code changes also based on that commit, plus subsequent commits.² ]

A useful check for if new packages will break our pathogen builds.

I included all pathogen repos that already use our pathogen-repo-ci
reusable workflow.  It should be minimal effort to maintain this list
over time—I expect it to only grow—but perhaps in the future we will
want to abstract it out into a shared list of known pathogen repos.

I don't like that we have to copy the build-args for a few of the repos
here since it'll be easy for this copy to diverge from the repo's
authoritative build-args, but it's necessary for now.  Over time as we
work towards increased automation of pathogen builds, I think we can get
rid of this build-args copy by further standardizing how each repo
configures itself for automation.  For example, instead of specifying
build-args in a repo's CI workflow, the args for CI could be stored in a
broader workflow metadata file (e.g. nextstrain-workflow.yaml) read by
pathogen-repo-ci, or defined by some other convention.

An alternative to directly running pathogen-repo-ci against each repo
here would be instead triggering the CI workflows themselves within each
repo.  The downside to that is it would divorce the outcomes of those
workflows from this one and render them not visible from PRs in this
repo.  It would also require updates to each repo to support triggering
and passing in of additional parameters (i.e. for the package).  And
finally those CI workflows sometimes run other jobs, like linting and
other integration tests (e.g. with Cram), that aren't always necessary
to run with a new package.

Related-to: <nextstrain/docker-base#148>
Related-to: <nextstrain/docker-base#150>
Related-to: <nextstrain/docker-base#151>
Related-to: <nextstrain/docker-base#154>

¹ <nextstrain/docker-base@12000a20>
² <nextstrain/docker-base@bc22a0bc>
  <nextstrain/docker-base@0a20a474>
  <nextstrain/docker-base@75254e92>
Successfully merging this pull request may close these issues.

Run pathogen-repo-ci in CI with new docker-base images produced in CI
4 participants