Build image based on conda-base? #222

Open
2 tasks
victorlin opened this issue Jul 16, 2024 · 7 comments
Labels
proposal: Proposals that warrant further discussion

Comments

@victorlin
Member

victorlin commented Jul 16, 2024

Initially proposed by @corneliusroemer on Slack.

Now that both Bioconda and conda-forge support not just osx-arm64 but also linux-aarch64, we could stop maintaining docker-base and simply build Docker images based on the micromamba Docker base image and install the conda-base environment into it.
One source of truth, less maintenance!
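As a rough illustration of the proposal above, here is a minimal Python sketch that renders the kind of Dockerfile being described. Everything specific in it is an assumption rather than something stated in this thread: the `mambaorg/micromamba` image tag, the channel list and order, and `nextstrain-base` as the name of the package that conda-base publishes.

```python
def render_dockerfile(
    base_image: str = "mambaorg/micromamba:latest",  # assumed image and tag
    channels: tuple[str, ...] = ("nextstrain", "conda-forge", "bioconda"),  # assumed order
    package: str = "nextstrain-base",  # assumed name of the conda-base package
) -> str:
    """Return Dockerfile text that installs one Conda package into micromamba's base env."""
    channel_flags = " ".join(f"-c {channel}" for channel in channels)
    lines = [
        f"FROM {base_image}",
        # Install the single locked package, then drop caches to keep the layer small.
        f"RUN micromamba install --yes --name base {channel_flags} {package} \\",
        "    && micromamba clean --all --yes",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dockerfile(), end="")
```

A real image would likely also need the tools that are not on Conda, which is the concern raised later in this thread.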

Tasks

@victorlin added the proposal label Jul 16, 2024
@victorlin
Member Author

Copying over my response from Slack:

I would consider this. The docker-base image still requires emulation for many programs, mostly because the process to cross-compile successfully is kinda painful to figure out and varies for each program. If these programs are already precompiled for multiple platforms in Conda, it would be nice to leverage that work.

@joverlee521
Contributor

Seems reasonable as long as we continue to support tools that are not readily available via conda (mainly thinking of fauna, additional context in conda-base)

@corneliusroemer
Member

Seems reasonable as long as we continue to support tools that are not readily available via conda (mainly thinking of fauna, nextstrain/conda-base#3)

@joverlee521 the simple solution is to just add everything to conda - like fauna. Any reason this is not possible?

Packaging things into bioconda/conda-forge has a clear advantage of making things also more easily available to the whole community.

@huddlej
Contributor

huddlej commented Jul 18, 2024

+1 for one source of truth, but after trying unsuccessfully to get a TreeKnit Bioconda package built, I'm skeptical that everything we need in the future will be Conda-able.

If we do decide to prioritize a single source of truth and the ability to always have a Conda version of every package we need, then we need to enforce stricter guidelines about the tools we can support.

The Julia/TreeKnit issue is an obvious one. Another issue would be how Bioconda didn't support ARM64 for several years while we were able to create Docker images with ARM64 support through custom builds quite quickly.

@victorlin
Member Author

Actually, we had considered this a bit in #127.

@huddlej said:

If we are considering installation from prebuilt binaries, we might also consider installing these tools with Conda. We already rely on Conda binaries in our workflow-specific environment files and our nextstrain-base environment. We could have micromamba installed in our first pass of the Docker build and use that to install the third-party binaries we want.

and @tsibley said:

Conda packages bring along other issues. For example, they expect to bring along everything but libc, so things like openssl and other common shared libs will get duplicated (increasing image size, increasing complexity of library interactions at runtime, and more). I'm reluctant to mix Conda packages with non-Conda packages for these reasons.

That said, we might take a step back and consider building the container image entirely from a static Conda environment. We've (or at least I've) considered this before, but decided it wasn't worth it then. Maybe that's changed, particularly in light of our new Conda runtime defined by a locked package? There are downsides though, like a tighter coupling between runtimes and what they can support (e.g. architectures). Tighter is good in some ways but worse in others. Also, other considerations aside, we may not want to put all our eggs in Conda's basket.
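The shared-library duplication concern quoted above can be checked empirically. The following is a minimal sketch, not an established tool: it lists shared libraries that appear in more than one directory, comparing by name only. The directory paths in the usage note are assumptions about a typical image layout.

```python
from collections import defaultdict
from pathlib import Path


def library_stem(filename: str) -> str:
    """Reduce 'libssl.so.3' or 'libssl.so' to 'libssl' so versions compare equal."""
    return filename.split(".so")[0]


def duplicated_libraries(*lib_dirs: Path) -> dict[str, list[Path]]:
    """Map each library stem found in more than one directory to its files."""
    found: dict[str, list[Path]] = defaultdict(list)
    for lib_dir in lib_dirs:
        for path in sorted(lib_dir.glob("*.so*")):
            found[library_stem(path.name)].append(path)
    # Keep only stems whose files live in at least two different directories.
    return {
        stem: paths
        for stem, paths in found.items()
        if len({path.parent for path in paths}) > 1
    }
```

Inside a container one might call `duplicated_libraries(Path("/usr/lib/x86_64-linux-gnu"), Path("/opt/conda/lib"))`; both paths are hypothetical and depend on the base image and Conda prefix.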

The new development is that most(?) tools we provide in the runtimes are now available as linux-aarch64 on Bioconda.
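One way to turn the "most(?)" above into a concrete list is to compare the tools we need against a channel's `repodata.json` for the `linux-aarch64` subdirectory. The sketch below assumes the repodata has already been downloaded and parsed into a dict; the function names are hypothetical. Because architecture-independent packages live in the `noarch` subdirectory, that repodata should be passed in as well.

```python
def packages_in_repodata(repodata: dict) -> set[str]:
    """Collect package names from both the .tar.bz2 and .conda sections."""
    names: set[str] = set()
    for section in ("packages", "packages.conda"):
        for record in repodata.get(section, {}).values():
            names.add(record["name"])
    return names


def missing_for_platform(wanted: list[str], repodata_list: list[dict]) -> list[str]:
    """Return the wanted package names that appear in none of the given repodata dicts."""
    available: set[str] = set()
    for repodata in repodata_list:
        available |= packages_in_repodata(repodata)
    return sorted(set(wanted) - available)
```

Anything this returns would still need a custom build or a new Conda recipe.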

@corneliusroemer
Member

corneliusroemer commented Jul 26, 2024

I didn't realize how out of date our pins are compared to conda-base (which is usually using latest versions).

There's still this open PR from 15 months ago: #145

The main argument I see for not updating is that there's no need to, and that there's a risk associated with it.

The downside is that one can't just use the latest features of the tools we package; one needs to look at old versions of their docs. And if one wants to use newer features, e.g. cmaple iqtree, one needs to make explicit PRs for it, as in #226.
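To see how far the pins have drifted, the two version sets could be compared directly. This is a minimal sketch that assumes both sets have already been extracted into plain name-to-version dicts; how to extract them from the Dockerfile and from conda-base is not covered here, and the versions in the example are placeholders, not real data.

```python
def pin_drift(pinned: dict[str, str], latest: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return {tool: (pinned_version, other_version)} for tools whose versions differ."""
    return {
        tool: (version, latest[tool])
        for tool, version in sorted(pinned.items())
        if tool in latest and latest[tool] != version
    }


if __name__ == "__main__":
    # Placeholder tool names and versions, for illustration only.
    docker_pins = {"tool-a": "1.0", "tool-b": "2.0"}
    conda_versions = {"tool-a": "1.0", "tool-b": "2.5"}
    print(pin_drift(docker_pins, conda_versions))
```

Tools present in only one of the two sets are ignored here; listing them separately would be a natural extension.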

Maybe we could make a conda-base based docker image to allow test-driving in a few workflows to see what our experience is.

@victorlin
Member Author

victorlin commented Jul 26, 2024

@corneliusroemer re: pins, this is a good point for discussion which I've started a separate issue for: #227

Maybe we could make a conda-base based docker image to allow test-driving in a few workflows to see what our experience is.

For test-driving latest versions of tools, it might be easier to remove the pins in the existing Dockerfile rather than rewriting it to use conda-base.


4 participants