OpenMPI dependency resolving to external build #153
Comments
I'm wondering whether these packages are somehow broken. I would recommend contacting the conda-forge team via chat; I don't know how to help you further. Perhaps making a new build of that version would help, but I'm not sure. |
I am clueless too, as we did ensure the non-external builds have a higher priority (openmpi-feedstock/recipe/meta.yaml, lines 5 to 8 in 16df999).
Maybe you can share the output of conda list and check whether there are potential conflicts?
|
I really don't understand what is happening. Repoquery resolves it OK, but install still installs the
conda info
repoquery for package
conda install package
|
As of right now, an easy way to reproduce this appears to be starting with a fresh CPython miniforge (on x86_64) and running
which will offer to install
(both pocl and mpi4py depend on libhwloc, but pocl lags a few versions behind in this dependency) |
@conda-forge/core I can reproduce @inducer's observation above. Can someone help us understand why the priority design is now broken and how we should fix it? |
Something rang a bell. I see we use openmpi-feedstock/recipe/meta.yaml, line 132 in bdb136f.
As far as I recall, mamba (which is now the default solver) hates it: https://mamba.readthedocs.io/en/latest/advanced_usage/package_resolution.html
@inducer @folmos-at-orange if you explicitly request
I observed it would not resolve to use the external build. Could you confirm? |
Yeah, I think your hunch is close to correct @leofang. Depending on the requested set of packages, the solver does not weigh down the external builds enough. We already weigh down those builds with a track_feature and a lower build number, so it is somewhat surprising they still bubble up. I did notice that when you specify openmpi in the install command, it pulls openmpi 4 instead of the external openmpi 5. This indicates, I think, that something in the deps of openmpi 5 is preventing the actual openmpi 5 from being installed. Apparently, in the solver's effort to maximize versions, the one with the feature gets sorted ahead of the older version. Maybe we need to add the same deps/pins to the external build as the real build? That might force both variants to be unsatisfiable in this case and hopefully get the solver to pick the lower version of the real package. |
Thanks, @beckermr. I have a naive question: couldn't we just remove the usage of |
Maybe, but I don't think that will fix this issue. The solver always wants a higher version over a higher build number. |
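To make the ordering described in the last few comments concrete, here is a minimal sketch. It is a toy model only, not how libsolv or conda actually score candidates: it assumes purely numeric versions and ignores track_features, dependencies, and channel priority. The `Candidate` type and `rank` helper are hypothetical names introduced for illustration; the build strings are the ones quoted in this thread.

```python
# Toy model of "a higher version beats a higher build number".
# This is NOT the real solver logic; it only illustrates the ordering.
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    version: str
    build: str
    build_number: int


def version_tuple(version: str) -> tuple[int, ...]:
    """Turn '5.0.3' into (5, 0, 3); assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Sort best-first: version dominates, build number only breaks ties."""
    return sorted(
        candidates,
        key=lambda c: (version_tuple(c.version), c.build_number),
        reverse=True,
    )


# Builds taken from the solver output quoted in this thread.
real_5 = Candidate("openmpi", "5.0.3", "h47314c5_102", 102)
external_5 = Candidate("openmpi", "5.0.3", "external_2", 2)
real_4 = Candidate("openmpi", "4.1.6", "hc5af2df_101", 101)

# With every build installable, the real 5.0.3 build wins on build number.
all_ranked = rank([real_4, external_5, real_5])

# If the real 5.0.3 build is ruled out by its dependencies, the external
# 5.0.3 build still outranks the real 4.1.6 build, because version wins.
without_real_5 = rank([real_4, external_5])

print([c.build for c in all_ranked])
print([c.build for c in without_real_5])
```

Under this simplified ordering, lowering the build number of the external variant only helps while a real build of the same version is installable, which is consistent with the behaviour reported above.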
There seems to be a perfectly good solution that does involve openmpi 5 and a recent pocl, found if you force matters a bit:
Detailed package versions found solution
    conda-24.5.0 | py310hff52083_0 939 KB conda-forge
    frozendict-2.4.4 | py310hc51659f_0 48 KB conda-forge
    libclang-cpp15-15.0.7 |default_h127d8a8_5 16.4 MB conda-forge
    libevent-2.1.12 | hf998b51_1 417 KB conda-forge
    libgfortran-ng-13.2.0 | h69a702a_7 24 KB conda-forge
    libgfortran5-13.2.0 | hca663fb_7 1.4 MB conda-forge
    libhwloc-2.10.0 |default_h2fb2949_1000 2.3 MB conda-forge
    libllvm15-15.0.7 | hb3ce162_4 31.8 MB conda-forge
    libllvmspirv15-15.0.0 | h0cdce71_1 1010 KB conda-forge
    libnl-3.9.0 | hd590300_0 716 KB conda-forge
    llvm-spirv-15-15.0.0 | h0cdce71_1 49 KB conda-forge
    mpi-1.0 | openmpi 4 KB conda-forge
    mpi4py-3.1.6 | py310hb2ba3f8_1 533 KB conda-forge
    ocl-icd-2.3.2 | hd590300_1 133 KB conda-forge
    openmpi-5.0.3 | h47314c5_102 14.0 MB conda-forge
    openssl-3.3.0 | h4ab18f5_3 2.8 MB conda-forge
    pocl-5.0 | h03a6ac1_4 13 KB conda-forge
    pocl-core-5.0 | hbf9fd79_4 720 KB conda-forge
    pocl-cpu-5.0 | hea57645_4 24 KB conda-forge
    pocl-cpu-minimal-5.0 | h1b31331_4 13.1 MB conda-forge
    pocl-cuda-5.0 | hd8896d7_4 966 KB conda-forge
    pocl-remote-5.0 | h1b31331_4 53 KB conda-forge
I wonder why that solution would get passed over in favor of lower-version openmpi or external builds thereof. |
Ah that's good to know. The solver might be maximizing more versions of other packages? You'd have to diff the solve results. cc @jaimergp to see if he can help with some insight here. |
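As a rough sketch of what "diff the solve results" could look like, the snippet below compares two abbreviated package lists with Python's standard `difflib`. The entries are a small subset of the builds quoted in this thread; the variable names and the `add-ompi-versioned` label are only illustrative, and a real comparison would use the full output of each solve.

```python
import difflib

# Two abbreviated solve results, using package builds quoted in this thread.
baseline = [
    "libhwloc-1.11.13 h8b7812e_2",
    "mpi4py-3.1.6 py310hb2ba3f8_1",
    "openmpi-5.0.3 external_2",
    "pocl-5.0 h03a6ac1_5",
]
versioned = [
    "libhwloc-2.10.0 default_h2fb2949_1000",
    "mpi4py-3.1.6 py310hb2ba3f8_1",
    "openmpi-5.0.3 h47314c5_102",
    "pocl-5.0 h03a6ac1_4",
]

# unified_diff yields header lines, hunk markers, and +/- prefixed changes.
diff_lines = list(
    difflib.unified_diff(
        baseline,
        versioned,
        fromfile="baseline",
        tofile="add-ompi-versioned",
        lineterm="",
    )
)

# Keep only added/removed package lines, skipping the ---/+++ file headers.
changed = [
    line
    for line in diff_lines
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

print("\n".join(diff_lines))
```

Filtering down to `changed` makes it easier to spot which packages moved together, for example a libhwloc change alongside a pocl build change.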
I can reproduce with -- edit -- All libsolv derivatives get
|
Sure, here are all the package versions:
Baseline:
    $ conda install pocl mpi4py
    conda-24.5.0 | py310hff52083_0 939 KB conda-forge
    frozendict-2.4.4 | py310hc51659f_0 48 KB conda-forge
    libclang-cpp15-15.0.7 |default_h127d8a8_5 16.4 MB conda-forge
    libhwloc-1.11.13 | h8b7812e_2 1.8 MB conda-forge
    libllvm15-15.0.7 | hb3ce162_4 31.8 MB conda-forge
    libllvmspirv15-15.0.0 | h0cdce71_1 1010 KB conda-forge
    llvm-spirv-15-15.0.0 | h0cdce71_1 49 KB conda-forge
    mpi-1.0 | openmpi 4 KB conda-forge
    mpi4py-3.1.6 | py310hb2ba3f8_1 533 KB conda-forge
    ocl-icd-2.3.2 | hd590300_1 133 KB conda-forge
    openmpi-5.0.3 | external_2 13 KB conda-forge
    openssl-3.3.0 | h4ab18f5_3 2.8 MB conda-forge
    pocl-5.0 | h03a6ac1_5 14 KB conda-forge
    pocl-core-5.0 | h1fad545_5 721 KB conda-forge
    pocl-cpu-5.0 | h55a2082_5 24 KB conda-forge
    pocl-cpu-minimal-5.0 | hf9ad923_5 13.1 MB conda-forge
    pocl-cuda-5.0 | hb452e98_5 965 KB conda-forge
    pocl-remote-5.0 | hf9ad923_5 53 KB conda-forge
With explicit OpenMPI:
    conda-24.5.0 | py310hff52083_0 939 KB conda-forge
    frozendict-2.4.4 | py310hc51659f_0 48 KB conda-forge
    libclang-cpp15-15.0.7 |default_h127d8a8_5 16.4 MB conda-forge
    libgfortran-ng-13.2.0 | h69a702a_7 24 KB conda-forge
    libgfortran5-13.2.0 | hca663fb_7 1.4 MB conda-forge
    libhwloc-1.11.13 | h8b7812e_2 1.8 MB conda-forge
    libllvm15-15.0.7 | hb3ce162_4 31.8 MB conda-forge
    libllvmspirv15-15.0.0 | h0cdce71_1 1010 KB conda-forge
    llvm-spirv-15-15.0.0 | h0cdce71_1 49 KB conda-forge
    mpi-1.0 | openmpi 4 KB conda-forge
    mpi4py-3.1.6 | py310h2a790f2_0 532 KB conda-forge
    ocl-icd-2.3.2 | hd590300_1 133 KB conda-forge
    openmpi-4.1.6 | hc5af2df_101 3.9 MB conda-forge
    openssl-3.3.0 | h4ab18f5_3 2.8 MB conda-forge
    pocl-5.0 | h03a6ac1_5 14 KB conda-forge
    pocl-core-5.0 | h1fad545_5 721 KB conda-forge
    pocl-cpu-5.0 | h55a2082_5 24 KB conda-forge
    pocl-cpu-minimal-5.0 | hf9ad923_5 13.1 MB conda-forge
    pocl-cuda-5.0 | hb452e98_5 965 KB conda-forge
    pocl-remote-5.0 | hf9ad923_5 53 KB conda-forge
    zlib-1.2.13 | hd590300_5 91 KB conda-forge
With explicitly versioned openmpi:
    conda-24.5.0 | py310hff52083_0 939 KB conda-forge
    frozendict-2.4.4 | py310hc51659f_0 48 KB conda-forge
    libclang-cpp15-15.0.7 |default_h127d8a8_5 16.4 MB conda-forge
    libevent-2.1.12 | hf998b51_1 417 KB conda-forge
    libgfortran-ng-13.2.0 | h69a702a_7 24 KB conda-forge
    libgfortran5-13.2.0 | hca663fb_7 1.4 MB conda-forge
    libhwloc-2.10.0 |default_h2fb2949_1000 2.3 MB conda-forge
    libllvm15-15.0.7 | hb3ce162_4 31.8 MB conda-forge
    libllvmspirv15-15.0.0 | h0cdce71_1 1010 KB conda-forge
    libnl-3.9.0 | hd590300_0 716 KB conda-forge
    llvm-spirv-15-15.0.0 | h0cdce71_1 49 KB conda-forge
    mpi-1.0 | openmpi 4 KB conda-forge
    mpi4py-3.1.6 | py310hb2ba3f8_1 533 KB conda-forge
    ocl-icd-2.3.2 | hd590300_1 133 KB conda-forge
    openmpi-5.0.3 | h47314c5_102 14.0 MB conda-forge
    openssl-3.3.0 | h4ab18f5_3 2.8 MB conda-forge
    pocl-5.0 | h03a6ac1_4 13 KB conda-forge
    pocl-core-5.0 | hbf9fd79_4 720 KB conda-forge
    pocl-cpu-5.0 | hea57645_4 24 KB conda-forge
    pocl-cpu-minimal-5.0 | h1b31331_4 13.1 MB conda-forge
    pocl-cuda-5.0 | hd8896d7_4 966 KB conda-forge
    pocl-remote-5.0 | h1b31331_4 53 KB conda-forge
And here are the diffs:
--- baseline 2024-05-29 10:44:36.884179263 -0500
+++ add-ompi 2024-05-29 10:45:14.231089401 -0500
@@ -10,6 +10,7 @@
added / updated specs:
- mpi4py
+ - openmpi
- pocl
@@ -20,14 +21,16 @@
conda-24.5.0 | py310hff52083_0 939 KB conda-forge
frozendict-2.4.4 | py310hc51659f_0 48 KB conda-forge
libclang-cpp15-15.0.7 |default_h127d8a8_5 16.4 MB conda-forge
+ libgfortran-ng-13.2.0 | h69a702a_7 24 KB conda-forge
+ libgfortran5-13.2.0 | hca663fb_7 1.4 MB conda-forge
libhwloc-1.11.13 | h8b7812e_2 1.8 MB conda-forge
libllvm15-15.0.7 | hb3ce162_4 31.8 MB conda-forge
libllvmspirv15-15.0.0 | h0cdce71_1 1010 KB conda-forge
llvm-spirv-15-15.0.0 | h0cdce71_1 49 KB conda-forge
mpi-1.0 | openmpi 4 KB conda-forge
- mpi4py-3.1.6 | py310hb2ba3f8_1 533 KB conda-forge
+ mpi4py-3.1.6 | py310h2a790f2_0 532 KB conda-forge
ocl-icd-2.3.2 | hd590300_1 133 KB conda-forge
- openmpi-5.0.3 | external_2 13 KB conda-forge
+ openmpi-4.1.6 | hc5af2df_101 3.9 MB conda-forge
openssl-3.3.0 | h4ab18f5_3 2.8 MB conda-forge
pocl-5.0 | h03a6ac1_5 14 KB conda-forge
pocl-core-5.0 | h1fad545_5 721 KB conda-forge
@@ -35,27 +38,31 @@
pocl-cpu-minimal-5.0 | hf9ad923_5 13.1 MB conda-forge
pocl-cuda-5.0 | hb452e98_5 965 KB conda-forge
pocl-remote-5.0 | hf9ad923_5 53 KB conda-forge
+ zlib-1.2.13 | hd590300_5 91 KB conda-forge
------------------------------------------------------------
- Total: 70.2 MB
+ Total: 75.6 MB
and
--- baseline 2024-05-29 10:44:36.884179263 -0500
+++ add-ompi-versioned 2024-05-29 10:46:41.716373658 -0500
@@ -20,42 +21,50 @@
conda-24.5.0 | py310hff52083_0 939 KB conda-forge
frozendict-2.4.4 | py310hc51659f_0 48 KB conda-forge
libclang-cpp15-15.0.7 |default_h127d8a8_5 16.4 MB conda-forge
- libhwloc-1.11.13 | h8b7812e_2 1.8 MB conda-forge
+ libevent-2.1.12 | hf998b51_1 417 KB conda-forge
+ libgfortran-ng-13.2.0 | h69a702a_7 24 KB conda-forge
+ libgfortran5-13.2.0 | hca663fb_7 1.4 MB conda-forge
+ libhwloc-2.10.0 |default_h2fb2949_1000 2.3 MB conda-forge
libllvm15-15.0.7 | hb3ce162_4 31.8 MB conda-forge
libllvmspirv15-15.0.0 | h0cdce71_1 1010 KB conda-forge
+ libnl-3.9.0 | hd590300_0 716 KB conda-forge
llvm-spirv-15-15.0.0 | h0cdce71_1 49 KB conda-forge
mpi-1.0 | openmpi 4 KB conda-forge
mpi4py-3.1.6 | py310hb2ba3f8_1 533 KB conda-forge
ocl-icd-2.3.2 | hd590300_1 133 KB conda-forge
- openmpi-5.0.3 | external_2 13 KB conda-forge
+ openmpi-5.0.3 | h47314c5_102 14.0 MB conda-forge
openssl-3.3.0 | h4ab18f5_3 2.8 MB conda-forge
- pocl-5.0 | h03a6ac1_5 14 KB conda-forge
- pocl-core-5.0 | h1fad545_5 721 KB conda-forge
- pocl-cpu-5.0 | h55a2082_5 24 KB conda-forge
- pocl-cpu-minimal-5.0 | hf9ad923_5 13.1 MB conda-forge
- pocl-cuda-5.0 | hb452e98_5 965 KB conda-forge
- pocl-remote-5.0 | hf9ad923_5 53 KB conda-forge
+ pocl-5.0 | h03a6ac1_4 13 KB conda-forge
+ pocl-core-5.0 | hbf9fd79_4 720 KB conda-forge
+ pocl-cpu-5.0 | hea57645_4 24 KB conda-forge
+ pocl-cpu-minimal-5.0 | h1b31331_4 13.1 MB conda-forge
+ pocl-cuda-5.0 | hd8896d7_4 966 KB conda-forge
+ pocl-remote-5.0 | h1b31331_4 53 KB conda-forge
------------------------------------------------------------
- Total: 70.2 MB
+ Total: 87.2 MB
|
pocl has a build number drop here - that seems relevant |
cc @matthiasdiener: do you recall what the story was with pocl build numbers vs hwloc versions? |
Hm, the Edit: adding |
A couple realizations now:
|
One problem I see in the verbose logs is that the mpi4py builds are selected not by their dependency tree but by their timestamp, so the selection might be completely arbitrary, depending on when they were uploaded. For
We have to select between:
Who won? The mpich variant, because it was uploaded last. It then proceeds with the mpich variants:
And correctly selects For
In this case we have:
This time
The non-external variants were never given a chance 🤔 |
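The timestamp behaviour described in this comment can be illustrated with a very small sketch. Everything in it is hypothetical: the `Variant` type, the `pick_by_timestamp` helper, and the timestamp values are invented for illustration and do not come from the solver logs, which are not reproduced here. The point is only that, when nothing else distinguishes two variants, choosing the most recent upload makes the outcome depend on upload order.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    mpi: str        # MPI provider this mpi4py build is linked against
    timestamp: int  # upload time; all values here are invented


def pick_by_timestamp(variants):
    """Return the most recently uploaded variant, ignoring everything else."""
    return max(variants, key=lambda v: v.timestamp)


# Hypothetical upload times: mpich happened to be uploaded last.
first_upload_order = [
    Variant("openmpi", 1_000),
    Variant("mpich", 2_000),
]

# Same builds, but imagine the openmpi variant had been re-uploaded later.
second_upload_order = [
    Variant("openmpi", 3_000),
    Variant("mpich", 2_000),
]

winner_first = pick_by_timestamp(first_upload_order)
winner_second = pick_by_timestamp(second_upload_order)
print(winner_first.mpi, winner_second.mpi)
```

Nothing about the packages themselves changed between the two cases; only the upload order did, which is the arbitrariness the comment is concerned about.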
I will add that pixi does pick the correct openmpi (non-external) if you specify a This
[project]
name = "openmpi-external"
version = "0.1.0"
description = "Add a short description here"
authors = ["jaimergp <jaimergp@users.noreply.github.com>"]
channels = ["conda-forge"]
platforms = ["linux-64"]
[tasks]
[dependencies]
pocl = "*"
mpi4py = "*"
python = "3.10.*" # also ok with "3.11.*" and "3.12.*"
This does not:
[project]
name = "openmpi-external"
version = "0.1.0"
description = "Add a short description here"
authors = ["jaimergp <jaimergp@users.noreply.github.com>"]
channels = ["conda-forge"]
platforms = ["linux-64"]
[tasks]
[dependencies]
pocl = "*"
mpi4py = "*"
# you can add python = "3.8" and python = "3.9" |
After all, I think it does boil down to libhwloc=2 vs v1. I couldn't see it before because I was not setting my virtual GLIBC to 2.17. So to sum up:
So I'm assuming that if |
Since |
Looking at it... there's some non-standard config in those .ci_support files. |
From what I can tell, build 5 is the (currently) intended hwloc configuration (i.e., building against hwloc1.11.13 and hwloc2.10.0). I'm not sure we really need that hwloc1 build though. cc @isuruf |
Ah, there are two builds, ok. I'm not sure why only forcing libhwloc=2 + glibc=2.17 works then. The pixi logs do have a bit more detail about how and why packages are selected or discarded, but I could not find anything conclusive. |
The external build has way fewer dependencies. This is likely causing large-scale changes in how the solver inspects branches? |
POCL overrides the docker images (to outdated CUDA versions). I don't know why that would have been desirable, but those images should move on (or just use the CUDA 12 components in-recipe and drop the override). |
Alright. All of the PRs have been merged here. It will take roughly 1-2 hours for the infrastructure to sync everything back up. Once it does, let's try again and see what happens. One possible side effect is that the solver will just pick the openmpi 4 external builds. We can deal with that if it happens via some complicated and ugly repodata patching. |
This appears to be fixed, in that we now get Intel MPI, which is a real MPI
|
As I predicted, the solver now defaults to version 4 for openmpi
This is OK since it is a real build and a valid solution. |
If I correctly supply the 2.17 virtual package, this works
|
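The exact command used in the comment above is not preserved here. As a hedged sketch of one way to supply a 2.17 virtual glibc to conda, the snippet below sets the `CONDA_OVERRIDE_GLIBC` environment variable and assembles a dry-run install command. It is an assumption that this is what was meant; the package specs (`pocl`, `mpi4py`, `libhwloc=2`) are taken from elsewhere in this thread, and the helper names are hypothetical. The command is only constructed and printed, not executed.

```python
import os
import shlex


def glibc_override_env(version="2.17", base=None):
    """Copy an environment mapping and pin conda's virtual __glibc version."""
    env = dict(os.environ if base is None else base)
    env["CONDA_OVERRIDE_GLIBC"] = version
    return env


def dry_run_command(specs):
    """Build a conda install command that solves without changing anything."""
    return ["conda", "install", "--dry-run", "--channel", "conda-forge", *specs]


# Specs assumed from the discussion: the two packages plus a libhwloc 2 pin.
cmd = dry_run_command(["pocl", "mpi4py", "libhwloc=2"])
env = glibc_override_env("2.17", base={})

# Print a shell-style line for copy/paste instead of running the solver here.
print(f"CONDA_OVERRIDE_GLIBC={env['CONDA_OVERRIDE_GLIBC']} {shlex.join(cmd)}")
```

To actually run it, `cmd` and `glibc_override_env("2.17")` could be passed to `subprocess.run`, which requires a working conda installation and network access.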
We appear to have solved the immediate issue, but I am still not 100% happy here. I suspect the solver is also trying to minimize the size of environments and that somehow this happens at the expense of the openmpi version. Does that make any sense @jaimergp? |
Yes, could be. In I'm concerned about the lack of criteria to select one or other mpi variant and that it's mostly based on timestamps and their arbitrary upload time. See #153 (comment). |
Yeah we don't prioritize any of the various mpi providers. That is a separate issue. |
@minrk @jaimergp I have a draft plugins package here: https://github.com/regro/conda-forge-conda-plugins If you all can try it out and see how it works, that'd be great. If we are happy with it, we can start shipping it in conda-forge and making the external variants use the packages. |
It turns out the build number drop observed here is important, and I'm trying to figure out how to prevent it. Arguably, that's a separate issue, so I filed a separate issue: |
Comment:
I'm building a package depending on openmpi. I pinned the run dependency version to 4.1.6 (which is on conda-forge). However, when installing the package, it sometimes resolves to the external build. As a consequence, when I install the package it doesn't work, because the external build doesn't install any binaries (as expected).
Any idea on why this happens?
I might want to pin to openmpi=4.1.6=h* but that would defeat the purpose of having external builds.
Thanks in advance
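To illustrate why a pin like openmpi=4.1.6=h* would exclude the external variant, here is a minimal sketch using Python's `fnmatch` as a stand-in for conda's build-string globbing. That is an approximation: conda has its own matching code, and `fnmatch` supports a few extra wildcards. The real build string is the 4.1.6 one quoted in this thread; the external build string is the one shown for 5.0.3, and it is an assumption that the 4.1.6 external build is named similarly. `build_matches` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def build_matches(pattern, build_string):
    """Approximate a build-string glob match (case-sensitive)."""
    return fnmatchcase(build_string, pattern)


real_build = "hc5af2df_101"    # openmpi 4.1.6 build seen in this thread
external_build = "external_2"  # seen for 5.0.3; assumed similar for 4.1.6

pattern = "h*"  # the build part of the proposed pin openmpi=4.1.6=h*
selected = [
    build
    for build in (real_build, external_build)
    if build_matches(pattern, build)
]

print(selected)
```

The pin works because hashed build strings start with "h" while the external one does not, which is also why it would defeat the purpose of the external builds: they could never be selected for that dependency.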