Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove cuda-python and cudatoolkit from rapids metapackage. #695

Merged
merged 1 commit into from
Jan 30, 2024

Conversation

bdice
Copy link
Contributor

@bdice bdice commented Jan 30, 2024

We observed that some CUDA packages were being pulled from the nvidia channel here: https://github.com/rapidsai/docker/actions/runs/7709797316/job/21011757079#step:9:552

I found that removing cuda-python changes the solver behavior in a way that satisfies our needs.

I also removed cudatoolkit because that is a transitive dependency of RAPIDS libraries, and only needed for CUDA 11. It is not needed at the top level rapids metapackage. Like cuda-python, it is constrained by cuda-version, so I think it's better to cut it.

Details:

The explicit dependency rapids -> cuda-python is forcing the solver to prefer newer cuda-python versions, which in turn makes cuda-version=12.0 ineffective at constraining cuda-cudart. The solver prefers to take cuda-python 12.3 from conda-forge, but it also has to have cuda-version 12.0 per our constraints (cuda-python 12.3 requires cuda-cudart>=12.3.101,<13.0a0 and cuda-version>=12.0,<13.0a0). So then we get nvidia channel packages (which are not constrained by cuda-version) of cuda-cudart 12.3 to satisfy cuda-python 12.3 rather than install cuda-python 12.0 and get cuda-cudart from conda-forge.

@bdice bdice requested a review from a team as a code owner January 30, 2024 22:08
@@ -29,12 +29,6 @@ requirements:
- cuda-version ={{ cuda_version }}
run:
- {{ pin_compatible('cuda-version', max_pin='x', min_pin='x') }}
{% if cuda_major == "11" %}
- cuda-python {{ cuda11_cuda_python_version }}
- cudatoolkit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checking... was removing cudatoolkit for CUDA 11.x packages also intentional here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw this in the description:

I also removed cudatoolkit because that is a transitive dependency of RAPIDS libraries, and only needed for CUDA 11

But this is in a block that's CUDA-11-specific.

Copy link
Contributor Author

@bdice bdice Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it was intentional. We can revert that change if we want to keep this PR narrower, but it falls into the same bucket of "don't put packages in the rapids metapackage if we expect them to be supplied transitively by RAPIDS libraries, because it has the potential to make the solver do bad things with its priorities."

There are other packages that also fall in this category, like ptxcompiler (which is a hard dependency of cudf's CUDA 11 package).

The rapids metapackage (imo) should be only RAPIDS libraries, plus any other constraints needed to make the environment simpler for conda/mamba to resolve or more reproducible. If others agree, we could remove things like numba, numpy, etc. and rely on RAPIDS libraries to pin their own compatible versions of those. We certainly don't want the rapids metapackage to make the pinnings for packages like those any tighter than what RAPIDS libraries collectively support.

Changes of that scale would require careful testing, and wouldn't be suitable for 24.02.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok thank you for that explanation! Just wanted to double-check that this was intentional, and that I understood it fully.

@raydouglass
Copy link
Member

/merge

@rapids-bot rapids-bot bot merged commit 9897d67 into rapidsai:branch-24.02 Jan 30, 2024
11 checks passed
rapids-bot bot pushed a commit to rapidsai/docker that referenced this pull request Feb 1, 2024
Fixes attempted in rapidsai/integration#695 & rapidsai/integration#697 don't play well with CEC at large, but this repo can pin `cuda-python` more restrictively.

This PR pins `cuda-python` to the CUDA `major.minor.*` version.

See also conda-forge/cuda-python-feedstock#66

Authors:
  - Ray Douglass (https://github.com/raydouglass)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Jake Awe (https://github.com/AyodeAwe)

URL: #624
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants