Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add full CI for wholegraph #58

Open
wants to merge 23 commits into
base: branch-24.12
Choose a base branch
from

Conversation

jameslamb
Copy link
Member

@jameslamb jameslamb commented Oct 17, 2024

This PR introduces full end-to-end CI for wholegraph.

Notes for Reviewers

Most of these changes were pulled from a mix of the following:

@jameslamb jameslamb added improvement Improves an existing functionality non-breaking Introduces a non-breaking change DO NOT MERGE Hold off on merging; see PR for details labels Oct 17, 2024
@jameslamb jameslamb removed the DO NOT MERGE Hold off on merging; see PR for details label Oct 18, 2024
@@ -0,0 +1,74 @@
name: build
Copy link
Member Author

@jameslamb jameslamb Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This workflow is the one that'll publish conda packages to rapidsai-nightly and wheels to https://pypi.anaconda.org/rapidsai-wheels-nightly/simple/.

I added this because it's there in #53 but @alexbarghi-nv ... are you all ready for that? If this PR is merged as-is, then both this repo and https://github.com/rapidsai/wholegraph will be publishing packages with the same names, and with an overlapping set of versions.

That's especially problematic because the wholegraph code here is out of sync with the wholegraph repo. For example, I saw some CI builds here picking up pylibwholegraph packages from the rapidsai-nightly channel instead of the artifacts from CI, resulting in failures like this:

pylibwholegraph           24.12.00a9      cuda12_py312_241018_g0efba33_9    rapidsai-nightly
...
AttributeError: module 'pylibwholegraph.binding.wholememory_binding' has no attribute 'determine_partition_plan'

(build link)

Because the changes from rapidsai/wholegraph#194 appear not to have been ported over here.

I see 2 options:

  1. run nightly / branch builds in this repo, but don't turn on publishing yet
  2. merge this PR as-is, and then:
    • bring the wholegraph code in this repo up to date with https://github.com/rapidsai/wholegraph
    • move all wholegraph development to this repo (but keep the wholegraph repo un-archived for now, in case we need to produce 24.10 hotfixes)
    • have RAPIDS ops mark all other 24.12 pylibwholegraph and wholegraph conda packages on rapidsai-nightly as broken

I think you should pursue Option 2, but you have a lot more context than I do about the plan for this repo and these libraries.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want Option 2. I'll reach out to the WholeGraph team.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we should hold off on a decision. I forgot there's another discussion going on right now that needs to be completed first.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should hold off on a decision

No problem.

What about removing the nightly workflows entirely (build.yaml and test.yaml) for now, and just limiting the scope of this work to getting CI fully working on PRs?

That'd allow us to keep making progress on setting up CI here, and as long as nothing's being published from here yet, it won't cause any conflicts for anyone depending on the packages being produced from the wholegraph / cugraph repos.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine for now. I'll let you know what happens with the other discussion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alright thanks! I just push a commit removing those nightly workflows. I'll put this up for review.

"wheel",
"cmake>=3.26.4,!=3.30.0",
"cython>=3.0.0",
"ninja",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't worry about these changes for build dependencies of the cugraph-dgl and cugraph-pyg wheels... we're not building wheels yet for cugraph-pyg / cugraph-dgl on this repo yet anyway.

In the next PR after this, I'll update those libraries to rapids-build-backend and make the appropriate splits in dependencies.yaml.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

@jameslamb jameslamb changed the title WIP: [DO NOT MERGE] add full CI for wholegraph add full CI for wholegraph Oct 18, 2024
@jameslamb jameslamb marked this pull request as ready for review October 18, 2024 17:23
@jameslamb jameslamb requested review from a team as code owners October 18, 2024 17:23
Copy link
Member

@alexbarghi-nv alexbarghi-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improves an existing functionality non-breaking Introduces a non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants