Integrate Enzyme into nightly rustc #611

Closed
3 tasks done
ZuseZ4 opened this issue Apr 12, 2023 · 3 comments
Labels
  • major-change: A proposal to make a major change to rustc
  • major-change-accepted: A major change proposal that was accepted
  • T-compiler: Add this label so rfcbot knows to poll the compiler team

Comments

ZuseZ4 commented Apr 12, 2023

Proposal

Enzyme is an LLVM incubator project that is able to vectorize and differentiate (in the calculus sense) a given function. Computing gradients efficiently is necessary for various algorithms such as backpropagation, Bayesian inference, uncertainty quantification, and probabilistic programming.
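To make "differentiate (in the calculus sense)" concrete, here is a minimal hand-written sketch of forward-mode AD using dual numbers. It is not Enzyme's interface and not how Enzyme works internally (Enzyme operates on LLVM-IR); the names `Dual`, `f`, and `derivative_at` are hypothetical and exist only for this illustration.

```rust
use std::ops::{Add, Mul};

// Hypothetical illustration only: a dual number carries a value together
// with the derivative of that value with respect to the input.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Dual {
    val: f64,
    der: f64,
}

impl Add for Dual {
    type Output = Dual;
    // Sum rule: (u + v)' = u' + v'
    fn add(self, rhs: Dual) -> Dual {
        Dual { val: self.val + rhs.val, der: self.der + rhs.der }
    }
}

impl Mul for Dual {
    type Output = Dual;
    // Product rule: (u * v)' = u' * v + u * v'
    fn mul(self, rhs: Dual) -> Dual {
        Dual {
            val: self.val * rhs.val,
            der: self.der * rhs.val + self.val * rhs.der,
        }
    }
}

// The function to differentiate: f(x) = x^3 + x
fn f(x: Dual) -> Dual {
    x * x * x + x
}

// Seed the input derivative with 1.0 and read the derivative of the output.
fn derivative_at(x: f64) -> f64 {
    f(Dual { val: x, der: 1.0 }).der
}
```

Calling `derivative_at(2.0)` evaluates f'(x) = 3x^2 + 1 at x = 2. A library-level approach like this one only sees the source program, which is the kind of pre-optimization AD that the proposal contrasts Enzyme with.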

This work should be integrated into the Rust compiler because it has been shown both in theory and in practice (PyTorch 2.0, Jax/XLA, ...) that Automatic Differentiation (AD) is hard to do efficiently on unoptimized source code, such as Rust source. Enzyme has shown a geometric mean speedup of 4.5x over mature pre-optimization AD tools. It has also been used successfully in combination with parallel paradigms like CUDA, ROCm, MPI and OpenMP, and it supports other languages, e.g., Julia, Fortran, compiled Python and C++. Handling LLVM-IR generated by such a variety of compilers helps to recognize and fix bugs early that would otherwise have been hard to discover using only Rust test cases.

The latest proof-of-concept rust-enzyme has demonstrated that our approach is mature enough and is able to solve all fundamental issues of earlier iterations. We, therefore, propose to conditionally enable Enzyme for nightly releases to allow more Rust developers to test the current state and provide valuable feedback.

We intend to stabilize this work at some point because we believe that good AD and vectorization support is a fundamental building block of scientific computing and machine learning. In our opinion, working around the lack of such a tool has consequences similar to working around the lack of async, asm, or simd support for the corresponding groups. User-written implementations would either lack performance and test coverage as explained above, need to re-implement various LLVM/Enzyme optimizations, or transpile the user code from Rust into other languages that have AD support. We therefore hope to support existing projects and maintainers and to lower the barrier for new contributors in this field.
We intend to write a full RFC more than a year from now, once enough users have provided feedback on the Rust frontend of our work.

FAQ

Let us answer some of the questions that came up in the past. Please feel free to ask anything else that you are wondering about! Note that we deliberately avoid code examples of our current interface: it is subject to change, and once it is closer to stabilization it will be covered by a follow-up RFC.

  1. Enzyme has previously followed LLVM versions based on user requests. To become a stable Rust feature, Enzyme would instead need to follow the LLVM development tip more closely.

    Yes, we have three solutions here. First, we are working on increasing Tablegen usage in our code base to simplify upgrades. Second, we will cfg-gate our work such that it can be enabled only for specific nightlies and will be disabled by default after every LLVM upgrade. Once confidence increases, we can switch the default to always build Enzyme, except for the increasingly rare occasions where Enzyme breaks on an LLVM upgrade. Third, we won't ask for integration into a stable release without also being integrated into LLVM proper.

  2. How about the technical debt of Enzyme? Do you break any LLVM assumptions?

    We support opaque pointers. We have some repetitive schemes due to the nature of AD, but we are working on reducing them through Tablegen. We are not aware of making any illegal assumptions. We do have unnecessarily long compile times due to suboptimal data structures used for the Type Analysis of variables, but this should be solved before we merge Enzyme into rustc.

  3. How about unimplemented / unfinished features?

    We currently require fat-lto if people differentiate through structures or functions that are implemented in a third-party crate. We have a custom-derivative interface that could be used to remove this requirement at some performance cost, which seems reasonable. Manuel will start experimenting with this next week on the Julia/Enzyme side; the Rust side should be a simpler subset because it does not support JIT compilation.
    We currently do not support all enums/unions reliably. There are a few possible solutions to this which we want to experiment with.
    We support all other Rust/C types and generics, but not vtables (dyn Trait). Enzyme itself does support vtables, but we have not yet decided on a "rusty" way to expose this feature. We are looking for user feedback here.

  4. How about Generics and Constexpr?

    We support both as part of functions being differentiated. However, we do not allow differentiating with respect to a generic parameter. Since those are usually integers describing an array size or a similar concept, it is also unclear how meaningful such a gradient would be.

  5. So you support rustc_codegen_llvm. How about other codegen backends?

    By its nature, Enzyme cannot support non-LLVM backends. However, we add our code to rustc in a generic way, such that a team interested in AD for cg_gcc or cg_cranelift could add support for additional backends. We believe that for the next few years tools like Burn or dfdx could fill the gap, either by using PyTorch as a backend or by offering a simplified AD tool at the Rust language level.

  6. How much code will you add to rustc and who will maintain it?

    We (Lorenz and Manuel) will maintain the Rust frontend of Enzyme, and we are open to accepting additional contributors. GSoC LLVM/Enzyme devs over the last years also tended to be more interested in the Rust frontend than in other languages, likely due to the general popularity of Rust. We expect to add only around 2k LoC for implementing an #[autodiff(..)] and a #[vectorize(..)] macro, parsing and validating input, and forwarding the corresponding LLVM-IR to Enzyme. Tests and documentation will therefore probably be the largest part of our PR, on top of the actual implementation. Once implemented, we expect bugfixes to happen on the LLVM/Enzyme side; the Rust part should change only rarely, due to macro design adjustments.
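As an illustration of FAQ item 4, the following sketch shows a function with a const generic parameter together with a hand-written gradient. It does not use the proposed macros, whose interface this proposal deliberately leaves open; `sum_of_squares` and `sum_of_squares_grad` are hypothetical names chosen for this example.

```rust
// Hypothetical example: N is a const generic describing the array length.
// The function can be differentiated with respect to the entries of `x`,
// but N is an integer, so a derivative with respect to N is not meaningful.
fn sum_of_squares<const N: usize>(x: &[f64; N]) -> f64 {
    x.iter().map(|v| v * v).sum()
}

// Hand-written gradient with respect to `x`: d/dx_i (sum_j x_j^2) = 2 * x_i.
// An AD tool would be expected to generate an equivalent of this function.
fn sum_of_squares_grad<const N: usize>(x: &[f64; N]) -> [f64; N] {
    let mut grad = [0.0; N];
    for (g, xi) in grad.iter_mut().zip(x.iter()) {
        *g = 2.0 * xi;
    }
    grad
}
```

The gradient has one entry per element of `x`, and its length is fixed by the same `N`, so the generic parameter shapes the result without being a differentiation variable itself.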

Mentors or Reviewers

We (Lorenz, Manuel) intend to upgrade our existing PoC to the latest Rust nightly and implement all requested changes ourselves.

Acknowledgments

The first PoC was implemented by tiberiusferreira. The next iteration was supported by Chuyang Chen. Part of this work was supported by Prof. Hartwig Anzt as well as Google/LLVM through Google Summer of Code projects. For our final approach, moving Enzyme into a rustc fork, we would like to thank various rustc developers for answering our questions on Zulip, especially bjorn3, who was very helpful in guiding us through rustc. Finally, we would like to thank the other Enzyme developers, especially William Moses, for their ongoing support.

Process

The main points of the Major Change Process are as follows:

  • File an issue describing the proposal.
  • A compiler team member or contributor who is knowledgeable in the area can second by writing @rustbot second.
    • Finding a "second" suffices for internal changes. If, however, you are proposing a new public-facing feature, such as a -C flag, then full team check-off is required.
    • Compiler team members can initiate a check-off via @rfcbot fcp merge on either the MCP or the PR.
  • Once an MCP is seconded, the Final Comment Period begins. If no objections are raised after 10 days, the MCP is considered approved.

You can read more about Major Change Proposals on forge.

Comments

This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.

ZuseZ4 added the major-change and T-compiler labels Apr 12, 2023
rustbot (Collaborator) commented Apr 12, 2023

This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.

cc @rust-lang/compiler @rust-lang/compiler-contributors

rustbot added the to-announce label ("Announce this issue on triage meeting") Apr 12, 2023
oli-obk (Contributor) commented Apr 13, 2023

@rustbot second

rustbot added the final-comment-period label ("The FCP has started, most (if not all) team members are in agreement") Apr 13, 2023
apiraino removed the to-announce label Apr 13, 2023
apiraino (Contributor) commented

@rustbot label -final-comment-period +major-change-accepted

rustbot added the major-change-accepted and to-announce labels and removed the final-comment-period label Apr 25, 2023
apiraino removed the to-announce label May 25, 2023
4 participants