From 4c3a35a85293ab83c433bda387876709d1295920 Mon Sep 17 00:00:00 2001 From: William Moses Date: Sun, 25 Aug 2024 18:27:41 -0500 Subject: [PATCH] Add additional resources --- src/ecosystem.md | 14 ++++++++++---- src/future_work.md | 12 ++++++------ src/limitations.md | 2 +- src/other_Frontends.md | 5 ++--- 4 files changed, 19 insertions(+), 14 deletions(-) diff --git a/src/ecosystem.md b/src/ecosystem.md index 4dc8886..65492de 100644 --- a/src/ecosystem.md +++ b/src/ecosystem.md @@ -1,13 +1,19 @@ # History and ecosystem -Enzyme started as a PhD project of William Moses and Valentin Churavy, that was able to differentiate the LLVM-IR generated by a subset of C and Julia. It has since been extended by frontends for additional languages. Enzyme is an LLVM Incubator projects and intends to ask for upstreaming later in 2024. +Enzyme started as a project created by William Moses and Valentin Churavy to differentiate LLVM-IR, and thereby code from any language with an LLVM frontend, such as C, Julia, Swift, and Fortran. Operating within the compiler enables Enzyme to interoperate with optimizations, allowing for higher performance than conventional methods without needing special handling for each language and construct. Enzyme is an LLVM Incubator project and intends to ask for upstreaming later in 2024. + +In 2020, initial investigations into using Enzyme with Rust were led by Tiberius Ferreria and William Moses through the use of foreign function calls (https://internals.rust-lang.org/t/automatic-differentiation-differential-programming-via-llvm/13188/7). + +In 2021, Manuel Drehwald and Lorenz Schmidt worked on [Oxide-Enzyme](https://github.com/EnzymeAD/oxide-enzyme), which aimed to directly integrate Enzyme as a compiler-aware cargo plugin. + +The current [Rust-Enzyme](https://github.com/EnzymeAD/rust) project directly embeds Enzyme into Rust and provides autodiff macros for easy usage. 
The project is led by Manuel Drehwald, in collaboration with Jed Brown, William Moses, Lorenz Schmidt, Ningning Xie, and Rodrigo Vargas-Hernandez. ## Development of a Rust-Enzyme frontend We hope that as part of the nightly releases Rust-Enzyme can mature relatively fast because: -1) Unlike Enzyme.jl, Rust won't encounter bugs based on Garbage Collection, JIT, or Type Unstable code. -2) Unlike Clang, we do ship the source code for the standard library. On the Rust side, we therefore don't need to manually add support for functions like [`_ZSt18_Rb_tree_decrementPKSt18_Rb_tree_node_base`](https://github.com/EnzymeAD/Enzyme/pull/764/files#diff-33703e707eb3c80e460e135bec72264fd2380201070a2959c6755bb26c72a504R190). +1) Unlike Julia, Rust does not emit code involving Garbage Collection, JIT, or Type Unstable code, which simplifies the inputs to Enzyme (and reduces the need to develop support for such mechanisms, which have since been added to Enzyme.jl). +2) Unlike Clang, we do ship the source code for the standard library. On the Rust side, we therefore don't need to manually add support for libstdc++ functions like [`std::map decrement`](https://github.com/EnzymeAD/Enzyme/pull/764/files#diff-33703e707eb3c80e460e135bec72264fd2380201070a2959c6755bb26c72a504R190). 3) Minimizing Rust code is reasonably nice and Cargo/crates.io makes it easy to reproduce bugs. @@ -15,7 +21,7 @@ We hope that as part of the nightly releases Rust-Enzyme can mature relatively f The key aspect for the performance of our solution is that AD is performed after compiler optimizations have been applied (and is able to run additional optimizations). This observation is mostly language independent and motivated in the -first Enzyme paper (covering C/C++/Julia), and also mentioned towards the end of this non-Enzyme java autodiff [case-study](https://github.com/openjdk/babylon-docs/blob/master/site/articles/auto-diff.md). 
+[2020 Enzyme NeurIPS paper](https://proceedings.neurips.cc/paper/2020/file/9332c513ef44b682e9347822c2e457ac-Paper.pdf), and also mentioned towards the end of this non-Enzyme Java autodiff [case-study](https://github.com/openjdk/babylon-docs/blob/master/site/articles/auto-diff.md). ### Wrapping cargo instead of modifying rustc diff --git a/src/future_work.md b/src/future_work.md index 5110719..fce9305 100644 --- a/src/future_work.md +++ b/src/future_work.md @@ -2,9 +2,9 @@ ### Parallelism: -Enzyme currently does not handle Rust parallelism (rayon). -Enzyme does (partly) support various parallel paradigms: OpenMP, MPI, CUDA, Rocm, Julia tasks. -Enzyme only does need to support the lowest level of parallelism for each language, +Enzyme can efficiently differentiate parallel code. Enzyme's ability to combine differentiation with optimization (including parallel optimization) enables orders of magnitude improvements in the performance and [scaling of parallel code](https://ieeexplore.ieee.org/document/10046093). Each parallel framework only needs to provide Enzyme with lightweight markers describing where the parallelism is created (e.g. a parallel for or a spawn/sync). Such markers have been added for various parallel paradigms, including CUDA, ROCm, OpenMP, MPI, Julia tasks, and RAJA. + +Such markers have not yet been added for Rust parallel libraries (e.g. rayon). Enzyme only needs to support the lowest level of parallelism for each language, so adding support for rayon should cover most cases. We assume 20-200 lines of code in Enzyme core should be sufficient, making it a nice task to get started. [rsmpi](https://github.com/rsmpi/rsmpi) (Rust wrapper for MPI) should already work, but it would be good to test. @@ -36,14 +36,14 @@ Please let us know if you have an application that can benefit from a custom all otherwise this likely won't be implemented in the forseeable future. 
### Checkpointing: -While Enzyme is very fast due to running optimizations before AD, we don't explore all the classical AutoDiff tricks yet. Namely we do miss support for adjusting checkpointing decisions, which describes the question of whether we want to cache or recompute values needed for the gradient computations. It generally lies in NP to find the optimal balance for each given program, but there are good approximations. You can think of it in terms of custom allocators. Replacing the algorithm might affect your runtime performance, but does not affect the result of your function calls. In the future it might be interesting to let the user interact with checkpointing. +Enzyme is very fast due to running optimizations before AD, and it already includes various partial checkpointing algorithms, such as a [min-cut algorithm](https://dl.acm.org/doi/abs/10.1145/3458817.3476165). The ability to control checkpointing (e.g. whether to recompute or store a value) has not yet been added to Rust. Finding the optimal balance for each given program generally lies in NP, but there are good approximations. You can think of it in terms of custom allocators. Replacing the algorithm might affect your runtime performance, but does not affect the result of your function calls. In the future it might be interesting to let the user interact with checkpointing. ### Supporting other Codegen backends: -Enzyme core consists of ~50k LoC. Most of the rules around generating derivatives for instructions are written in LLVM Tablegen.td declarations and as such it should be relatively easy to port them. Enzyme core also includes various experimental features which we don't need on the Rust side, an implementation for another codegen backend could therefore also end up a bit smaller. +Enzyme consists of ~50k LoC. Most of the rules around generating derivatives for instructions are written in LLVM TableGen `.td` declarations, so it should be relatively easy to port them. 
Enzyme also includes various experimental features which we don't need on the Rust side, so an implementation for another codegen backend could end up a bit smaller. The cranelift backend would also benefit from ABI compability, which makes it very easy to test correctness of a new autodiff tool against Enzyme. Our modifications to `rustc_codegen_ssa` and previous layers of rustc are written in a generic way, s.t. no changes would be needed there to enable support for additional backends. ### GPU / TPU / IPU / ... support. -Enzyme core supports differentiating CUDA/ROCm Kernels. +Enzyme supports differentiating CUDA/ROCm Kernels. There are various ways towards exposing this capabilities to Rust. Manuel and Jed will be experimenting with two different approaches in 2024, and there is also a lot of simultaneous research. Please reach out if diff --git a/src/limitations.md b/src/limitations.md index 7c154ac..d304605 100644 --- a/src/limitations.md +++ b/src/limitations.md @@ -2,7 +2,7 @@ ## Safety and Soundness -Enzyme currently does assume that the user passes shadow arguments (`dx`, `dy`, ...) of appropriate size. +Enzyme currently assumes that the user passes shadow arguments (`dx`, `dy`, ...) of appropriate size. Under Reverse Mode, we additionally assume that shadow arguments are mutable. In both modes we insert automatically checks to verify that `Dual`/`Duplicated` slices have shadow arguments of the right size. In Reverse Mode we also adjust the outermost pointer or reference to be mutable. Therefore `&f32` will receive the shadow type `&mut f32`. 
diff --git a/src/other_Frontends.md b/src/other_Frontends.md index 829eb67..b4845ae 100644 --- a/src/other_Frontends.md +++ b/src/other_Frontends.md @@ -5,9 +5,8 @@ Enzyme currently has experimental frontends for C/C++, Julia, Fortran, Numba, so General LLVM/MLIR, as well as C/C++/CUDA documentation is available at [https://enzyme.mit.edu](https://enzyme.mit.edu) Julia documentation is available at [https://enzyme.mit.edu/julia](https://enzyme.mit.edu/julia) Rust documentation is available at [https://enzyme.mit.edu/rust](https://enzyme.mit.edu/rust) -Enzyme-JAX interop is available at [https://github.com/EnzymeAD/Enzyme-JAX](https://github.com/EnzymeAD/Enzyme-JAX) -Numba documentation is tba. -Fortran documentation is tba. +Enzyme-JAX (including HLO MLIR AD) is available at [https://github.com/EnzymeAD/Enzyme-JAX](https://github.com/EnzymeAD/Enzyme-JAX). +Enzyme has been demonstrated on various other languages, including [Swift](https://passivelogic.com/blog/?post=using-enzyme-autodiff-with-swift&category=autodiff) and [Fortran](https://github.com/ludgerpaehler/LULESH-Fortran/blob/main/Makefile), but no frontend has yet been developed to improve ease of use and installation for these languages. We have a compiler-explorer fork with support for autodiff in C/C++/CUDA, Julia, and MLIR [here](https://enzyme.mit.edu/explorer).