Skip to content

Commit

Permalink
make schur factorization differentiable
Browse files Browse the repository at this point in the history
  • Loading branch information
baggepinnen committed May 30, 2023
1 parent 42e3955 commit 39a90d4
Show file tree
Hide file tree
Showing 4 changed files with 55 additions and 53 deletions.
2 changes: 1 addition & 1 deletion docs/src/examples/automatic_differentiation.md
Original file line number Diff line number Diff line change
Expand Up @@ -312,7 +312,7 @@ The following issues are currently known to exist when using AD through ControlS
- The function `svdvals` does not have a forward rule defined. This means that the functions [`sigma`](@ref) and `opnorm` will not work for MIMO systems with ForwardDiff. SISO, MISO and SIMO systems will, however, work.
- [`hinfnorm`](@ref) requires ImplicitDifferentiation.jl and ComponentArrays.jl to be manually loaded by the user, after which there are implicit differentiation rules defined for [`hinfnorm`](@ref). The implicit rule calls `opnorm`, and is thus affected by the first limitation above for MIMO systems. [`hinfnorm`](@ref) has a reverse rule defined in RobustAndOptimalControl.jl, which is not affected by this limitation.
- [`are`](@ref), [`lqr`](@ref) and [`kalman`](@ref) all require ImplicitDifferentiation.jl and ComponentArrays.jl to be manually loaded by the user, after which there are implicit differentiation rules defined. To invoke the correct method of these functions, it is important that the second matrix (corresponding to input or measurement) has the `Dual` number type, i.e., the `R` matrix in `lqr(P, Q, R)` or `lqr(Continuous, A, B, Q, R)`
- The `schur` factorization is not amenable to differentiation using ForwardDiff. This is the fundamental reason for requireing ImplicitDifferentiation.jl to differentiate through the Ricatti equation solver. `schur` is called in several additional places, including [`balreal`](@ref) and all [`lyap`](@ref) solvers. To make `schur` differentiable, an implicit differentiation rule would be required.
- The `schur` factorization has an implicit differentiation rule defined, but the companion function `ordschur` does not. This is the fundamental reason for requireing ImplicitDifferentiation.jl to differentiate through the Ricatti equation solver. `schur` is called in several additional places, including [`balreal`](@ref) and all [`lyap`](@ref) solvers. Many of these algorithms also call `givensAlgorithm` which has no rule either.
- An implicit rule is defined for continuous-time [`lyap`](@ref) and [`plyap`](@ref) solvers, but not yet for discrete-time solvers. This means that [`gram`](@ref) [`covar`](@ref) and [`norm`](@ref) (``H_2``-norm) is differentiable for continuous-time systems but not for discrete.

### Reverse-mode AD
Expand Down
2 changes: 1 addition & 1 deletion docs/src/man/differences.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ The rest of this page will list noteworthy differences between ControlSystems.jl
- In Julia, functionality is often split up into several different packages. You may therefore have to install and use additional packages in order to cover all your needs. See [Ecosystem](@ref) for a collection of control-related packages.
- In Julia, `1` has a different type than `1.0`, and the types in ControlSystemsBase.jl respect the types chosen by the user. As an example, `tf(1, [1, 1])` is a transfer function with integer coefficients, while `tf(1.0, [1, 1])` will promote all coefficients to `Float64`.
- Julia treats matrices and vectors as different types, in particular, column vectors and row vectors are not interchangeable.
- In Julia, code can often be differentiated using automatic differentiation. When using ControlSystems.jl, we recommend trying [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl/) for AD. An example making use of this is available [here](https://github.com/JuliaControl/ControlExamples.jl/blob/master/autotuning.ipynb).
- In Julia, code can often be differentiated using automatic differentiation. When using ControlSystems.jl, we recommend trying [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl/) for AD. An example making use of this is available in [Automatic Differentiation](@ref).
- In Julia, the source code is often very readable. If you want to learn how a function is implemented, you may find the macros `@edit` or `@less` useful to locate the source code.
- If you run into an issue (bug) with a Julia package, you can share this issue (bug report) on the package's github page and it will often be fixed promptly. To open an issue with ControlSystems.jl, [click here](https://github.com/JuliaControl/ControlSystems.jl/issues/new/choose). Thank you for helping out improving open-source software!
- Julia compiles code just before it is called the first time. This introduces a noticeable lag, and can make packages take a long time to load. If you want to speed up the loading of ControlSystems.jl, consider building a system image that includes ControlSystems.jl using [PackageCompiler.jl](https://julialang.github.io/PackageCompiler.jl/stable/). More info about this is available below under [Precompilation for faster load times](@ref)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -227,35 +227,40 @@ function hinfnorm(sys::StateSpace{Continuous, <:Dual}; kwargs...)
end


# ## Schur, currently not working when the matrix A has complex eigenvalues.
# Sec 4.2 in "A PROCEDURE FOR DIFFERENTIATING PERFECT-FORESIGHT-MODEL REDUCED-FORM OEFFICIENTS", Gary ANDERSON has a formula for the derivative, but it looks rather expensive to compute, involving the factorization of a rather large Kronecker matrix. THis factorization only has to be done once, though, since it does not depend on the partials.
# function forward_schur(A)
# F = schur(A)
# ComponentVector(; F.Z, F.T), F
# end

# function conditions_schur(A, F, noneed)
# (; Z, T) = F
# [
# vec(Z' * A * Z - T);
# vec(Z' * Z - I + LowerTriangular(T) - Diagonal(T))
# ]
# end
# ## Schur
function forward_schur(A)
F = schur(A)
ComponentVector(; F.Z, F.T), F
end

# linear_solver = (A, b) -> (Matrix(A) \ b, (solved=true,))
# const implicit_schur = ImplicitFunction(forward_schur, conditions_schur, linear_solver)

# # vectors = Z
# # Schur = T
# # A = F.vectors * F.Schur * F.vectors'
# # A = Z * T * Z'
# function LinearAlgebra.schur(A::AbstractMatrix{<:Dual})
# ZT, F = implicit_schur(A)
# n = length(A)
# Z = reshape(ZT[1:n], size(A))
# T = reshape(ZT[n+1:end], size(A))
# Schur(T, Z, F.values)
# end
function conditions_schur(A, F, s)
(; Z, T) = F
if all(isreal, s.values)
[
vec(Z' * A * Z - T);
vec(Z' * Z - I + LowerTriangular(T) - Diagonal(T))
]
else
[
vec(Z' * A * Z - T);
vec(Z' * Z - I + UpperTriangular(T) - Diagonal(T))
]
end
end

const implicit_schur = ImplicitFunction(forward_schur, conditions_schur)

# vectors = Z
# Schur = T
# A = F.vectors * F.Schur * F.vectors'
# A = Z * T * Z'
function LinearAlgebra.schur(A::AbstractMatrix{<:Dual})
ZT, F = implicit_schur(A)
n = length(A)
Z = reshape(ZT[1:n], size(A))
T = reshape(ZT[n+1:end], size(A))
Schur(T, Z, F.values)
end


end # module
Expand Down
43 changes: 20 additions & 23 deletions lib/ControlSystemsBase/test/test_implicit_diff.jl
Original file line number Diff line number Diff line change
Expand Up @@ -183,34 +183,31 @@ J2 = fdgrad(difffun, pars)



## Schur currently not working when the matrix A has complex eigenvalues.
## Schur decomposition

# sys = ssrand(1,1,3, proper=true)
sys = ssrand(1,1,3, proper=true)
# sys.A .-= 1I(3)
# (; A, B, C, D) = sys

# function schur_sys(sys)
# SF = schur(sys.A)
# A = SF.T
# B = SF.Z'*sys.B
# C = sys.C*SF.Z
# ss(A,B,C,sys.D, sys.timeevol)
# end

# function difffun(pars)
# (; A,B,C,D) = pars
# sys = ss(A, B, C, 0)
# sys = schur_sys(sys)
# sum(abs2, freqresp(sys, 0.1))
# end
(; A, B, C, D) = sys

# pars = ComponentVector(; A,B,C,D)
function schur_sys(sys)
SF = schur(sys.A)
A = SF.T
B = SF.Z'*sys.B
C = sys.C*SF.Z
ss(A,B,C,sys.D, sys.timeevol)
end

# J1 = ForwardDiff.gradient(difffun, pars)[:]
# J2 = fdgrad(difffun, pars)[:]
function difffun(pars)
(; A,B,C,D) = pars
sys = ss(A, B, C, 0)
sys = schur_sys(sys)
sum(abs2, freqresp(sys, 0.1))
end

pars = ComponentVector(; A,B,C,D)

J1 = ForwardDiff.gradient(difffun, pars)[:]
J2 = fdgrad(difffun, pars)[:]
# @show norm(J1-J2)

# @test J1 ≈ J2 rtol = 1e-3
@test J1 J2 rtol = 1e-5

0 comments on commit 39a90d4

Please sign in to comment.