Support exact Hessian of objective function J in NonLinMPC and MovingHorizonEstimator #193
@gdalle What's the approach to troubleshoot an overly-conservative sparsity pattern? I almost finished the implementation of exact objective function Hessians for NonLinMPC and MovingHorizonEstimator. Using this backend:

```julia
hess = AutoSparse(
    AutoForwardDiff(),
    sparsity_detector = TracerSparsityDetector(),
    coloring_algorithm = GreedyColoringAlgorithm()
)
```

and looking at the sparsity pattern with:

```julia
prep_∇²J = prepare_hessian(Jfunc!, hess, Z̃_J, context_J...; strict)
display(sparsity_pattern(prep_∇²J))
```

always results in a fully dense pattern, e.g.:

65×65 SparseArrays.SparseMatrixCSC{Bool, Int64} with 4225 stored entries:
⎡⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⎤
⎢⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⎥
⎢⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⎥
⎢⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⎥
⎢⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⎥
⎢⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⎥
⎣⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⎦

I looked at the entries and many of them are zero. I'm fairly certain that many of them are globally zero (as is generally the case for the Hessian of an NLP) but not correctly detected by TracerSparsityDetector. I also tried TracerLocalSparsityDetector(), which gives:

65×65 SparseArrays.SparseMatrixCSC{Bool, Int64} with 185 stored entries:
⎡⠑⣤⡄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
⎢⠀⠉⠿⢇⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠘⢻⣶⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠿⣧⡄⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠉⢱⣶⣀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠘⠿⣧⡄⠀⎥
⎣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⢿⣷⎦

It also works well for the optimization in my quick closed-loop tests. But I'm not entirely confident: maybe this tracer is too optimistic, and the optimization will fail in some corner cases. |
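To see why the local tracer can be "too optimistic", here is a toy illustration (not from MPC.jl): the local detector certifies the pattern only at the input it was given, so value-dependent branches can drop dependencies that the global detector keeps.

```julia
using SparseConnectivityTracer

f(x) = max(x[1], x[2])^2

# Global pattern: valid for every input, so all four entries are kept.
hessian_sparsity(f, rand(2), TracerSparsityDetector())          # 2×2, fully dense

# Local pattern at x = [2, 1]: the max branch picks x[1], so only (1,1) remains.
hessian_sparsity(f, [2.0, 1.0], TracerLocalSparsityDetector())
```

If the active branch changes between preparation and optimization, a pattern prepared with the local detector may miss entries.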
Here's another hint that the global detector is "wrong". If I use this sparse symbolic backend:

```julia
adbackend = AutoSparse(AutoSymbolics())
```

the returned sparsity pattern is the same as the one returned by TracerLocalSparsityDetector():

65×65 SparseArrays.SparseMatrixCSC{Bool, Int64} with 185 stored entries:
⎡⠑⣤⡄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
⎢⠀⠉⠿⢇⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠘⢻⣶⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠿⣧⡄⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠉⢱⣶⣀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠘⠿⣧⡄⠀⎥
⎣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⢿⣷⎦

I assume that the built-in sparsity pattern detector of Symbolics.jl is reliable here, which supports the local tracer's pattern. |
Could you maybe provide an MWE? Essentially the question is whether this is unavoidable or not. Right now I can't tell you ^^ |
I'm not able to produce a MWE without MPC.jl, unfortunately. 🫤 You can reproduce it with:

```julia
using ModelPredictiveControl, ControlSystemsBase
using DifferentiationInterface, SparseConnectivityTracer, SparseMatrixColorings
import ForwardDiff

f(x,u,_,_) = 0.1*x + u
h(x,_,_) = x
model = NonLinModel(f, h, 50.0, 1, 1, 1, solver=nothing)

adbackend = AutoSparse(
    AutoForwardDiff(),
    sparsity_detector = TracerSparsityDetector(), # or TracerLocalSparsityDetector()
    coloring_algorithm = GreedyColoringAlgorithm()
)
mpc = NonLinMPC(model; Hp=10, hessian=adbackend, transcription=MultipleShooting())
res = sim!(mpc, 1, x̂_0=zeros(2), x_0=zeros(1))
```

giving for TracerSparsityDetector():

23×23 SparseArrays.SparseMatrixCSC{Bool, Int64} with 529 stored entries:
⎡⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⡇⎤
⎢⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⡇⎥
⎢⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⡇⎥
⎢⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⡇⎥
⎢⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⡇⎥
⎣⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠇⎦

and for TracerLocalSparsityDetector():

23×23 SparseArrays.SparseMatrixCSC{Bool, Int64} with 43 stored entries:
⎡⠑⣤⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
⎢⠀⠀⠛⣤⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠛⣤⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠛⣤⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠛⣤⠀⠀⎥
⎣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠛⠄⎦ |
My best guess is that you're hitting some of the overly conservative linear algebra overloads, but I'm not sure which one. Will investigate |
OK, I figured it out, and there are two layers of shady stuff going on. The underlying issue is that SCT ignores values when handling global tracers, even the values of fixed, ordinary numbers. In a nutshell, when a weight matrix is stored as a dense array, the global tracer must assume that every coefficient may be nonzero, even coefficients that are fixed zeros.

But this wouldn't be an issue at all if MPC handled structural sparsity better. I managed to make the SCT routine work in #198 by replacing some dense matrices with sparse ones, which turns "accidental" zeros into structural zeros. I am not sure this is the right implementation for your use case, but I definitely think you should pay more attention to this idea in general. |
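A minimal sketch of that distinction (toy matrices, not MPC.jl's internals): with a dense weight matrix the global tracer reports a dense Hessian even though most coefficients are zero, while a SparseMatrixCSC makes those zeros structural, so they never enter the trace.

```julia
using SparseConnectivityTracer, SparseArrays

W_dense  = [1.0 0.0; 0.0 1.0]   # off-diagonal zeros are "accidental": just values
W_sparse = sparse(W_dense)       # off-diagonal zeros are structural: entries don't exist

f_dense(x)  = sum(abs2, W_dense * x)
f_sparse(x) = sum(abs2, W_sparse * x)

detector = TracerSparsityDetector()
hessian_sparsity(f_dense,  rand(2), detector)   # fully dense 2×2 pattern
hessian_sparsity(f_sparse, rand(2), detector)   # diagonal pattern
```

The sparse matvec only touches stored entries, so the global tracer never sees the zero coefficients at all.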
Thanks a lot for your time and the PR, there are certainly great ideas in it. I avoided hard-coded sparse matrices in MPC.jl to avoid losing performance on small matrices. It is well known that dense matrices are generally faster than sparse ones when the matrices are small. This is even more important for real-time applications like MPC. Indeed, a plant model that is well-suited for real-time applications is a simple, low-order one. If an engineer has the choice between a crude, low-order plant model (thus with small matrices) and a high-fidelity, high-order plant model (thus with large matrices), the crude model is typically more suitable for MPC. Remember that MPC incorporates feedback; it is thus more forgiving of an approximate plant model than open-loop optimal control (trajectory optimization). That being said:
|
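To make the small-matrix claim concrete, here is a rough micro-benchmark sketch (sizes and matrices are illustrative; the exact crossover point depends on hardware and fill-in):

```julia
using BenchmarkTools, LinearAlgebra, SparseArrays

n = 8                         # typical size for a low-order plant model
A_dense  = Matrix(1.0I, n, n)
A_sparse = sparse(A_dense)
x = rand(n)

@btime $A_dense * $x          # dense matvec: contiguous memory, SIMD-friendly
@btime $A_sparse * $x         # sparse matvec: index indirection dominates at small n
```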
I guess it's hard to create a framework that works equally well with small and large instances. But there are various ways to be small or large. Typically, regarding block-diagonal matrices, are we talking about the size of each block, or the number of blocks?

Perhaps an interesting question to ask is whether these large matrices are even needed in the first place. I don't know what requires the block-diagonal structure (I guess time indices?), but maybe there is a way to apply these matrices as lazy operators, without even materializing them. In other words, maybe the way we write down the math is not the way we should code? Take everything I say with a grain of salt of course, I know nothing about optimal control. But I'd be happy to collaborate on research if new perspectives open from the software side. |
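A sketch of the lazy-operator idea, assuming the block-diagonal structure comes from repeating a per-time-step weight Q over the prediction horizon (names and sizes are hypothetical, not MPC.jl code):

```julia
using LinearAlgebra

# Apply W = kron(I(Hp), Q) to a vector block by block, without materializing W:
function apply_blockdiag!(y, Q, x, Hp)
    n = size(Q, 1)
    for k in 0:Hp-1
        idx = k*n .+ (1:n)
        mul!(view(y, idx), Q, view(x, idx))
    end
    return y
end

Q, Hp = [2.0 0.0; 0.0 1.0], 10
x = rand(size(Q, 1) * Hp)
y = similar(x)
apply_blockdiag!(y, Q, x, Hp) ≈ kron(I(Hp), Q) * x   # true
```

This keeps memory and work proportional to the number of blocks times the block size, instead of the square of the full dimension.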
@franckgaga there's an SCT fix on the branch adrhill/SparseConnectivityTracer.jl#243 if you want to take it out for a spin |