Optimality — Hamilton-Jacobi-Bellman (HJB) versus Riccati
Solution 1:
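My simple answer would be that, for the LQR problem, they are essentially the same thing: the Riccati equation is what the HJB equation reduces to when the value function is quadratic.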
Explanation
Riccati equations can be derived from the Hamilton-Jacobi-Bellman equation in the particular case of the LQR problem, an optimal control problem where the dynamics are linear and the cost is quadratic.
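For reference, here is the problem written out in the convention that matches the HJB equation used in this answer (no factor $\tfrac12$ in the cost, no cross term; this normalization is my reading of the formulas, other texts scale the cost differently): $$ \dot x(t) = A x(t) + B u(t), \qquad J(u) = x(T)^T Q_f\, x(T) + \int_0^T \left( x(t)^T Q\, x(t) + u(t)^T R\, u(t) \right) dt , $$ and $V(x,t)$ denotes the minimal cost-to-go when starting from the state $x$ at time $t$.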
Consider the finite-horizon LQR problem. In this particular case, the Hamilton-Jacobi-Bellman equation reads (for simplicity I set $N=0$, i.e. no cross term between $x$ and $u$ in the cost)
$$ \partial_t V(x,t) + \min_u \left\{ \partial_x V(x,t) \cdot (Ax+Bu) + x^T Q x + u^T Ru \right\} = 0 $$ with the terminal condition $$ V(x,T) = x^T Q_f x . $$
Now we look for solutions of the form $V(x,t) = x^T P(t) x$, where $P(t)$ is a symmetric matrix for each $t \in [0,T]$. Then $\partial_t V(x,t) = x^T P'(t) x$ and $\partial_x V(x,t) = 2P(t)x$, so substituting into the (HJB) equation we get
$$ x^T P'(t) x + \min_u \left\{ 2 P(t)x \cdot (Ax+Bu) + x^T Q x + u^T Ru \right\} = 0 . $$
We can explicitly find the minimum of the expression inside the curly brackets. For a given pair $(t,x)$, let us define $\Phi$ as $$ \Phi(u) = 2 P(t)x \cdot (Ax+Bu) + x^T Q x + u^T Ru .$$
Assuming $R$ is symmetric positive definite, $\Phi$ is strictly convex, so the minimum is attained where $\nabla \Phi(u) = 0$, that is when $$ 2B^T P(t) x + 2 Ru = 0,$$ so the optimal control is $$ u^*(t,x) = -R^{-1} B^T P(t)x ,$$ with $$ \Phi(u^*(t,x)) = 2 x^T P(t)^T \left(Ax-BR^{-1} B^T P(t)x \right) + x^T Q x + x^T P(t)^T B R^{-1} B^T P(t)x . $$
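As a cross-check that this stationary point really is the minimizer, one can complete the square in $u$ (a short sketch, assuming $R$ is symmetric positive definite and $P(t)$ is symmetric): $$ \Phi(u) = \Phi(u^*) + (u-u^*)^T R\, (u-u^*), \qquad u^* = -R^{-1} B^T P(t) x . $$ Indeed, expanding the square gives $$ (u-u^*)^T R\, (u-u^*) = u^T R u + 2 x^T P(t) B u + x^T P(t) B R^{-1} B^T P(t) x , $$ and adding $\Phi(u^*) = 2x^T P(t) A x + x^T Q x - x^T P(t) B R^{-1} B^T P(t) x$ recovers $\Phi(u)$. Since the quadratic term is nonnegative and vanishes only at $u = u^*$, the minimum is unique.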
So we can rewrite the (HJB) equation without the minimization term: $$ x^T P'(t) x + 2 x^T P(t)^T \left(Ax-BR^{-1} B^T P(t)x \right) + x^T Q x + x^T P(t)^T B R^{-1} B^T P(t)x = 0 , $$ and by collecting everything between $x^T$ and $x$ and combining the two terms in $BR^{-1}B^T$ ($P(t)$ is symmetric, so $P(t)^T = P(t)$) we get $$ x^T \left( P'(t) + 2 P(t) A - P(t) BR^{-1} B^T P(t) + Q \right) x = 0 . $$
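The above equation must hold for each $x$. A quadratic form $x^T M x$ vanishes for every $x$ exactly when the symmetric part $\tfrac{1}{2}(M + M^T)$ of $M$ is zero, so we cannot simply drop $x^T$ and $x$: the term $2P(t)A$ is not symmetric in general. Using $2 x^T P(t) A x = x^T \left( A^T P(t) + P(t) A \right) x$ (with $Q$ and $R$ symmetric, all the other terms are already symmetric), the equation is equivalent to the matrix differential equation
$$ P'(t) + A^T P(t) + P(t) A - P(t) BR^{-1} B^T P(t) + Q = 0 .$$
Finally, in order to satisfy the terminal condition, we need $$ P(T) = Q_f .$$
Comments:
- This is not a rigorous proof that the two equations are equivalent, but it shows that, for LQR, the Riccati equation is the HJB equation restricted to quadratic value functions.
- The term $2P(t)A$ appearing in the quadratic form and the term $A^TP(t) + P(t)A$ of the Riccati equation define the same quadratic form. The symmetric version is the right one for the matrix equation, because $P(t)$ was assumed symmetric and so $P'(t)$ must be symmetric too.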
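The derivation can also be checked numerically. The following is a minimal pure-Python sketch, not a production solver: the 2x2 data (`A`, `S`, `Q`, `Qf`, `T`) are illustrative values I made up (a double integrator with $B = (0,1)^T$, $R = 1$, so `S` stands for $BR^{-1}B^T$), and the helper names are hypothetical. It integrates the Riccati equation with the symmetric term $A^TP + PA$ backward from $P(T) = Q_f$ with a hand-written RK4 step, then evaluates the HJB residual $x^T\left(P' + 2PA - PBR^{-1}B^TP + Q\right)x$ at $t = 0$.

```python
# Illustrative 2x2 data (assumptions, not taken from the derivation).
A = [[0.0, 1.0], [0.0, 0.0]]
S = [[0.0, 0.0], [0.0, 1.0]]   # S = B R^{-1} B^T for B = (0, 1)^T, R = 1
Q = [[1.0, 0.0], [0.0, 1.0]]
Qf = [[1.0, 0.0], [0.0, 1.0]]
T = 1.0

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def lincomb(*terms):
    """Entrywise sum of c * M over the given (c, M) pairs."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(2)]
            for i in range(2)]

def quad(M, x):
    """Quadratic form x^T M x."""
    return sum(x[i] * M[i][j] * x[j] for i in range(2) for j in range(2))

def riccati_rhs(P):
    """P' = -(A^T P + P A - P S P + Q), the symmetric Riccati equation."""
    ATP = matmul(transpose(A), P)
    PA = matmul(P, A)
    PSP = matmul(matmul(P, S), P)
    return lincomb((-1.0, ATP), (-1.0, PA), (1.0, PSP), (-1.0, Q))

def unsymmetrized_rhs(P):
    """P' = -(2 P A - P S P + Q), i.e. dropping x^T and x without symmetrizing."""
    PA = matmul(P, A)
    PSP = matmul(matmul(P, S), P)
    return lincomb((-2.0, PA), (1.0, PSP), (-1.0, Q))

def rk4_step(P, h):
    k1 = riccati_rhs(P)
    k2 = riccati_rhs(lincomb((1.0, P), (h / 2, k1)))
    k3 = riccati_rhs(lincomb((1.0, P), (h / 2, k2)))
    k4 = riccati_rhs(lincomb((1.0, P), (h, k3)))
    return lincomb((1.0, P), (h / 6, k1), (h / 3, k2), (h / 3, k3), (h / 6, k4))

def solve_backward(steps=1000):
    """Integrate from t = T down to t = 0 (negative step); returns P(0)."""
    P, h = Qf, -T / steps
    for _ in range(steps):
        P = rk4_step(P, h)
    return P

def hjb_residual(P, x):
    """x^T (P' + 2 P A - P S P + Q) x with P' taken from the Riccati equation."""
    PSP = matmul(matmul(P, S), P)
    M = lincomb((1.0, riccati_rhs(P)), (2.0, matmul(P, A)), (-1.0, PSP), (1.0, Q))
    return quad(M, x)

P0 = solve_backward()
residual = hjb_residual(P0, [1.0, -2.0])
asym = unsymmetrized_rhs(Qf)

print("P(0) =", P0)
print("HJB residual at t = 0:", residual)
print("unsymmetrized P'(T) =", asym)
```

In this sketch the residual is zero up to rounding because $P(t)A - A^TP(t)$ is antisymmetric whenever $P(t)$ is symmetric, which is exactly the symmetrization step in the derivation. The last line shows why the version with $2P(t)A$ cannot be used as a matrix equation: already at $t = T$ it gives a non-symmetric $P'$, contradicting the assumption that $P(t)$ is symmetric.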