Can a dynamic programming problem be transformed into a linear algebra problem?

Here is a simple standard economic problem:

Let Robin Crusoe have an endowment $w_0$ of lembas bread. She is immortal and discounts at a rate of $\beta$ per period. Each period (from $t=0$ on) she choose to consume some amount $c_t\in [0,w_t]$, and since lembas bread does not decay, $w_{t+1} = w_t - c_t$. Optimize Robin's utility $$\sum_{t=0}^\infty \beta^t u(c_t)$$ subject to the constraint $w_{t+1} + c_t = w_t$.

Here $u$ is a concave function, e.g., $u(c) = c^{\theta}$ or $u(c) = \log(c)$.

To solve this using dynamic programming, one uses Bellman's principle that each period, a choice will be made given optimal behavior in all future periods. Thus one obtains a series of value functions $v_t$ solving $$v_t(w_t) = \max_{w'\in [0,w_t]} u(w'-w_t) + \beta v_t(w').$$ This is to say, on the constraint set $\Omega_t = [0,w_t]$, $v_t$ is a stationary point of the maximization operator $$T_uv(b) = \max_{b'\in \Omega_t}u(b'-b) + \beta v(b'),$$ acting on (say) $L^\infty(\Omega_t)$.

This is not a linear operator, so tools from linear algebra cannot be brought to bear. I'm wondering if the problem can be "linearized" in the following sense:

Is there a linear operator $L_u$ acting on $L^\infty(\Omega_t)$ which "contains the same information as $T_u$" in the sense that a stable point of $T_u$ will either be an eigenvector of $L_u$ or will lie in the kernel of $L_u$?

I've played around with the idea that an integral operator $L$, say, $Lv(b) = \int U(b,b')v(b')db'$, might be able to capture this maximization problem, but I'm afraid I just don't yet have the intuition to see how this should play out. So I put the question to you.


You may be interested in the policy iteration algorithm.

To clear something up the value function is not time-dependent and solves $$ v(w)=\max_{w'\in [0,w]}\{ u(w-w')+\beta v(w')\}. $$ This should be solved on the interval $[0,w_0]$ for $w_0$ the initial allotment of bread.

For policy iteration, fix a policy $c:[0,w_0]\rightarrow \mathbb{R}$ with $0\leq c(w)\leq w$ and then $v^c$ is solved by the linear equations $$ v^c(w)=u(c(w))+\beta v^c(w-c(w)). $$ This step will use linear algebra, but the next step still requires non-linear maximization.

Given $v^c$ choose the next policy by $$ c'(w)={\rm argmax}_{d\in [0,w]}\{ u(d)+\beta v(w-d)\}. $$

If you start with the policy $c_0(w)=w$, then $v^0(w)=u(w)$ and the first step finds $$ c_1(w)={\rm argmax}_{d\in [0,w]}\{ u(d)+\beta u(w-d)\}. $$

This procedure often converges very rapidly but may not be exactly what you are looking for. Cheers!


Hi Neal: This may not be what you are looking for, as I don't know if I can offer the more technical answer regarding linearizations and kernels, but it occurred to me that the following condition might be a useful solution approach:

Select $c_t : u(c_t) = \beta u(w_t-c_t)\;,\; \sum\limits_{t=0}^{\infty} c_t = w_0$ That way, you eat in time t until the opportunity cost for eating the rest becomes negative relative to eating the rest it the next day.

Again, not a complete answer, but maybe that will help you make progress?