A Category Theoretical view of the Kalman Filter

Some basic background

The Kalman filter is a (linear) state estimation algorithm that presumes some uncertainty in the state observations of a dynamical system (the filter is optimal when that uncertainty is Gaussian). It has been extended to nonlinear systems, to constrained settings, to cases where direct state observation is impossible, and so on. At each step the algorithm makes a prediction based on prior state knowledge, compares that prediction with the measurements, and updates the state estimate using a state covariance estimate that is itself updated at every step.
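For concreteness, here is the standard linear-Gaussian formulation that the prediction/update description refers to; the symbols $F_k, B_k, H_k, Q_k, R_k$ (plant, control, observation, and noise-covariance matrices) are introduced here purely for illustration:

$$
\begin{aligned}
\text{model:}\quad & x_k = F_k x_{k-1} + B_k u_k + w_k, \quad w_k \sim \mathcal{N}(0, Q_k), \qquad z_k = H_k x_k + v_k, \quad v_k \sim \mathcal{N}(0, R_k),\\
\text{predict:}\quad & \hat{x}_{k|k-1} = F_k \hat{x}_{k-1|k-1} + B_k u_k, \qquad P_{k|k-1} = F_k P_{k-1|k-1} F_k^\top + Q_k,\\
\text{update:}\quad & K_k = P_{k|k-1} H_k^\top \big(H_k P_{k|k-1} H_k^\top + R_k\big)^{-1},\\
& \hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \big(z_k - H_k \hat{x}_{k|k-1}\big), \qquad P_{k|k} = (I - K_k H_k)\, P_{k|k-1}.
\end{aligned}
$$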

In short, the state is usually represented by a vector in $\mathbb{R}^n$. This vector is multiplied by a plant matrix, added to a control term (a control input multiplied by a control matrix), associated with a covariance matrix, and so on. There are usually no intrinsic constraints on the state vector beyond those embedded in the description of the dynamical system.
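A minimal numerical sketch of one predict/update cycle, assuming the linear-Gaussian model written out above (the names F, B, H, Q, R mirror those equations and are placeholders, not references to any particular library):

```python
import numpy as np

def kalman_step(x, P, u, z, F, B, H, Q, R):
    """One predict/update cycle of a linear Kalman filter.

    x, P : prior state estimate and its covariance
    u, z : control input and measurement for this step
    F, B, H, Q, R : plant, control, observation, process-noise and
                    measurement-noise covariance matrices
    """
    # Predict: propagate the state and its covariance through the plant model.
    x_pred = F @ x + B @ u
    P_pred = F @ P @ F.T + Q

    # Update: blend the prediction with the measurement via the Kalman gain.
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```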

My question

Can we describe the Kalman filter using the language of category theory? In other words, can we generalize the algebra of what is happening with the Kalman filter for a dynamical system to develop Kalman-like estimators for other processes with vector-like components derived from some known (or partially known) model?

A motivating example

A repeated matrix game involves two or more players choosing strategies according to some payoff function -- a function of the game's state space and each player's action -- typically with some notion of equilibrium (e.g. Nash equilibrium). In a repeated game with a finite action space, strategies are discrete probability distributions, representable by vectors in $\mathbb{R}^N$ whose elements are non-negative and sum to unity.
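Written out, the constraint set is the probability simplex:

$$
\Delta^{N-1} = \Big\{\, p \in \mathbb{R}^N \;:\; p_i \ge 0 \ \text{for all } i,\ \textstyle\sum_{i=1}^{N} p_i = 1 \,\Big\}.
$$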

This is more restrictive than a general dynamical system because the vector representation of the strategy is constrained. But given inputs (the actions taken), outputs (measurements of the game's state), and a payoff structure, it might make sense to attempt a Kalman-like estimator for the strategy vector in the game.
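One crude way to experiment with this idea is to run an ordinary Kalman update and then force the estimate back onto the simplex. This is only a sketch under strong assumptions (the opponent's strategy drifts slowly, we observe empirical action frequencies each round, and the plant and observation maps are both the identity); the helper names project_to_simplex and strategy_estimate_step are illustrative, not part of any standard method:

```python
import numpy as np

def project_to_simplex(v):
    # Euclidean projection onto the probability simplex (Duchi et al., 2008).
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def strategy_estimate_step(p_est, P, freq_obs, Q, R):
    """Naive Kalman-like update of an opponent-strategy estimate.

    p_est    : current estimate of the opponent's mixed strategy
    P        : covariance of that estimate
    freq_obs : observed empirical action frequencies this round
    Q, R     : assumed process- and observation-noise covariances
    """
    # Identity plant: assume the strategy drifts slowly between rounds.
    P_pred = P + Q
    # Standard update with H = I, then project the result back onto the simplex.
    K = P_pred @ np.linalg.inv(P_pred + R)
    p_new = p_est + K @ (freq_obs - p_est)
    P_new = (np.eye(len(p_est)) - K) @ P_pred
    return project_to_simplex(p_new), P_new
```

Whether the projection step preserves any of the filter's optimality or convergence properties is exactly the kind of question a more structural (categorical) account might help answer.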

The challenge, of course, is developing a compatible notion of covariance. It's not clear what covariance would mean for probability distributions, since we don't treat the quantity $P(\textrm{action} = \textrm{index})$ as a random variable. But if we had a category theoretical view of the Kalman filter, could we derive a similar structure from its algebraic properties, in such a way that our estimate of the opponent's strategy vector is guaranteed to converge in some sense?

Does what I'm asking even make sense?


Solution 1:

The most direct approach, already mentioned by Chris Culter, is to use the fact that Kalman filters are a kind of graphical model and look at categorical treatments of graphical models. This is a rather active area, particularly among people interested in quantum computation, quantum information, and quantum foundations in general. Graphical models are usually treated as string/tensor/trace diagrams, a.k.a. Penrose notation or graphical languages, rather than directly as categories. The main reference on the abstract categorical treatment of these kinds of diagrams is Peter Selinger's "A survey of graphical languages for monoidal categories". There is also a very nice interdisciplinary survey by John Baez that covers this and a number of related areas. People with publications in this area include Samson Abramsky, Bob Coecke, Jared Culbertson, Kirk Sturtz, Robert Spekkens, John Baez, Jacob Biamonte, Mike Stay, Bart Jacobs, David Spivak, and Brendan Fong, whose thesis Chris Culter already cited in his answer.

Categorical treatments of graphical models require dealing with probability, which is usually done with monads, often the Giry monad.
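Restricted to finite supports, the Giry monad collapses to the familiar finite distribution monad, which is easy to sketch in code. The dict-based representation below is just an illustration of the monad structure (unit = point mass, bind = push a distribution through a stochastic map), not a reference to any particular library:

```python
def unit(x):
    # eta / return: the point-mass (Dirac) distribution at x.
    return {x: 1.0}

def bind(dist, kernel):
    # Kleisli extension: push a distribution through a stochastic map
    # kernel : X -> (distribution over Y), marginalizing out X.
    out = {}
    for x, px in dist.items():
        for y, py in kernel(x).items():
            out[y] = out.get(y, 0.0) + px * py
    return out

# Example: a fair coin pushed through a noisy channel that flips with prob. 0.1.
coin = {"H": 0.5, "T": 0.5}
flip_noise = lambda s: {s: 0.9, ("T" if s == "H" else "H"): 0.1}
print(bind(coin, flip_noise))   # {'H': 0.5, 'T': 0.5}
```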

You may also want to look at the work of the group at ETH Zurich that includes Pascal Vontobel & Hans-Andrea Loeliger. They work on a kind of graphical model they call Forney factor graphs. They do not use category theory themselves, but they have papers on how to translate many kinds of computations, including Kalman filters, electrical circuits and EM algorithms, to and from their graphical models. This also provides an indirect way to translate between Kalman filters and electrical circuits, so their work could be useful for transforming Kalman filters into other objects that already have categorical representations.

Here are links to some of the papers:

  • Kalman Filters, Factor Graphs, and Electrical Networks
  • Category theory in machine learning (the n-Category Café)
  • Picturing classical and quantum Bayesian inference
  • Bayesian machine learning via category theory
  • A categorical foundation for Bayesian probability
  • Lectures on Penrose graphical notation for tensor network states
  • Towards a Categorical Account of Conditional Probability
  • The operad of temporal wiring diagrams: formalizing a graphical language for discrete-time processes

Solution 2:

A Kalman filter performs inference on a model which is a simple Bayesian network, and a Bayesian network is a kind of graphical model, and graphical models kind of look like categories, so there may be a connection there. Unfortunately, I can't personally go beyond that vague description. Searching the web for "graphical model" + "category theory" turned up an interesting thesis: Brendan Fong (2012) "Causal Theories: A Categorical Perspective on Bayesian Networks". There's no attempted connection to game theory, but it might be worth a look!

Solution 3:

Here is something along these lines (mean-field game theory and Kalman-type filtering): http://arxiv.org/abs/1302.6563