a dynamical systems view of the central limit theorem?

I have seen many heuristic discussions of the classical central limit theorem speak of the normal distribution (or any of the "stable distributions") as an "attractor" in the space of probability densities. For example, consider these sentences at the top of Wikipedia's treatment:

In more general usage, a central limit theorem is any of a set of weak-convergence theorems in probability theory. They all express the fact that a sum of many independent and identically distributed (i.i.d.) random variables, or alternatively, random variables with specific types of dependence, will tend to be distributed according to one of a small set of attractor distributions. When the variance of the i.i.d. variables is finite, the attractor distribution is the normal distribution.

This dynamical systems language is very suggestive. Feller also speaks of "attraction" in his treatment of the CLT in his second volume (I wonder if that is the source of the language), and Yuval Filmus in this note even speaks of the "basin of attraction." (I don't think he really means "the exact form of the basin of attraction is deducible beforehand" but rather "the exact form of the attractor is deducible beforehand"; still, the language is there.) My question is: can these dynamical analogies be made precise? I don't know of a book in which they are.

Roughly, I imagine taking the phase space to be a suitable infinite-dimensional function space (the space of probability densities, say with finite variance) and taking the evolution operator to be repeated convolution with an initial condition. But I have no sense of the technicalities involved in making this picture work or whether it is worth pursuing. I would guess that since I can't find a treatment that does pursue this approach explicitly, there must be something wrong with my sense that it can be done or that it would be interesting. If that is the case, I would like to hear why.
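To make the picture concrete, here is a small numerical sketch of what I have in mind (the Exp(1) initial density, the grid spacing, and the number of steps are arbitrary choices of mine, and I center and rescale each iterate, which is presumably part of the technicalities): iterating convolution with the initial density and standardizing, the orbit drifts toward the standard normal density.

```python
import numpy as np

# Sketch of the "space of densities" picture: the orbit of an initial density
# under repeated convolution, standardized at each step, approaches N(0, 1).
# The Exp(1) density, grid spacing, and number of steps are illustrative choices.

dx = 0.01
x = np.arange(0, 30, dx)       # grid carrying essentially all of the Exp(1) mass
p = np.exp(-x)                 # Exp(1) density: mean 1, variance 1
p /= p.sum() * dx              # normalize on the grid

def convolve_densities(f, g, dx):
    """Density of X + Y for independent X ~ f, Y ~ g sampled on grids with spacing dx."""
    return np.convolve(f, g) * dx

density = p.copy()
for n in range(2, 9):
    density = convolve_densities(density, p, dx)   # law of X_1 + ... + X_n
    grid = np.arange(len(density)) * dx            # its support starts at 0
    z = (grid - n) / np.sqrt(n)                    # standardize: the sum has mean n, variance n
    standardized = density * np.sqrt(n)            # density after the change of variables
    gaussian = np.exp(-z ** 2 / 2) / np.sqrt(2 * np.pi)
    l1 = np.sum(np.abs(standardized - gaussian)) * dx / np.sqrt(n)
    print(f"n = {n}: L1 distance to N(0,1) ~ {l1:.4f}")
```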


Solution 1:

The dynamical-systems analogy can be seen through the Ornstein–Uhlenbeck semigroup. Consider the central limit theorem in $\mathbb{R}^d$. Let $f$ be a $1$-Lipschitz function. For any $x \in \mathbb{R}^d$ and $t \geq 0$, define an operator $\mathcal T$ by $$ [\mathcal T(t)]\, f(x) = f\left( e^{-t} x + \sqrt{1- e^{-2t}}\, Z\right), $$ where $Z \sim N(0, I_d)$. This operator is slightly different from the Ornstein–Uhlenbeck operator, which also takes the expectation over $Z$. One can check that $$ \mathcal T(t+s) \stackrel{d.}{=} \mathcal T(t) \circ \mathcal T(s) $$ for all $t, s \geq 0$, where the equality means that the two sides have the same distribution: composing the two maps with independent Gaussians $Z_1, Z_2$ produces the argument $e^{-(t+s)} x + e^{-s}\sqrt{1-e^{-2t}}\, Z_2 + \sqrt{1-e^{-2s}}\, Z_1$, whose Gaussian part has covariance $\left(1 - e^{-2(t+s)}\right) I_d$. In addition, $\mathcal T(0)$ is the identity map. Thus $\mathcal T$ can be viewed as the evolution operator.
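For concreteness, a quick Monte Carlo sanity check of this semigroup identity could look like the following (the test function $f$, the point $x$, and the times $t, s$ below are arbitrary choices, not part of the argument):

```python
import numpy as np

rng = np.random.default_rng(0)
d, N = 3, 200_000
x = np.array([1.0, -2.0, 0.5])   # an arbitrary starting point
t, s = 0.7, 1.3                  # arbitrary times

def f(v):
    """Euclidean norm: an arbitrary 1-Lipschitz test function."""
    return np.linalg.norm(v, axis=-1)

# Left-hand side: samples of [T(t+s)] f(x).
Z = rng.standard_normal((N, d))
lhs = f(np.exp(-(t + s)) * x + np.sqrt(1 - np.exp(-2 * (t + s))) * Z)

# Right-hand side: [T(t)]([T(s)] f)(x), with independent Gaussians at each stage.
Z2 = rng.standard_normal((N, d))
y = np.exp(-t) * x + np.sqrt(1 - np.exp(-2 * t)) * Z2   # point fed into [T(s)] f
Z1 = rng.standard_normal((N, d))
rhs = f(np.exp(-s) * y + np.sqrt(1 - np.exp(-2 * s)) * Z1)

# The two samples should agree in distribution; compare a few quantiles.
print(np.quantile(lhs, [0.1, 0.5, 0.9]))
print(np.quantile(rhs, [0.1, 0.5, 0.9]))
```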

Moreover, for $Z \sim N(0, I_d)$, $$ [\mathcal T(t)]\, f(Z) \stackrel{d.}{=} f(Z), $$ that is, $N(0, I_d)$ is invariant under $\mathcal T$. Furthermore, as $t \rightarrow \infty$ we have $e^{-t} x \rightarrow 0$ and $\sqrt{1-e^{-2t}} \rightarrow 1$, so $[\mathcal T(t)]\, f(x) \xrightarrow{d.} f(Z)$ for every $x \in \mathbb{R}^d$. In this sense the Gaussian distribution is an attractor of this evolution.
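Both facts are easy to check by simulation. A sketch (reusing the arbitrary norm test function from before, with an arbitrary large $t$ for the attraction claim):

```python
import numpy as np

rng = np.random.default_rng(1)
d, N = 3, 200_000
qs = [0.1, 0.5, 0.9]

def f(v):
    """Euclidean norm: an arbitrary 1-Lipschitz test function."""
    return np.linalg.norm(v, axis=-1)

def T(t, g, x):
    """Samples of [T(t)] g(x); x may be a fixed point or an (N, d) array of samples."""
    Z = rng.standard_normal((N, d))
    return g(np.exp(-t) * x + np.sqrt(1 - np.exp(-2 * t)) * Z)

# Invariance: starting from Z ~ N(0, I_d), [T(t)] f(Z) has the same law as f(Z).
Z0 = rng.standard_normal((N, d))
print(np.quantile(T(0.5, f, Z0), qs))
print(np.quantile(f(rng.standard_normal((N, d))), qs))

# Attraction: for large t the starting point is forgotten, and [T(t)] f(x)
# is close in law to f(Z) for every fixed x.
for x0 in (np.zeros(d), np.full(d, 10.0)):
    print(np.quantile(T(8.0, f, x0), qs))
```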