Is ergodic markov chain both irreducible and aperiodic or just irreducible?
Some definitions I have found say: ergodic = irreducible, and then irreducible + aperiodic + positive recurrent gives a regular Markov chain.
A Markov chain is called an ergodic chain if it is possible to go from every state to every state (not necessarily in one move).
Ergodic Markov chains are also called irreducible.
A Markov chain is called a regular chain if some power of the transition matrix has only positive elements.
Reference http://www.math.dartmouth.edu/archive/m20x06/public_html/Lecture15.pdf
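To illustrate the definition of a regular chain, here is a minimal sketch (the matrix is my own example, not taken from the lecture notes): a transition matrix can have zero entries and still be regular, because some power of it has only positive elements.

```python
import numpy as np

# My own example: P itself has a zero entry, but P^2 is entrywise positive,
# so the chain is regular in the sense of the quoted definition.
P = np.array([[0.0, 1.0],
              [0.5, 0.5]])
P2 = np.linalg.matrix_power(P, 2)

print(np.all(P > 0))   # False: P has a zero entry
print(np.all(P2 > 0))  # True:  P^2 > 0, so the chain is regular
```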
However, in other sources the concept of a regular Markov chain does not appear at all, and "ergodic" replaces "regular" throughout the literature.
So if our Markov chain is aperiodic, irreducible, and positive recurrent (all the ones we use in Bayesian statistics usually are), then it is ergodic.
we shall sometimes refer to an irreducible, aperiodic Markov chain as ergodic
Reference http://www.people.fas.harvard.edu/~plam/teaching/methods/mcmc/mcmc_print.pdf
http://www.cs.berkeley.edu/~sinclair/cs294/n2.pdf
I prefer the first definition by far. I relate the question to ergodic theory, as seems appropriate, and assume that the chain has finitely many possible values, so as not to bother with positive recurrence.
Let us consider a finite state space $A$, and denote the set of all sequences of elements of $A$ by $X:=A^{\mathbb{N}}$. Let us define a transformation $\sigma$ on $X$ by $(\sigma x)_n = x_{n+1}$. For $x \in X$, we have $x_n = (\sigma^n x)_0$. In other words, by applying the transformation $\sigma$, I can read the successive values of a given sequence.
Now, let us take some probability measure $\mu$ on $A$ with full support (so as to see everything), and a stochastic matrix $P$ (the transition kernel). Using $\mu$ as the distribution of $X_0$ and the matrix $P$ to define transitions, we get a Markov chain $(X_n)_{n \geq 0} = x = ((\sigma^n x)_0)_{n \geq 0}$, which is a stochastic process with values in $A$. The distribution of $(X_n)_{n \geq 0}$ is a measure $\overline{\mu}$ on $A^{\mathbb{N}}$ which satisfies the usual conditions on cylinders, and whose first marginal is $\mu$.
The construction may look a bit confusing. However, if you forget about $\sigma$, it is what is done more or less informally when one defines Markov chains (that is the construction may be hidden, but it is there).
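As a minimal sketch of this construction (the state space, $\mu$ and $P$ below are my own example values), one can sample a finite path of the chain and check that $X_n$ is indeed the 0-th coordinate of the shifted sequence $\sigma^n x$:

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([0.5, 0.3, 0.2])     # initial distribution on A = {0, 1, 2}, full support
P = np.array([[0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2],
              [0.3, 0.3, 0.4]])    # stochastic transition matrix

def sample_path(mu, P, length, rng):
    """Sample a finite prefix of a path of the Markov chain (mu, P)."""
    x = [rng.choice(len(mu), p=mu)]
    for _ in range(length - 1):
        x.append(rng.choice(P.shape[0], p=P[x[-1]]))
    return x

x = sample_path(mu, P, 10, rng)
# The shift sigma drops the first coordinate, so sigma^3 x is x[3:], and
# X_3 = (sigma^3 x)_0 is its 0-th coordinate:
shifted = x[3:]
print(shifted[0] == x[3])  # True
```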
Hence, we can consider a Markov chain as a dynamical system $(X, \sigma)$ together with a probability measure $\overline{\mu}$. We can use the definitions of ergodic theory, and what we get in the end is that:
- the system $(X, \sigma, \overline{\mu})$ is measure-preserving if and only if $\mu$ is stationary for $P$;
- the system $(X, \sigma, \overline{\mu})$ is ergodic (in the sense of ergodic theory) if and only if the Markov chain is irreducible;
- the system $(X, \sigma, \overline{\mu})$ is mixing if and only if the Markov chain is irreducible and aperiodic.
So these are two very different conditions, and aperiodicity does not correspond to ergodicity. As a corollary, one can apply ergodic theorems to Markov chains with no need for aperiodicity.
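To illustrate the corollary with a small sketch (the chain is my own example): the two-state swap chain is irreducible but has period 2, so it is ergodic without being mixing, and time averages along a path still converge to the stationary distribution $(1/2, 1/2)$ even though $P^n$ does not converge.

```python
import numpy as np

# Deterministic swap between states 0 and 1: irreducible, period 2.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])
N = 10_000
rng = np.random.default_rng(1)
state, visits = 0, np.zeros(2)
for _ in range(N):
    visits[state] += 1
    state = rng.choice(2, p=P[state])
freq = visits / N
print(freq)  # [0.5 0.5]: time averages converge despite periodicity
```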
I also consider only Markov chains with a finite state space, as I do not know much about ergodic theory with infinite measures.
Irreducibility of the transition graph on the state space means that a sample path cannot get trapped in a smaller subset of the state space, since one can go from any state to any other state.
This implies (though one has to work it out) that the whole chain is ergodic: (almost) every path of the chain shows statistically stable behavior, which roughly means that the visit frequencies of each state converge. Dynamically speaking, there exists an invariant probability measure on the space of all paths.
The converse is obviously true: if there is an invariant measure supported on the whole state space, then one can go from every state to every state.
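As a small numerical sketch of this (the matrix below is my own example), one can compute the invariant probability measure of an irreducible chain as a left eigenvector of $P$ for the eigenvalue 1, and check that it has full support:

```python
import numpy as np

# My own example of an irreducible chain (the transition graph is strongly
# connected: 0 -> 1 -> 2 -> 0).
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.4, 0.0, 0.6]])

# Left eigenvector of P for eigenvalue 1, normalized to a probability vector.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
pi = pi / pi.sum()

print(pi)              # the invariant distribution
print(np.all(pi > 0))  # True: full support, as claimed
```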
Aperiodicity of the transition graph means, roughly speaking, that there are no periodic phenomena hidden in the dynamics: the return times of any state have greatest common divisor 1; more concretely, the $n$-step transition matrix has all positive entries for all sufficiently large $n$.
This matrix property is usually used to prove convergence of the probabilities at time $n$ to a single distribution, called the equilibrium, as $n$ tends to infinity (see the Perron-Frobenius theorem). The chain is then called mixing. Mixing is stronger than ergodicity: it implies that the states in the far future become asymptotically independent of the initial state.
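A quick numerical sketch of this convergence (the matrix is my own example): for an irreducible, aperiodic chain, the powers $P^n$ converge to a matrix whose rows are all equal to the equilibrium distribution.

```python
import numpy as np

# Irreducible and aperiodic (positive diagonal entries force period 1).
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
Pn = np.linalg.matrix_power(P, 50)
print(Pn)  # both rows are (approximately) the equilibrium (5/6, 1/6)
```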
I have personally never heard of "regular" transition graphs, but the matrix property referred to in your link is again the same property one uses to show convergence to equilibrium. I suspect, though I do not know, that it is a formal and rather useless definition, since the only regular but non-mixing chains I can think of can be decomposed, in a suitable sense, into mixing chains.
Hope this helps.
The first definition is preferable.
An ergodic Markov chain, also called a communicating Markov chain, is one whose states all form a single ergodic set; or, equivalently, a chain in which it is possible to go from every state to every other state. Thus, the transition matrix is irreducible.
There are two types of ergodic chains:
- Aperiodic ergodic chain (period = 1)
- Cyclic Markov Chain: ergodic chain with period larger than 1.
Regular chains are always ergodic, but ergodic chains are not always regular.
Here are simple ways to test:
Suppose $ \mathbf{P} $ is the stochastic transition matrix. A regular chain has a primitive matrix, which means there is a positive integer $k$ such that $$ \mathbf{P}^k>0 $$
An ergodic chain is always irreducible (its graph is strongly connected), so we can find a positive integer $k$ such that $$ \mathbf{(I+P)}^k>0 $$
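These two tests can be sketched in code (the helper names and the example matrix are my own, not from Kemeny and Snell):

```python
import numpy as np

def is_regular(P, max_power=100):
    """Regular iff P is primitive: some power of P is entrywise positive."""
    Q = np.eye(P.shape[0])
    for _ in range(max_power):
        Q = Q @ P
        if np.all(Q > 0):
            return True
    return False

def is_ergodic(P):
    """Ergodic iff irreducible: (I + P)^(n-1) > 0 for an n-state chain."""
    n = P.shape[0]
    return np.all(np.linalg.matrix_power(np.eye(n) + P, n - 1) > 0)

# Example: the two-state swap chain has period 2, so it is ergodic
# (irreducible) but not regular.
cycle = np.array([[0.0, 1.0],
                  [1.0, 0.0]])
print(is_ergodic(cycle), is_regular(cycle))  # True False
```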
Reference: Kemeny and Snell, Finite Markov Chains, Springer-Verlag, 1976.