Expected Value of Random Walk

Can someone very simply explain to me how to compute the expected distance from the origin for a random walk in $1D, 2D$, and $3D$? I've seen several sources online stating that the expected distance is just $\sqrt{N}$ where $N$ is the number of time steps, but others say that the expected distance is $\sqrt{\frac{2N}{\pi}}$. Which one is it and is it the same regardless of dimension?

Thanks


Solution 1:

The expected value of the square of the absolute distance from the origin is $N$ (you are adding together $N$ independent random variables with mean $0$ and absolute magnitude $1$), and this is true in any dimension.

So those sources which are telling you $$\sqrt N$$ are giving you this as in some sense the "root mean square" distance from the starting point. It is not the expected value of the distance.

For a one dimensional random walk the expected absolute distance from the origin after $N$ steps is not easy to state explicitly, but as $N$ increases it becomes close to $$\sqrt{\dfrac{2N}{\pi}}.$$ So the sources which give you that are in a sense talking about a limit.

This changes for higher dimensions: if there are $d$ dimensions then the expected absolute distance from the origin after $N$ steps becomes close to $$\sqrt{\dfrac{2N}{d}} \dfrac{\Gamma(\frac{d+1}{2})}{\Gamma(\frac{d}{2})}$$ where $\Gamma$ is the Gamma function. As the number of dimensions increases, this get close to but is still below $\sqrt N$.

This is related to the means of the chi- and chi-squared distributions.

Solution 2:

In an arbitrary dimension d:

Let $\vec{R}$ be the end-to-end distance vector of a random walk of fixed step length $|\vec{r}_i| = l$. $\vec{R}$ can then be expressed as $\displaystyle \vec{R} = \sum_{i=1}^N \vec{r}_i$, where $\vec{r}_i$ is the vector of the $i$-th step. The Root-Mean-Square End-to-End Distance is given by $\textrm{RMS}=\sqrt { \langle R^2 \rangle }$. Since the steps are mutually independent, the covariance of two steps $\vec{r}_i$ and $\vec{r}_j$ is zero if $i\neq j$ and $\textrm{Cov}(\vec{r}_i, \ \vec{r}_j)= \textrm{Var}(\vec{r}_i)$ if $i=j$. The variance of $ \vec{r}_i$ can be expressed as $ \textrm{Var}(\vec{r}_i)= \langle \vec{r}_i \cdot \vec{r}_i \rangle - \langle \vec{r}_i \rangle^2$. Due to symmetry $\langle \vec{r}_i \rangle=\vec{0}$ and therefore the variance of of $ \vec{r}_i$ is simply $ \textrm{Var}(\vec{r}_i)= \langle \vec{r}_i \cdot \vec{r}_i \rangle = |\vec{r}_i|^2 = l^2$. Altogether, the covariance of $\vec{r}_i$ and $\vec{r}_j$ equals $\textrm{Cov}(\vec{r}_i, \ \vec{r}_j)=\delta_{ij}l^2$. The covariance of $\vec{r}_i$ and $\vec{r}_j$ can also be expressed as $\textrm{Cov}(\vec{r}_i, \ \vec{r}_j) = \langle \vec{r}_i \cdot \vec{r}_j \rangle - \langle \vec{r}_i \rangle \cdot \langle \vec{r}_j \rangle$. Combining the two different expressions for the covariance and using that $\langle \vec{r}_i \rangle=0$, results in $\langle \vec{r}_i \cdot \vec{r}_j \rangle =\delta_{ij}l^2$. This result can be used to determine the RMS:

$$\textrm{RMS}=\sqrt { \langle R^2 \rangle } = \sqrt { \langle \vec{R} \cdot \vec{R} \rangle } =\sqrt { \big\langle \sum_{i=1}^N \vec{r}_i \cdot \sum_{j=1}^N \vec{r}_j \big\rangle } =\sqrt { \sum_{i=1}^N \sum_{j=1}^N \langle \vec{r}_i \cdot \vec{r}_j \rangle }= $$ $$=\sqrt { \sum_{i=1}^N \sum_{j=1}^N l^2 \delta_{ij} + 0^2}=\sqrt { \sum_{i=1}^N l^2}=\sqrt { N l^2}=l\sqrt { N }$$

Let $Z_i$ denote the $i$-th coordinate of the end-to-end distance vector $\vec{R}$ after $N$ steps, and let $X_i$ and $Y_i$ denote the number of steps taken in the $i$-th dimension in the positive and negative direction respectively. Then the set of random variables $\{X_i, Y_i\}_{i=1}^d$ follows a multinomial distribution with parameters $N$ and $\displaystyle p_i=\frac{N}{2d}$. For sufficiently large values of $N$, $\{X_i, Y_i\}_{i=1}^d$ are approximately iid (independent and identically distributed) Poisson random variables with parameters $\displaystyle \lambda_i = \frac{N}{2d}$. For $\lambda > 20$, i.e. $N>40d$, $\textrm{Po}(\lambda) \sim \textrm{N}(\lambda, \lambda)$. $ Z_i = l(X_i - Y_i)$ and therefore $\displaystyle Z_i \sim \textrm{N}(l(\lambda - \lambda), l^2(\lambda+\lambda))=\textrm{N}(0, 2l\lambda)=\textrm{N}\left(0, \frac{l^2N}{d}\right)$.

$\displaystyle \langle R \rangle = \langle \sqrt{R^2} \rangle = \left\langle \sqrt{ \sum_{i=1}^d Z_i^2} \right\rangle$. The square root of a sum of $k$ independent $\textrm{N}(0, 1)$-distributed random variables is distributed according to the chi distribution, $\chi_k$. Therefore $\displaystyle \sqrt{ \sum_{i=1}^d \frac{dZ_i^2}{l^2N}}$ is approximately $\chi_d$-distributed for large values of $N$. The expected value of a $\chi_k$-distributed random variable is $\displaystyle \sqrt{2} \frac{ \Gamma \left(\frac{k+1}{2}\right) }{\Gamma \left( \frac{k}{2}\right)}$.

Hence $\displaystyle \langle R \rangle =\left\langle\sqrt{ \sum_{i=1}^d Z_i^2}\right\rangle =\left\langle l \sqrt{\frac{N}{d}} \sqrt{ \sum_{i=1}^d \frac{dZ_i^2}{l^2N} }\right\rangle = l \sqrt{\frac{2N}{d} }\frac{ \Gamma \left(\frac{d+1}{2}\right) }{\Gamma \left( \frac{d}{2}\right)}$.



The derivation above assumes that the walk is on a hypercubic lattice.

I am not aware any theoretical derivation of the estimated end-to-end distance for freely jointed random walks in an arbitrary dimension.

BELOW SIMULATIONS ARE INCORRECT! (INCORRECT SAMPLING OF ANGLES)

From simulations, the estimated end-to-end distance for freely jointed random deviates from that of random walks on hypercubic lattices by a factor $\alpha\approx 1.02e^{0.00055\hat d}\hat d^{-0.0295}$, $\displaystyle\hat d = \max(d, 50)$ shown in the graph below.

e.i. $\langle R\rangle_\mathbb{R}$ is approximately given by $\displaystyle \langle R\rangle_\mathbb{R} \approx \alpha \langle R\rangle_\mathbb{Z} = \alpha\sqrt{\frac{2N}{d}}\frac{\Gamma\left(\frac{d+1}{2}\right)}{\Gamma\left(\frac{d}{2}\right)}$, where $\displaystyle\alpha \approx 1.02e^{0.00055\hat d}\hat d^{-0.0295}$, $\displaystyle\hat d = \max(d, 50)$.