Random walk on natural number

Problem:

You are standing at the position $0$ on the line of natural numbers $0, 1, 2, ..., n$. From this position you go to $1$ with probability $1$, but from any other position $i$ you go to $i+1$ with probability $p$ and respectively to $i-1$ with probability $1-p$. What is the expected number of steps to go from 0 to n?

Just to be clear. There is no way you can end up on the negative number. And once you reach $n$ the game stops, so you can not end up on any number bigger than $n$.


Here is my approach to tackle this problem:

Let $N_i$ is the expected number of steps to reach $n$ from a position $i$. It is obvious that $N_n = 0$. Also $N_0 = 1 + N_1$. And if we select any number $i$ from $1$ to $n-1$, then $N_i = (1 - p) \cdot N_{i-1} + p \cdot N_{i+1} = 1 + (1-p) \cdot N_{i-1} + p \cdot N_{i+1}$.

Now I am trying to simplify things having these three equations:

$$N_0 = 1 + N_1\\ N_i = 1 + (1-p) \cdot N_{i-1} + p \cdot N_{i+1}\\ N_n = 0$$

Approach 1

Putting $i = 1$ in equation (2) I found that $N_1 = 2/p - 1 + N_2$, which I later put in (1) to get $N_0 = 2/p + N_2$

Doing the same with $i = 2$ I got $N_2 = 1 - 2/p + 2/p^2 + N_3$ and $N_0 = 1 + 2/p^2 + N_3$.

Just one more step with $i = 3$ gives me $N_3 = N_4 + 4/p - 4/p^2 + 2/p^3 - 1$ and $N_0 = 4/p - 2/p^2 + 2/p^3 + N_4$

I was doing all this in hope of finding some relationship which would allow me to guess what will be the relationship for the last or one before last step. But I fail to see any and therefore I gave up.

Approach 2

I noticed that: $N_0 - N_1 = 1$ and $N_{i-1} - N_i = 1 + (1 - p) \cdot (N_{i-2} - N_i)$

Here if I will sum the left side, I will get $N_0 - N_n = N_0$ and this gave me hope that I can get some closed form expression if I will add the right side. But when I add them up I got $n + (1 - p)(N_0 + N_1 - N_{n-1} - N_{n})$. This gave me $N_0 = \frac{n + (N_1 - N_{n-1})(1-p)}{p}$ which leads me nowhere because of $N_1$ and $N_{n-1}$.

Approach 3 (similar to approach 2, but I expand the first element, not the second). I will not write it completely, not because I have not thought about, but because I value your time and I think that the post is already too long.

So $N_0 - N_1 = 1$ and $N_i - N_{i+1} = -1 + p (N_i - N_{i+2})$. After the same summation idea as in approach 2 I ended up with $N_1$, $N_2$ and $N_{n-1}$ elements.


After this I gave up completely. Here I am not really sure that my starting idea

Let $N_i$ is the expected number of steps to reach $n$ from a position $i$.

is correct or whether the close formula for expected value exist. So how should I approach this problem?

P.S. I know that for $p = 0.5$ (go left and right with equal probability) the close formula exists and expected number of steps one need to take from $0$ to $n$ is $n^2$.


Yes your recurrence is right, i.e., $$N_i = 1 + (1-p)N_{i-1} + pN_{i+1}$$ with $N_0 = 1+N_1$ and $N_n = 0$. This can be rearranged to give $$p(N_{i+1}-N_i) - (1-p)(N_i-N_{i-1}) = -1$$ with $N_1 - N_0 = -1$ and $N_n = 0$. Setting $v_i = N_{i+1} - N_{i}$, we obtain the equations to be $$pv_i - (1-p)v_{i-1} = -1$$ with $v_0 = -1$ and $N_n=0$. Setting $v_i = u_i + k$, we obtain $$pu_i - (1-p)u_{i-1} + k(2p-1) = -1$$ Hence, if set $k = -\dfrac1{2p-1}$, we obtain the recurrence $$u_i = \dfrac{1-p}p u_{i-1} \implies u_i = \left(\dfrac{1-p}p\right)^iu_0 = \left(\dfrac{1-p}p\right)^i \cdot \dfrac{2-2p}{2p-1}$$ Hence, $$v_i = \left(\dfrac{1-p}p\right)^i \cdot \dfrac{2-2p}{2p-1} - \dfrac1{2p-1}$$ Rewriting this in terms of $N_i$, we obtain $$N_{i+1} - N_{i} = \left(\dfrac{1-p}p\right)^i \cdot \dfrac{2-2p}{2p-1} - \dfrac1{2p-1} \text{ with }N_n = 0$$ Hence, summing from $i = {j}$ to $i=n-1$, we obtain $$N_n - N_{j} = \sum_{i=j}^{n-1}\left(\dfrac{1-p}p\right)^i \cdot \dfrac{2-2p}{2p-1} - \sum_{i=j}^{n-1}\dfrac1{2p-1}$$ Since $N_n = 0$, we obtain \begin{align} N_j & = \dfrac{n-j}{2p-1} + \dfrac{2(1-p)}{1-2p} \sum_{i=j}^{n-1} \left(\dfrac{1-p}p\right)^i = \dfrac{n-j}{2p-1} + \dfrac{2(1-p)}{1-2p} \left(\dfrac{1-p}p\right)^{j}\sum_{i=0}^{n-j-1} \left(\dfrac{1-p}p\right)^i\\ & = \dfrac{n-j}{2p-1} + \dfrac{2(1-p)}{1-2p} \left(\dfrac{1-p}p\right)^{j} \cdot \dfrac{\left(\dfrac{1-p}p\right)^{n-j}-1}{\dfrac{1-p}p-1}\\ & = \dfrac{n-j}{2p-1} + \dfrac{2p(1-p)}{(1-2p)^2} \left(\dfrac{1-p}p\right)^{j} \cdot \left(\left(\dfrac{1-p}p\right)^{n-j}-1 \right) \end{align}

Hence, $$N_0 = \dfrac{n}{2p-1} + \dfrac{2p(1-p)}{(1-2p)^2}\left(\left(\dfrac{1-p}p\right)^n-1\right)$$


Note: OPs approach is correct from my point of view. This is an alternate approach based upon the generating function of the transitition matrix of OPs automaton.

Let's assume the situation with nodes $\{0,\ldots,n\}$, $n\geq 1$ and walks according to OPs rules from $0$ to $n$. We consider the $(n+1)\times(n+1)$ transition matrix $M(n+1)$ describing the probability to go from node $i$ to node $j$ in one step. In the following we set for convenience only $q= 1-p$. We get \begin{align*} M(n+1)=(m_{i,j})_{1\leq i,j\leq n+1}= \begin{cases} m_{1,2}=1\\ m_{i,i+1}=p&\qquad i=2,\ldots,n\\ m_{i,i-1}=q&\qquad i=2,\ldots,n\\ m_{i,j}=0&\qquad otherwise\\ \end{cases} \end{align*} The matrix $M(n+1)$ is a tridiagonal matrix with zeroes in the main diagonal. We have e.g. for $n=4$ with nodes $\{0,1,2,3,4\}$ the transition matrix \begin{align*} M(5)= \begin{pmatrix} \color{blue}{0}&\color{blue}{1}&\color{grey}{0}&\color{grey}{0}&\color{grey}{0}\\ \color{blue}{q}&\color{blue}{0}&\color{blue}{p}&\color{grey}{0}&\color{grey}{0}\\ \color{grey}{0}&\color{blue}{q}&\color{blue}{0}&\color{blue}{p}&\color{grey}{0}\\ \color{grey}{0}&\color{grey}{0}&\color{blue}{q}&\color{blue}{0}&\color{blue}{p}\\ \color{grey}{0}&\color{grey}{0}&\color{grey}{0}&\color{blue}{0}&\color{blue}{0}\\ \end{pmatrix} \end{align*}

Note: It's well known, that the $k$-th power of the transition matrix contains the probabilites to go from node $i$ to node $j$ in $k$ steps. So, we are interested in the walks from $0$ to $n$ and the corresponding probability to go from $0$ to $n$ are coded in the element ${\left(M(n+1)\right)}^k_{(1,n+1)}$.

Therefore the expected number of steps to go from $0$ to $n$ is \begin{align*} \sum_{k\geq 1}k{\left(M(n+1)\right)}^k_{(1,n+1)}\tag{1} \end{align*}

We calculate (1) by introducing the generating function $F_{n+1}(z)$ with parameter $n+1$.

\begin{align*} F_{n+1}(z)=\sum_{k\geq 0}{\left(M(n+1)\right)}^k_{(1,n+1)}z^k\qquad\qquad n\geq 1\tag{2} \end{align*} and observe that the expectation value is given by \begin{align*} \left.{F}^{\prime}_{n+1}(z)\right|_{z=1}\tag{3} \end{align*}

Note: A great introductory text describing this approach is Generatingfunctionology by H.S. Wilf

According to Theorem 4.7.2 in R.P. Stanleys classic Enumerative Combinatorics, Vol.1 the generating function $F_{n+1}(z)$ in (2) is given by \begin{align*} F_{n+1}(z)=(-1)^{n}\frac{\det\left(I-zM(n+1):n+1,1\right)}{\det(I-zM(n+1))}\tag{4} \end{align*} whereby the notation $(I-zM(n+1):n+1,1)$ means, that the $(n+1)$-th row and the $1$-st column have to be skipped.

Intermezzo: Before we continue, we a look at two small examples. Calculating the first few numbers $N_0$ for $n=1,2,3$ based upon OPs recurrence relation give: \begin{align*} n=1&\qquad N_0=1\\ n=2&\qquad N_0=\frac{2}{p}\\ n=3&\qquad N_0=1+\frac{2}{p^2}\\ \end{align*} With $p=\frac{1}{2}$ the expectation value $N_0$ is $1,4$ and $9$. These values coincide nicely with OPs information, the expectation value being $n^2$ in the symmetric case $p=q=\frac{1}{2}$.

Example: $n=2$

Now, let's consider in (4) the case $n=2$. Then $$ I-zM(3)=\begin{pmatrix}\color{blue}{1}&\color{blue}{-z}&\color{grey}{0}\\\color{blue}{-qz}&\color{blue}{1}&\color{blue}{-pz}\\\color{grey}{0}&\color{blue}{0}&\color{blue}{1}\end{pmatrix}\qquad (I-zM(3):3,1)=\begin{pmatrix}\color{blue}{-z}&\color{grey}{0}\\\color{grey}{1}&\color{blue}{-pz}\end{pmatrix} $$ with $(I-zM(3):3,1)$ being the matrix $I-zM(3)$ with the $3$-rd row and $1$-st column skipped.

So, according to formula (4) we get \begin{align*} F_3(z)=\frac{\det\left(I-zM(3):3,1\right)}{\det(I-zM(3)}=\frac{pz^2}{1-qz^2}\\ \end{align*} and \begin{align*} \left.{F}^{\prime}_3(z)\right|_{z=1}=\left.\frac{2pz}{(1-qz^2)^2}\right|_{z=1}=\frac{2}{p} \end{align*}

Example: $n=3$

We proceed similarly with the case $n=3$. \begin{align*} I-zM(4)=\begin{pmatrix}\color{blue}{1}&\color{blue}{-z}&\color{grey}{0}&\color{grey}{0}\\\color{blue}{-qz}&\color{blue}{1}&\color{blue}{-pz}&\color{grey}{0}\\\color{grey}{0}&\color{blue}{-qz}&\color{blue}{1}&\color{blue}{-pz}\\\color{grey}{0}&\color{grey}{0}&\color{blue}{0}&\color{blue}{1}\end{pmatrix}\ \ (I-zM(4):4,1)=\begin{pmatrix}\color{blue}{-z}&\color{grey}{0}&\color{grey}{0}&\\\color{grey}{1}&\color{blue}{-pz}&\color{grey}{0}\\\color{grey}{-qz}&\color{grey}{1}&\color{blue}{-pz}\end{pmatrix} \end{align*}

Again, according to formula (4) we get \begin{align*} F_4(z)=-\frac{\det\left(I-zM(4):4,1\right)}{\det(I-zM(4)}=\frac{p^2z^3}{1-(1-p^2)z^2}\\ \end{align*} and \begin{align*} \left.{F}^{\prime}_4(z)\right|_{z=1}=\left.\frac{3p^2z^2-p^2(1-p^2)z^4}{(1-(1-p^2)z^2)^2}\right|_{z=1}=1+\frac{2}{p^2} \end{align*}

Observe, that the results in these examples coincide with OPs $N_0$ above.


[Update 2020-11-24]: The update below is thanks to the helpful comments from @sela who pointed at some mistakes.

Simple expression for $F_{n+1}(z)$:

The challenge now is to find a simple expression for the determinants. In order to do so, we determine a recurrence relation for $\det(I-zM(n+1))$ and solve it. But first the simple one: The determinant of $(I-zM(n+1):n+1,1)$ is obviously

\begin{align*} \det(I-zM(n+1):n+1,1)=(-1)^{n}p^{n-1}z^{n}\tag{5} \end{align*}

since, when looking at the example above for $n=3$ and $n=4$ we see, that $(I-zM(n+1):n+1,1)$ has a lower triangle shape and so the determinant is the product of the elements of the main diagonal.

Here we take a look at $M(5)$. After expanding $M(5)$ in a first step by the last row, we expand it in the next step by its minors along the first column. We obtain \begin{align*} &\det(I-zM(5))=\det \begin{pmatrix} \color{blue}{1}&\color{blue}{-z}&0&0&0\\\color{blue}{-qz}&\color{blue}{1}&\color{blue}{-pz}&0&0\\0&\color{blue}{-qz}&\qquad=\color{blue}{1}&\color{blue}{-pz}&0\\0&0&\color{blue}{-qz}&\color{blue}{1}&\color{blue}{-pz}\\ 0&0&0&0&1 \end{pmatrix}\\ &\qquad=\det \begin{pmatrix} \color{blue}{1}&\color{blue}{-z}&0&0\\\color{blue}{-qz}&\color{blue}{1}&\color{blue}{-pz}&0\\0&\color{blue}{-qz}&\color{blue}{1}&\color{blue}{-pz}\\0&0&\color{blue}{-qz}&\color{blue}{1} \end{pmatrix}\\ &\qquad=\det \begin{pmatrix} \color{blue}{1}&\color{blue}{-pz}&0\\\color{blue}{-qz}&\color{blue}{1}&\color{blue}{-pz}\\0&\color{blue}{-qz}&\color{blue}{1} \end{pmatrix} +qz\det \begin{pmatrix} \color{blue}{-z}&\color{blue}{0}&0\\\color{blue}{-qz}&\color{blue}{1}&\color{blue}{-pz}\\0&\color{blue}{-qz}&\color{blue}{1} \end{pmatrix}\\ &\qquad=\det \begin{pmatrix} \color{blue}{1}&\color{blue}{-pz}&0\\\color{blue}{-qz}&\color{blue}{1}&\color{blue}{-pz}\\0&\color{blue}{-qz}&\color{blue}{1} \end{pmatrix} -qz^2\det \begin{pmatrix} \color{blue}{1}&\color{blue}{-pz}\\\color{blue}{-qz}&\color{blue}{1} \end{pmatrix}\tag{6}\\ &\qquad=\det T_3(z)-qz^2\det T_2(z)\\ \end{align*}

In the last line of the equation we introduce according to the nicely structured matrices in (6) the matrix $T_n(z)$. Observe that $T_n(z)$ is a tridiagonal matrix with $1$'s in the main diagonal and entries $-pz$ and $-qz$ above and below the diagonal.


Recurrence relation for $\det(I-zM(n+1))$ and $\det T_n(z)$:

The general formula follows the same scheme, so that following holds: \begin{align*} \det(I-zM(n+1))=\det T_{n-1}(z)-qz^2\det T_{n-2}(z)\qquad n\geq 2\tag{7} \end{align*}

Expanding the determinant $\det T_n(z)$ by its minors leads to the recurrence relation:

\begin{align*} \det T_n(z)=\det T_{n-1}(z)-pqz^2\det T_{n-2}(z)\qquad n\geq3\tag{8} \end{align*} with initial conditions \begin{align*} \det T_1(z)&=\det \begin{pmatrix}1\end{pmatrix}=1\\ \det T_2(z)&=\det \begin{pmatrix}1&-pz\\-qz&1\end{pmatrix}=1-pqz^2\\ \end{align*}

We can solve the recurrence relation for $T_n(z)$ by applying standard techniques which can be found in H.S. Wilf's Generatingfunctionology. We obtain

\begin{align*} \det T_n(z)=\frac{\left(1+\sqrt{1-4pqz^2}\right)^{n+1}-\left(1-\sqrt{1-4pqz^2}\right)^{n+1}}{2^{n+1}\sqrt{1-4pqz^2}}\qquad n\geq 1\tag{9} \end{align*}


Conclusion: Putting all together we derive from (4) the following representation of the generating function $F_{n+1}(z)$:

\begin{align*} \color{blue}{F_{n+1}(z)}&=(-1)^{n}\frac{\det\left(I-zM(n+1):n+1,1\right)}{\det(I-zM(n+1))}\\ &=\frac{p^{n-1}z^n}{\det T_{n-1}(z)-qz^2\det T_{n-2}(z)}\tag{$\longrightarrow\quad (5),(7)$}\\ &=\frac{(pz)^n}{p\det T_{n-1}(z)-pqz^2\det T_{n-2}(z)}\\ &\,\,\color{blue}{=\frac{(pz)^n}{\det T_{n}(z)-q\det T_{n-1}(z)}}\tag{$\longrightarrow\quad (8)$} \end{align*}

We finally obtain number of expected steps from $0$ to $n$ by evaluating the first derivation at $z=1$ \begin{align*} \left.{F}^{\prime}_{n+1}(z)\right|_{z=1} \end{align*} where we can use the representation (9) of the determinants $\det T_n(z)$.


You already have the solution to the problem in the difference equation $$N_i=1+(1-p)\cdot N_{i-1}+p\cdot N_{i+1}$$ with boundary conditions $$N_n=0 \text{ and } N_0=1+N_1,$$ and all that remains is the technical issue of finding the solutions. You may want to look up solutions to difference equations to learn more about it, but here's a quick go at this particular problem.

First, let's rewrite it $$D[N]_i=N_i-(1-p)\cdot N_{i-1}-p\cdot N_{i+1}=1$$ where $D$ denotes the difference operation, at first without worrying about the boundary conditions at $i=0$ and $i=n$. Note that if you have any solution to this, you can add $\Delta N_i$ to it where $$D[\Delta N]_i=\Delta N_i-(1-p)\cdot \Delta N_{i-1}-p\cdot \Delta N_{i+1}=0$$ so the strategy is to find any solutions to the former and all solutions to the latter.

If we try $N_i=1$, this makes $D[N]_i=0$, so this would be a solution for $\Delta N_i$. If we try $N_i=i$, we get $D[N]_i=1-2p$; to get $D[N]_i=1$, we may enter $N_i=i/(1-2p)$ which works for $p\not=1/2$. If we also try out $N_i=i^2$, we find that for $p=1/2$ we may use $N_i=-i^2$ to get $D[N]_i=1$. (All this assumes I didn't make any mistakes in my calculations, but the idea should be the same.)

So now we have one solution to the difference equation. To get all, we need to solve $D[\Delta N]_i=0$. This type of difference equation often has solutions on the form $\Delta N_i=q^i$. We already know the one with $q=1$, and solving gives another with $q=1/p-1$.

Combining the initial solution $N_i=i/(1-2p)$ with the two solutions for $\Delta N_i$ gives the general solution for $p\not=1/2$: $$N_i=\frac{i}{1-2p}+a+b\cdot\left(\frac{1}{p}-1\right)^i.$$ Now, plug in $N_n=0$ and $N_0=1+N_1$ and solve for $a,b$ to get the appropriate boundary conditions.