What is the probability that the sign does not change (or changes exactly once) in a random walk with 1000 steps?

My first approach would be a simulation with 10 000 samples and count the change of signs, but there must be a more rigorous approach. With google I found a nice analysis about the expected distance from the origin: https://mathworld.wolfram.com/RandomWalk1-Dimensional.html but it does not answer my question. It only says that zero sign changes is the most probable.

Any advice is appreciated.


Solution 1:

Since there is a fixed number of steps, you can model this with a finite state space. Since you only care about the number of sign changes, WLOG you can take the first step to be from $0$ to $+1$. So the possible states are from $-998$ to $1000$. However, to track "multiple sign changes", we can modify the state space as follows:

We will add a "second $0$" and "second $+1$", which we'll call "$0b$" and "$1b$". We'll refer to the starting point as "$0a$" and the first $+1$ as "$1a$".

In general, from node $i$ there will be probability 0.5 to transition to node $i-1$ and probability 0.5 to transition to node $i+1$.

However, from node $0a$, the chain can transition to $-1$ or $1a$, but from node $-1$ the chain can transition to $-2$ or $0b$, and from $0b$ the chain can transition to $-1$ or $1b$.

Hitting $1b$ is equivalent to "multiple sign changes", so we'll make that an absorbing state, and compute the probability that, starting at $1a$, we are in state $1b$ after 999 steps.

To compute this analytically, we just need to form that transition matrix $P$, and compute $x P^{999}$, where $x$ is the row vector with $1$ in state $1a$ and zero elsewhere.

I computed this in Python and got an analytic solution of $\fbox{0.8993}$, to 4 decimal places.

I also ran a simulation with 10,000 trials, and the proportion with multiple sign changes was 0.8991

Here's the code:

import numpy as np

# Define transition matrix
P = np.zeros((2001,2001))
P[0,0] = 1   # -998 to -998
for i in range(1,1998):
    P[i,i-1] = 0.5
    P[i,i+1] = 0.5

P[1998,1998] = 1   # 1000 to 1000
P[997,998] = 0     # -1 to 0a
P[997,1999] = 0.5  # -1 to 0b
P[1999, 997] = 0.5 # 0b to -1
P[1999,2000] = 0.5 # 0b to 1b
P[2000,2000] = 1   # 1b to 1b

row_sums = P.sum(axis=1)
if np.min(row_sums)==1 and np.max(row_sums)==1:
    print("All rows sum to one.")
else:
    print("Error in row sums!")

# Initial value
x = np.zeros((1,2001))
x[0,999] = 1   # start at 1a

# Take 999 steps
print("Starting iteration...")
for i in range(999):
    x = x @ P

# Analytic solution
soln = x[0,2000]
sd = np.sqrt(soln*(1-soln))/100
left = soln - 2*sd
right = soln + 2*sd
print(f"Probability of multiple sign changes is {soln:.4f}")
print(f"95% CI of proportion in 10,000 trials = ({left:.4f}, {right:.4f})")

# Simulation
trials = 10000
changes = np.zeros(trials)
steps = np.random.choice([-1,1], size=(trials,999))

print("Starting simulation...")
for i in range(trials):
    loc = 1 # start at +1
    sign = 1 # sign is positive

    # Take 999 steps
    for j in range(999):
        loc += steps[i,j]

        if loc == -1*sign:
            sign *= -1
            changes[i] += 1

estimate = np.mean(changes > 1)

print(f"Simulation estimate is {estimate:.4f}")

# Histogram of number of sign changes
max_changes = np.max(changes)

fig = plt.figure()
plt.hist(changes, bins=int(max_changes+1))
fig.suptitle('Histogram of number of sign changes', fontsize=20)
plt.xlabel('Number of Sign Changes', fontsize=18)
plt.ylabel('Number of Trials', fontsize=16)
fig.savefig('sign_changes_hist.png')

And the output:

All rows sum to one.
Starting iteration...
Probability of multiple sign changes is 0.8993
95% CI of proportion in 10,000 trials = (0.8933, 0.9053)
Starting simulation...
Simulation estimate is 0.8991

I also made a histogram of the number of sign changes per trial, in the 10,000 trials. You can see that although zero sign changes is the most likely number of sign changes, followed closely by one sign change, together those two possibilities happen just slightly more than 10% of the time. Histogram of number of sign changes