Why do people simulate with Brownian motion instead of "Intuitive Brownian Motion"?

I have just recently begun studying Brownian motion and stochastic calculus at the level of an undergraduate or beginning graduate student of applied mathematics. (Textbooks I've looked at are by Mikosch, Gardiner, Kloeden and Platen.)

Back when I first started thinking about stochastic dynamics, I was a cognitive science student simulating neural decision-making circuits. We had a differential equation to model our system, and my advisor suggested that I try "adding in some noise."

Intuitively, I would have written a discretized equation to model the evolution of some state variable $X$ like this: $$\Delta X = [ a(t,X)+ R ] \Delta t $$ where $a$ is a coefficient that depends on the state variable $X$ and time $t$, and where $R$ denotes a random variable (probably taken to be normal and centered at zero with variance $\sigma^2$).

In other words, at some level, I would have expected an "intuitive stochastic differential equation" to be expressed in the general form $$ \Delta X = [a(t,X,R)] \Delta t $$ where $a$ is now any function of time, the state, and a random variable $R$.

However, crucially for the theory of stochastic differential equations and the models I see (at least the ones that come up in modeling neural decisions, e.g. Ornstein-Uhlenbeck), the discretized equation has the stochastic part scaled with $\sqrt{\Delta t}$, rather than just $\Delta t$. For instance, Brownian motion is discretized as $$\Delta X = R \sqrt{ \Delta t}$$ where $R$ denotes a standard normal random variable.

Finally my question: Why is it so desirable, for modeling stochastic dynamics generally speaking, to have the stochastic part of a stochastic difference equation scale with the square root of the change in time rather than simply linearly with the change in time?


Solution 1:

The fundamental reason is that variances of independent random variables are additive, while standard deviations are not. If you didn't make the variance proportional to the time step, your discretizations would not behave consistently when you changed the time step.

To be explicit, let's take $X_{i+1}=X_i+R_{i+1}\sqrt{\Delta t}$, where the $R_i$ are independent and standard normal. After two steps, we have $$X_2=X_0+(R_1+R_2)\sqrt{\Delta t}.$$ But $R_1+R_2$ is normal with variance $2$, so we can write it as $\sqrt2 R_{12}$, where $R_{12}$ is again standard normal. Then we get $$X_2=X_0+\sqrt2 R_{12}\sqrt{\Delta t}=X_0+R_{12}\sqrt{2\Delta t},$$ so it's exactly the same as if we took a single time step of twice the length. This doesn't happen if you take the stochastic component to be $R_i\Delta t$ instead (try it).
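For comparison, here is a sketch of the same two-step calculation with the linear scaling, reusing the notation $R_{12}$ from above and writing $R$ for another standard normal. Two steps give $$X_2=X_0+(R_1+R_2)\Delta t=X_0+\sqrt2\,R_{12}\,\Delta t,$$ whose stochastic part has variance $2(\Delta t)^2$. A single step of length $2\Delta t$ would instead give $$X_2=X_0+R\cdot 2\Delta t,$$ whose stochastic part has variance $4(\Delta t)^2$. The two results disagree, so the distribution at a given time depends on how the interval was subdivided.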

What's really going on is that you're adding a constant variance, say $\sigma^2$, to the stochastic part at each time step. So after $n$ time steps you will have accumulated a variance of $n\sigma^2$. If you were to take a single time step of length $n\Delta t$ instead, you'd want your stochastic part to have the same variance $n\sigma^2$, and that corresponds to scaling by $\sqrt n$, not by $n$.

In other words: $\text{variance}\propto\text{time step}$; $\text{scale}\propto\sqrt{\text{variance}}$; so $\text{scale}\propto\sqrt{\text{time step}}$.
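The consistency argument can also be checked numerically. The following is a minimal sketch, assuming NumPy is available. The function name `terminal_variance` and its parameters are made up for this illustration and do not come from any library. It estimates the variance of $X_T - X_0$ for pure noise (no drift) when a fixed interval of length $T$ is split into different numbers of steps.

```python
import numpy as np

def terminal_variance(n_steps, total_time=1.0, scaling="sqrt",
                      n_paths=20_000, seed=0):
    """Estimate Var(X_T - X_0) for a driftless random walk.

    scaling="sqrt"   -> each increment is R * sqrt(dt)
    scaling="linear" -> each increment is R * dt
    (hypothetical helper, written only for this illustration)
    """
    rng = np.random.default_rng(seed)
    dt = total_time / n_steps
    scale = np.sqrt(dt) if scaling == "sqrt" else dt
    # One row per simulated path, one column per time step.
    increments = scale * rng.standard_normal((n_paths, n_steps))
    # Summing a row gives X_T - X_0 for that path.
    return increments.sum(axis=1).var()

for n in (1, 10, 100):
    print(n,
          terminal_variance(n, scaling="sqrt"),
          terminal_variance(n, scaling="linear"))
```

With the $\sqrt{\Delta t}$ scaling, the estimated variance should stay close to $T$ for every step count. With the $\Delta t$ scaling, the exact variance is $n(\Delta t)^2 = T^2/n$, so it shrinks as the grid is refined and the noise disappears in the limit of small time steps.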