$e$ popping up in a topic I'm unfamiliar with

I programmed up a little algorithm that goes like this:

  1. Fix two positive real numbers; call them $\alpha$ and $\beta$.
  2. Generate a new random real number $x \in [0,1]$.
  3. Set $\alpha = x\alpha\beta$.
  4. Repeat 2 and 3 until $\alpha$ is equal to the smallest float supported (I'll just call this $0$), or $\alpha$ is equal to the largest float supported (I'll call this $\infty$).
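
For concreteness, here is a minimal Python sketch of this loop (the exact endpoints depend on the float type; $10^{-38}$ and $10^{38}$ are used as stand-ins for the smallest and largest supported floats, and the names are just illustrative):

```python
import math
import random
from collections import Counter

# Stand-ins for the smallest/largest supported float (roughly the single-precision limits).
LO, HI = 1e-38, 1e38

def run_once(alpha, beta):
    """Repeat alpha <- x * alpha * beta until alpha drops below LO ("0")
    or rises above HI ("infinity"); report which end was reached."""
    while LO < alpha < HI:
        alpha *= random.random() * beta
    return "0" if alpha <= LO else "inf"

# Count how often each endpoint is reached for a few values of beta.
for beta in (2.0, math.e, 3.0):
    counts = Counter(run_once(1.0, beta) for _ in range(2000))
    print(f"beta = {beta:.3f}: {dict(counts)}")
```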

Initially I thought this would turn up nonsense, or only occasionally go to 0 or infinity.

But after experimenting a lot with values of $\beta$, I found a fascinating result:

If $\beta \in (0,e)$, $\alpha$ always goes to $0$.

If $\beta=e$, $\alpha$ goes to $0$ about 50% of the time, and to $\infty$ about 50% of the time.

If $\beta \in (e,\infty)$, $\alpha$ always goes to $\infty$.

Now these are all results based on running the above algorithm about 10,000 times, so maybe I'm wrong to say 'always', but these results certainly hold most of the time.

This was surprising to me! I'm not well versed in randomness, computational mathematics, or probability, so I can't really fathom why $e$ would pop up so prominently in this context. Can anyone point me towards the subject I need to examine to understand this result?

(Apologies if this post is mistagged; as I said, I'm not experienced with the subjects this post relates to.)


For a fixed $\beta$, your algorithm can be stated as follows:

  • Pick $\alpha_0$ arbitrarily.
  • For each $n \ge 1$, generate a uniformly random real number $X_n$ in $[0, 1]$ and set $\alpha_n = X_n\alpha_{n-1}\beta$.

Thus each $\alpha_n$ is obtained by multiplying the previous $\alpha_{n-1}$ by $\beta X_n$, so $$\alpha_n = \alpha_0\beta^n X_1 X_2 \cdots X_n \tag 1$$

Things become more tractable/familiar if we take logarithms, to convert products into sums: each $\log \alpha_n$ is obtained by taking the previous $\log \alpha_{n-1}$ and adding $\log\beta + \log X_n$, so we get the logarithmic version of $(1)$: $$\log \alpha_n = \log \alpha_0 + n\log\beta + \sum_{i=1}^n \log X_i \tag 2$$

Thus the sequence $\log \alpha_n$ is a random walk starting at $\log \alpha_0$ and changing by $Y_i = \log \beta + \log X_i$ at each step. For such a random walk $\sum_{i=1}^n Y_i$, its eventual behaviour is determined by whether $E[Y_i]$ is positive, negative, or zero.

We can in fact determine the distribution of $\log X_i$ (and therefore of $Y_i$). Each $X_i$ is uniformly distributed in $[0, 1]$; it turns out this means that $-\ln X_i$ is exponentially distributed with mean $1$. Let's derive that. To find the distribution of $\log_b X_i$ (taking logarithms to an arbitrary base $b$, to be clear on where $e$ pops up), we can calculate the cumulative distribution function: for $t \le 0$ (assuming wlog that $b > 1$), we have $\Pr(\log_b X_i \le t) = \Pr(X_i \le b^t) = b^t$. Differentiating this with respect to $t$ gives the probability density function, which is $b^t\ln b$ (and $0$ for $t > 0$). In particular, $$ \operatorname{E}[\log_b X_i] = \int_{-\infty}^{0} t \, b^t\ln b \, dt = - 1/\ln b = -\log_b e. $$

This is where $e$ pops up: although to convert products into sums we could have taken logarithms to any base $b$, the fact is that $E[\log_b X_i] = -\log_b e$, so $e$ pops up unavoidably (and we might as well take the natural logarithm).
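
As a quick numerical sanity check of this (a small sketch, assuming numpy is available), the sample mean of $\log_b X_i$ for uniform samples should come out close to $-\log_b e$:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=1_000_000)

# -ln(X) should be exponential with mean 1, so E[ln X] = -1 = -ln(e);
# in base 2, E[log2 X] = -1/ln(2) = -log2(e), about -1.4427.
print(np.mean(np.log(x)))                    # close to -1.0
print(np.mean(np.log2(x)), -np.log2(np.e))   # both close to -1.4427
```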

So $E[Y_i] = \log\beta + E[\log X_i] = \log \beta - \log e$ (whatever base we choose), which is positive, negative or zero depending on whether $\beta > e$, $\beta < e$ or $\beta = e$. It is therefore intuitively clear that when $\beta > e$ the random walk $\log\alpha_n$ drifts towards $\infty$ (and $\alpha_n$ towards $\infty$ as well), and when $\beta < e$ it drifts towards $-\infty$ (and $\alpha_n$ towards $0$). Proofs of precise statements are below.


Let $S_n = \sum_{i=1}^n \log X_i$. From the strong law of large numbers applied to the independent and identically distributed random variables $\log X_i$ (which, as computed above, have mean $-1$), we have ("a.s." stands for "almost surely") $$\frac{S_n}{n} \xrightarrow{a.s.} - 1.$$

This means that, with probability $1$, we have $$\frac{\log \alpha_n}{n} = \frac{\log \alpha_0}{n} + \log \beta + \frac{S_n}{n} \xrightarrow{a.s.} \log \beta - 1$$
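
Numerically, this limit can be checked by simulating the walk directly in log space (a sketch assuming numpy; working with $\log \alpha_n$ avoids the under/overflow that stops the original loop):

```python
import numpy as np

rng = np.random.default_rng(1)

def drift(beta, n=500_000, log_alpha0=0.0):
    # log(alpha_n) = log(alpha_0) + n*log(beta) + S_n, computed entirely in log space.
    s_n = np.sum(np.log(rng.random(n)))
    return (log_alpha0 + n * np.log(beta) + s_n) / n

for beta in (2.0, np.e, 3.0):
    print(f"beta = {beta:.3f}: log(alpha_n)/n = {drift(beta):+.4f}, "
          f"log(beta) - 1 = {np.log(beta) - 1:+.4f}")
```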

In your problem, we iterate until either $\alpha_n$ hits an upper threshold or hits a lower threshold, and we record which threshold it hit. (Typical example thresholds induced by programming languages are on the order of $10^{38}$ and $10^{-38}$ for single-precision floating-point format, and $10^{308}$ and $10^{-308}$ for double-precision floating-point format.) In particular,

  • when $\beta < e$, we have $\lim\limits_{n\to\infty}\dfrac{\log \alpha_n}{n} = \log\beta - 1 < 0$ almost surely, which means that $\alpha_n \to 0$ almost surely, and thus it is more likely to hit the lower threshold,

  • when $\beta > e$, we have $\lim\limits_{n\to\infty}\dfrac{\log \alpha_n}{n} = c > 0$ almost surely (where $c = \log\beta - 1$), which means that $\alpha_n \to \infty$ almost surely ($\alpha_n$ grows roughly like $e^{nc}$), and thus it is more likely to hit the upper threshold.

For the case when $\beta = e$, see below.
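
For reference, the double-precision thresholds mentioned above, together with the corresponding log-space barriers that the random walk has to reach (for $\alpha_0 = 1$), can be read off in Python, say via sys.float_info:

```python
import math, sys

# Double-precision thresholds and their natural logs, i.e. the log-space
# barriers the walk log(alpha_n) must cross when alpha_0 = 1.
print(sys.float_info.max, math.log(sys.float_info.max))  # ~1.80e308,  log ~ +709.8
print(sys.float_info.min, math.log(sys.float_info.min))  # ~2.23e-308, log ~ -708.4
print(5e-324, math.log(5e-324))                          # smallest subnormal, log ~ -744.4
```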


Note however that as our thresholds are not exactly $0$ and $\infty$ (neither of which is achievable anyway), we can only say "more likely" above: the fact that almost surely $\alpha_n \to 0$ doesn't necessarily mean that the sequence $\alpha_n$ hits $10^{-308}$ before $10^{308}$; it is possible (just very unlikely) that it reaches $10^{308}$ before $10^{-308}$ but eventually approaches $0$.

I don't know how to fix this in general, but at least for the $\beta = e$ case, the theory of random walks (see section 9.4) offers some help. Let $Y_i = \log\beta + \log X_i$, and let $T_n = \sum_{i=1}^{n} Y_i$ (using $T_n$ to avoid confusion with the earlier $S_n$), so that $T_n$ is a random walk and $\log\alpha_n = \log\alpha_0 + T_n$. Then if the lower threshold is $L$ and the upper threshold is $U$, the stopping events $\alpha_n \le L$ and $\alpha_n \ge U$ are equivalent to $T_n \le \log L - \log \alpha_0 = L'$ and $T_n \ge \log U - \log\alpha_0 = U'$.

For the special case when $\operatorname{E}[Y_i] = 0$ (this means $\beta = e$), we have, denoting the stopping time by $J$, $\operatorname{E}[T_J] = \operatorname{E}[J]\operatorname{E}[Y_i] = 0$ (Wald's equality). Assuming that $T_J$ is roughly either $L'$ or $U'$ (i.e., $\alpha_J$ doesn't overshoot the thresholds by too much), this gives $0 = \operatorname{E}[T_J] \approx p_{L'}L' + (1-p_{L'})U'$, where $p_{L'}$ is the probability of stopping at the lower threshold, so $p_{L'} \approx \frac{U'}{U'-L'} = \frac{\log U - \log\alpha_0}{\log U - \log L}$. For the thresholds of the single-precision floating-point format, $U \approx 3.4 \times 10^{38}$ and $L = 2^{-149}$, this is roughly $\frac{88.7 - \log \alpha_0}{192.0}$, which at least for values of $\log\alpha_0$ small in magnitude gives roughly $0.46$ rather than $0.5$ exactly (though it's much closer to $0.50$ if you take your $L$ to be the smallest normalized number $2^{-126}$ instead).
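
Plugging numbers into this formula (a small Python check; the function name is just illustrative, with $2^{-149}$ the smallest subnormal single-precision value and $2^{-126}$ the smallest normalized one):

```python
import math

def p_lower(U, L, alpha0=1.0):
    # p_{L'} = U' / (U' - L'), with U' = log(U) - log(alpha0), L' = log(L) - log(alpha0).
    up = math.log(U) - math.log(alpha0)
    lo = math.log(L) - math.log(alpha0)
    return up / (up - lo)

print(p_lower(3.4e38, 2.0**-149))   # ~0.462 (lower threshold = smallest subnormal)
print(p_lower(3.4e38, 2.0**-126))   # ~0.504 (lower threshold = smallest normalized)
```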


You have $\alpha_{i+1}=\alpha_{i} X_{i+1}$, where $X_i\sim U(0,\beta)$; so $\alpha_{n}=\alpha_{0}\prod_{i=1}^{n}X_{i}$. The infinite product (and hence $\alpha_n$) diverges to $+\infty$ if its logarithm, $$ \sum_{i=1}^{n}\log X_i, $$ does, and by the law of large numbers this occurs almost surely if $E[\log X_i]>0$. Now, $$ E[\log X_i]=\frac{1}{\beta}\int_{0}^{\beta}\log x \, dx=\log(\beta)-1=\log(\beta/e), $$ which is positive when $\beta>e$. Similarly, the product diverges to $0$ if its logarithm diverges to $-\infty$, which occurs almost surely if $\beta < e$. Finally, if $\beta=e$ exactly, you have an unbiased one-dimensional random walk, which implies that $\alpha_n$ neither converges nor diverges: with probability $1$, it crosses $1$ (that is, goes from larger than $1$ to smaller, or vice versa) infinitely many times.
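
A quick Monte Carlo check of that expectation (a sketch assuming numpy; for $\beta = e$ the empirical mean of $\log X_i$ should hover around $0$):

```python
import numpy as np

rng = np.random.default_rng(2)
for beta in (2.0, np.e, 3.0):
    x = rng.uniform(0.0, beta, size=1_000_000)
    # Sample mean of log(X) vs. the exact value log(beta) - 1.
    print(beta, np.mean(np.log(x)), np.log(beta) - 1.0)
```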