Successive records in mathematical sequences: surprising result
I was looking at sequences $x_n=\{n\alpha\}$ with $n=1,2,\cdots$ and $\alpha\in [0, 1]$ an irrational number. Such sequences are known to be equidistributed, so they get arbitrarily close to any number $x\in[0, 1]$, time and over. The brackets represent the fractional part function.
There is a subsequence $t_n$ of indices such that $x_{t_n}$ is the $n$-th record: it corresponds to the largest value observed sequentially, that is, $t_n$ is the smallest index such that $x_{t_n} > x_{t_{n-1}}$. Here $x_{t_{n-1}}$ is the previous record. In short, $t_n$ is the arrival time (index) of the $n$-th record. The sequence $t_n$ is strictly increasing and never ends. For $\alpha = \sqrt{2}/2$ (our focus in this question) we have:
I did not expect to see such a pattern. What's more,
$$t_{2n+1} = \Big\lfloor \frac{\sqrt{3+2\sqrt{2}}}{2}\Big(3+2\sqrt{2}\Big)^{n}\Big\rfloor,$$
$$t_{2n} = \Big\lfloor \frac{\sqrt{2}}{2}\Big(3+2\sqrt{2}\Big)^{n}\Big\rfloor.$$
Let us denote as $u_{2n+1}$ and $u_{2n}$ the expression inside the brackets $\lfloor \rfloor$ in the above twin formulas for $t_{2n+1}$ and $t_{2n}$. We have: $u_n - t_n \rightarrow 0$ (whether $n$ is even or odd) and the convergence is very fast.
Anyone interested in proving this result? It is based solely on empirical evidence. A similar result seems achievable for $\sqrt{3}/2$. Now for non-algebraic number, the kind of periodicity observed here is lost, but we get very intriguing results, with more mysterious patterns. Below is the data for $\alpha = \log_2 3$:
The nice pattern seems to end abruptly beyond the data being displayed here, and indeed the next maximum take places at index 190,537 and 10,781,274. I could not find the next one, maybe due to machine precision or due to a massive gap.
I am very interested in some asymptotic value for $t_n$ especially when $\alpha$ is not an algebraic number. I am surprised that the arrival times $t_n$ of the successive records behave so much differently than if the underlying sequence $\{n \alpha\}$ was truly random (I studied the random case here.)
Update
If $\alpha = \pi$, we get the following: $(t_5,t_6,t_7,t_8,t_9)=$ $(7,113,$ $33215,99532,364913)$. Define $\Delta t_n = t_n-t_{n-1}$. Then $(\Delta t_6, \Delta t_7,\Delta t_8,\Delta t_9)=(106, 33102, 66317,265381)$.
Now look at $\pi$ convergents (skipping the first ones): $22/7$, $333/106$, $355/113$, $103993/33102$, $104348/33215$, $208341/66317$, $312689/99532$, $833719/265381$, $11464108/364913$. The denominators in odd positions corresponds to the successive $\Delta t_n$. The denominators in even positions corresponds to the successive $t_n$.
This works for all irrationals (need to check) and it gives an alternate way to compute the convergents. For instance, take $t_8=99532$. Then the fraction $m/99532$ closest to $\pi$, with $m$ an integer, is a convergent. Using this fact, you can easily obtain $m=312689$.
Background
My interest in this problem stems from the following. The sequence $z_n =2^{\{-n \log_2 3\}}$ consists of rational numbers, all having a period (in their binary expansion) equal to $2\cdot 3^{n-1}$. All have exactly 50% zeros, 50% ones in their binary expansion. The median, which is one of them if $n$ is odd, tends to $\sqrt{2}/2$. Does it mean that $\sqrt{2}/2$ has 50% zeros too? No, you need to remove "extreme" numbers from this sequence. For instance all numbers $z_{t_n}$ with $t_n$ listed in the above table for $\alpha = \log_2 3$. What other numbers should I remove? I don't know yet, but this is a starting point. It depends on how fast/slowly the "extreme" numbers are building up as $n$ increases. See discussion on this topic here, look at the section entitled A different approach.
Potential machine learning applications of continued fractions
We found a connection between records (maxima) of some sequences, and continued fractions. Doing the same thing with minima will complete this analysis. Extreme events in business or economics settings are typically modeled using statistical distributions (Gumbel and so on), which did not live to their expectations in many instances.
The theory outlined here provides an alternative, with each number $\alpha$ serving as a particular model, with its own features distinct from all other numbers. In particular, the numerators and denominators of convergents of $\alpha$ can be used to model the arrival times of extreme events. While the records here are getting closer either to zero or to one, using a logarithm transformation allows you to model phenomena that are not bounded.
Note that the sequence $\{n\alpha\}$ has strong, long-range auto-correlations that are well studied (see appendix B in this book). One could try with the sequence $\{b^n \alpha\}$ instead ($b > 1$) as it exhibits exponentially decaying auto-correlations, also studied in the same book. Even cross-correlations between two sequences have been studied (say $\{\alpha n\}$ and $\{\beta n\}$), allowing you to model multivariate time series and their cross-dependencies. Other sequences are also being investigated.
For the case of $\frac{\sqrt{2}}{2}$, we look to find $k\in\mathbb{N}$ such that $\frac{\sqrt{2}}{2}k\pmod 1$ is as large as possible.
First, consider the sequence of "best possible" rational over-estimates of $\frac{\sqrt{2}}{2}$. We can do this using finite iterations of the continued fraction $[0;1,\bar 2]$ evaluated with the last term as $0$, giving the sequence: $$S_n=\frac{1}{1},\frac{3}{4},\frac{5}{7},\frac{17}{24},\frac{29}{41}...\space\space\space\space$$
The (strictly increasing) sequence of denominators is then:
$$D_n=1,4,7,24,41...$$
Since we know $\frac{\sqrt{2}}{2}<S_n$ for all $n$, we have that: $$\frac{\sqrt{2}}{2}D_n<S_nD_n=k,\space k\in\mathbb{N}$$
This implies:
$$\left(\frac{\sqrt{2}}{2}-S_n\right)D_n\equiv\frac{\sqrt{2}}{2}D_n\pmod 1\tag{1}$$
Since $S_n$ is a "best possible" approximation for $\frac{\sqrt{2}}{2}$, we have that:
$$\left|\frac{\sqrt{2}}{2}-\frac{a}{b}\right|<\left|\frac{\sqrt{2}}{2}-S_n\right|\implies b>D_n$$
Hence:
$$\left|\frac{\sqrt{2}}{2}-S_n\right|D_n=\left(S_n-\frac{\sqrt{2}}{2}\right)D_n$$
Is minimal. Multiplying both sides of $(1)$ by $-1$, we find that:
$$\left(S_n-\frac{\sqrt{2}}{2}\right)D_n\equiv -\frac{\sqrt{2}}{2}D_n\equiv 1-\frac{\sqrt{2}}{2}D_n \pmod 1$$
Must also be minimal. Therefore:
$$\frac{\sqrt{2}}{2}k\pmod 1$$
Is maximal for some $k\in\mathbb{N}\iff k=D_n$
Note:
On further inspection, I realised that (provided my proof is correct), substituting any irrational number $\alpha$ (perhaps rational too?) for $\frac{\sqrt{2}}{2}$ would lead to the same result:
$\alpha D_n\pmod 1>\alpha k\pmod 1$ for all $k\in\mathbb{N}, k<D_n \iff D_n$ represents the sequence of over-estimate convergent denominators from the continued fraction of $\alpha$.
So here is how it works. No proof here, but I am showing the pattern for all irrational numbers.
The above table is for the number $e$ but it works the same way for all irrational numbers. Follow the red path! The convergents of $e$ are listed below (source: here). Look at the denominator of each convergent to identify the pattern:
8/3,
11/4,
19/7,
87/32,
106/39,
193/71,
1264/465,
1457/536,
2721/1001,
23225/8544,
25946/9545,
49171/18089,
517656/190435,
566827/208524,
1084483/398959,
13580623/4996032,
14665106/5394991,
28245729/10391023,
410105312/150869313,
438351041/161260336,
848456353/312129649,...