About the distribution of $\{k(\sqrt{2}-1)\mid k\in \mathbb{N}\}$

This is the Putnam 2021 problem, number B6:

Prove that $S_n= \sum\limits_{k=1}^n (-1)^{ \lfloor k(\sqrt{2}-1) \rfloor } \geq 0$ for all $n\in \mathbb{N}$.

This involves the distribution of the set $\{k(\sqrt{2}-1)\mid k\in \mathbb{N}\}$ in $\mathbb{R}^+$ or of $\{k(\sqrt{2}-1)-\lfloor k(\sqrt{2}-1) \rfloor\mid k\in \mathbb{N}\}$ in $(0,1)$.

The 'walk' starts like: $S_1=1, S_2=2, S_3=1, S_4=0, S_5=1, S_6=2, S_7=3, S_8=2, \ldots$ and we can compute some first observations like:

  • A '1-step' up or down can only be of 'length' $2$ or $3$.
  • A '2-step' (up then down or opposite) can only be of 'length' $4$ or $5$.
  • A '3-step' (up then down then up or opposite) can only be of 'length' $7$ or $8$. Notice how $6$ is absent!
  • Then also a '4-step' can only be of 'length' $9$ or $10$ .$\quad 11$ is absent ! and a '5-step' can only be of 'length' $12$ or $13$.
    $...\ 18$ is absent ! ... etc, we can compute also the '12-step' configuration for example if we want.

This yields that it is impossible to have a $3$ up $3$ down in the walk, or a $2$up-$2$down-$2$up, etc... in a '3-step' there is at least one of length 3, continuing this will help us draw or determine the possible walk (walks) without computing explicitly $\lfloor k(\sqrt{2}-1) \rfloor$ for the given $k$ ( we drop using $k$ here and focus on showing that the walk doesn't 'sink' below $0$ ).

But the fact that it doesn't 'sink' is still puzzling me ! I am trying to determine integer values for which $m+ \lfloor m\sqrt{2} \rfloor $ fails to evaluate to, like we found 6, 11, and 18, the fact that they are infinite and their distribution, I think, must be linked to why this 'walk' keeps on the positive side. Trying to prove this by contradiction yields also to a similar idea linked to this distribution. My initial thought was: assume there is the smallest number $n_0$ for which $S_{n_0}=-1$, go from there and find another $n_1<n_0$ which gives the contradiction $S_{n_1}=-1$, but this doesn't seem to be fruitful.


Introduction

Naturally, the problem transforms itself (as mentioned in the comments as well) to studying the integral parts of $[k\alpha]$ where $\alpha = \sqrt 2 - 1$, and when these are odd and even. The assertion of the statement is that for any $N$, there are at least as many even numbers , as odd numbers, in the set $\{[k \alpha] : 1 \leq k \leq N\}$.

Now, the first thing that we are tempted to ask ourselves is the following : is there a pattern in the parities of $[k\alpha]$? Perhaps something like this might come out from investigation.

Ideally, we want periodicity. We would love it if the sequence was predictable beyond a very early point, so that we could sort out the initial terms , and then let the periodicity take us to infinity and beyond!

Now, the first thing that we realize is that the problem with periodicity is $\alpha$ itself. It's irrational. So how on earth are we going to think about periodicity, when there's something that could behave in a haywire fashion? Simple : replace it with something that doesn't behave erratically.

Namely, with a rational number. A rational number, that's close to this irrational number, so that the given quantity for $\alpha$ and that rational number becomes indistinguishable. Perhaps then, the creation of that rational number will be important.

Well, I believe that continued fractions are part of the Olympiad syllabus (if not, then I swear by Hall and Knight that they should be!) but that's the way to go.


Continued fractions

Well, here is the Wikipedia page for continued fractions. But essentially, any irrational number is well approximated by its convergents. These convergents , if the irrational number is simple enough, have a pattern that allows us to explicitly compute them.

Thankfully, an observation much later on will tell us we don't need to compute them. But for now, we note that this part of the Wiki page is super helpful. But first, the continued fraction expansion itself.

How do we derive the continued fraction? Note that we could follow the procedure as detailed in the question itself. However, we choose to be a tad smarter, and realize that if $x = \sqrt(2)-1$ then $x$ is the positive root of $x^2+2x-1 = 0$. Using this, we write : $$ x = \frac {1}{x+2} = \frac{1}{2 + \frac 1{x+2}} = \frac {1}{2+\frac {1}{2 + \frac 1{x+2}}} = 0 + \frac 1{2 +\frac{1}{2+\frac{1}{2+\frac{1}{\vdots}}}} $$

whence, by uniqueness of the continued fraction representation, it becomes clear to us that the representation is indeed $[0;2,2,2,....]$.

This, along with the theorem presented in the section I attached (which is fairly standard and important) is nice : it tells us immediately that there are recurrence relations for the numerators $h_n$ and denominators $k_n$ of the convergents of the sequence. Namely, we have $h_n = 2h_{n-1} + h_{n-2}$ and $k_n = 2k_{n-1} + k_{n-2}$.

Let's calculate some initial values. Of course, the first truncate is $0$, we'll ignore that. Then we've got $\frac 12$. Then we've got $\frac{1}{2 + \frac 12} = \frac{2}{5}$. So $h_1 = 1$, $h_2 = 2$ (and hence $h_3 = 5$), and $k_1 = 2,k_2 = 5$. But now : $h$ and $k$ have the same recurrence relation! And we already saw that $h_2 = k_1$ and $h_3 = k_2$.Therefore, we can conclude that $h$ stays one step behind $k$ i.e. that $k_{j} = h_{j+1}$ for $j \geq 1$.

In other words, the convergents are of the form $\frac{h_j}{2h_j+h_{j-1}}$ for some $h_j$.

This is frankly remarkable. I was reminded of the Fibonacci sequence in this instance, since a similar thing occurs there.

Using the initial values, we may easily calculate $h_n$ and $k_n$ explicitly.

Let's now give one name to the sequence. We are going to let $a_1 = 1$, $a_2 = 2$ and $a_{n} = 2a_{n-1} + a_{n-2}$ for all $n \geq 3$. So our convergents are $\frac{a_i}{a_{i+1}}$. Now, we are going to list some properties of $a_n$, along with brief proof sketches.

  • $a_n$ is even if and only if $n$ is even. This is fairly clear from the fact that $a_{i} \equiv a_{i-2} \pmod{2}$ from the recurrence.

-We have $\gcd(a_n,a_{n+1}) = 1$ easily proven by induction.

  • We can prove a Cassini-type (for the Fibonacci numbers) identity for $a_n$ which will be useful later. It is $a_{n+1}a_{n-1} - a_n^2 = (-1)^n$. This can be proven by induction, or using matrix representations. Just use the same technique used as for the Cassini identity for the Fibonacci numbers.

  • Another nice little Cassini-type identity is $a_{n+2}a_{n-1} - a_{n}a_{n+1} = 2(-1)^{n+1}$. Once again, this can be proven by induction, and we'll use this later.

  • We point out theorem 1.3.3 of the resource here. In isolation : if $\frac{p_n}{q_n}$ is a convergent of $\beta$ and $q<q_n$, then for any $p$ we have $|\beta - \frac{p_n}{q_n}| < |\beta - \frac pq|$. In other words, of all fractions with denominator at most $q_n$, the one closest to $\beta$ is $\frac{p_n}{q_n}$.


Comparing the irrational and the convergent

We have the following convenient lemma.

Approximation lemma : For all $0 \leq k < a_{n+1}$ we have $[k \alpha] = \left[\frac{ka_n}{a_{n+1}}\right]$.

Proof : If they are not equal, then there is an integer $r$ in between them. Either $k \alpha < r < k\frac{a_{n}}{a_{n+1}}$ or $k \alpha > r > k\frac{a_{n}}{a_{n+1}}$. Either way though, it puts $\frac rk$ in between $\alpha$ and $\frac{a_{n}}{a_{n+1}}$, and thus we obtain a strictly better approximation for $\alpha$ using denominators of size less than $a_{n+1}$, than $\frac{a_n}{a_{n+1}}$ itself. This is a contradiction to the last bullet point I mentioned earlier.

Now, we have this lemma, and this is brilliant. That's because now we know that for each $n$, the sequence $\sum_{i=1}^N (-1)^{[k \alpha]}$ matches with the sequence $\sum_{i=1}^N (-1)^{[k \frac{p_n}{q_n}]}$ till $q_n$ terms. So proving that these sequences are non-negative will do the job, since our sequence matches with this sequence till those many terms.


Forming a hypothesis

I tried, by hand for small values of $i$ and $n$, that $\sum_{k=1}^n (-1)^{[k\frac{a_i}{a_{i+1}}]} \geq 0$. What I found , interestingly enough, was that when $i$ was even ($i=2,i=4$), this actually worked out! However, when $i=3$ and when $i=5$ , this actually failed! I was still hoping and praying that things would work : but they would have to, wouldn't they!

However, this is what led me to hypothesizing a final statement that I went on to prove later.


Pattern finding!

But I need patterns! Using the convergents, I generated by hand the first $29$ terms of the sequence. Note that $29 = a_5$ and $12 = a_4$, so I basically found $[\frac{12k}{29}]$ for $1 \leq k \leq 29$ and I knew that this would match up with the identity that I had.

I'll write them down here. Let's NOW figure out patterns! (This is like that favourite part of your favourite recipe). $$ 1, 1, -1, -1, 1, 1, 1, -1, -1, 1, 1, 1, -1, -1, 1, 1, -1, -1, -1, 1, 1, -1, -1, -1, 1, 1, -1, -1 $$

At this point, I was tempted to figure out patterns. The first thing I did was to remove the last $1$ which is the $29$th term, and now : $$ 1, 1, -1, -1, 1, 1, 1, -1, -1, 1, 1, 1, -1, -1, 1, 1, -1, -1, -1, 1, 1, -1, -1, -1, 1, 1, -1, -1 $$

came out. I read this from back-to-front, and lo: I got the negative of the previous sequence! This felt like a great observation, which wasn't obvious to me. Then of course, I took the first $a_2= 5$ terms of the sequence and removed the first term. I get $1,1,-1,-1$. Read from back to front, again it's the negative of the sequence! As you'll see in the proof below, though, this is actually quite obvious.

I did observe a few other interesting things as well. First, though, my focus was on the $a_i$, and the positions of the $a_i$. A few of the patterns I attempted either came off (but I had no clue what to do with them! An example : the $k$th term and $k+12$th term have opposite parity for $k \leq 12$. I proved a general version of this, but I didn't know what to do with it!) or failed. I couldn't believe my luck, though, when I split the sequence into five terms each. $$ 1,1,-1,-1,1 \\ 1,1,-1,-1,1 \\ 1,1,-1,-1,1 \\ 1,-1,-1,-1,1 \\ 1,1,-1,-1,-1 \\ 1,1,-1,-1 \\ $$

I couldn't quite believe my eyes. The first three rows were exactly the same! Well, well. This seemed like something weird to conjecture, so I worked a half hour more (note : I know this is a contest question, I can promise you I did my best to do everything on paper here!) and with some menial additions, I actually checked that the first $87$ integral parts of $\frac{70k}{169}$ actually had the same repeating pattern!

But why? Well, we'll see why. But more importantly, these two observations are going to shape our way to the answer.


The main result (start of proof)

So what did I hypothesize? This :

With these two observations, we will prove the following statement by induction : for all $n$, $\sum_{i=1}^k (-1)^{\left[\frac{ia_{2n}}{a_{2n+1}}\right]} \geq 0$ for all $k \leq a_{2n+1}$.

The base case is obvious, let's move to the inductive case.

Indeed, we first note that the approximation lemma tells us that $$ \left[\frac{ia_{2n}}{a_{2n+1}}\right] = \left[\frac{ia_{2n+2}}{a_{2n+3}}\right] = \left[i\alpha\right] $$

for all $1 \leq i < a_{2n+1}$.

Using this for each $i$ and summing up, we know that $\sum_{i=1}^k (-1)^{\left[\frac{ia_{2n}}{a_{2n+1}}\right]}$ , $\sum_{i=1}^k (-1)^{\left[\frac{ia_{2n+2}}{a_{2n+3}}\right]}$ and $\sum_{i=1}^k (-1)^{\left[ i \alpha\right]} $ all coincide for $1 \leq k < a_{2n+1}$. So for $k < a_{2n+1}$, we know the sum is non-negative.


Reflection lemma

Let's now prove the reflection observation we made earlier, and apply it.

The integral parts of $x = \frac{ia_{2n+2}}{a_{2n+3}}$ and $y =\frac{(a_{2n+1}-i)a_{2n+2}}{a_{2n+3}}$ have different parity for all $1 \leq i \leq a_{2n+1}-1$.

Proof : Note that $[x] = \left[\frac{ia_{2n+2}}{a_{2n+3}}\right] = \left[\frac{ia_{2n}}{a_{2n+1}}\right]$ and $[y] = \left[\frac{(a_{2n+1}-i)a_{2n+2}}{a_{2n+3}}\right] = \left[\frac{(a_{2n+1}-i)a_{2n}}{a_{2n+1}}\right]$. Let $a = \frac{ia_{2n}}{a_{2n+1}}$ and $b = \frac{(2n-i)a_{2n}}{a_{2n+1}}$.

We have that $a+b = a_{2n}$, so $a+b$ is an even integer! We have that $a+b = [a]+[b]-\{a\} - \{b\} = [x]+[y] - \{a\}-\{b\}$. We write this as $$ a+b - [x]-[y] = -\{a\}-\{b\} $$

the LHS is an integer, therefore so is the RHS. But the RHS can only be $0$ or $-1$ : and it can't be zero because $a$ and $b$ are not integral (remember the gcd condition?) so indeed, $a+b = [x]+[y] - 1$ and thus $[x]+[y]$ is odd! We are done, since this implies they have distinct parities.

As a corollary, we obtain that $\sum_{i=1}^{a_{2n+1}-1} (-1)^{\left[i \frac{a_{2n+2}}{a_{2n+3}}\right]} = 0$. Indeed, note that if we split this sum in half , we get one half of some signs, and then the reversed negative of the same signs. So they all cancel out.

For $k=a_{2n+1}$, we use the Cassini identity for four indices. We have $\frac{a_{2n+1}a_{2n+2}}{a_{2n+3}} = a_{2n} + \frac 1{a_{2n+3}}$ following simplification, which is just above an even number and therefore has an even integral part. Hence $\sum_{i=1}^{a_{2n+1}} (-1)^{\left[\frac{ka_{2n+2}}{a_{2n+3}}\right]} = 1$.


Three lines lemma

Now we prove the following "three lines" lemma :

Let $1 \leq k \leq a_{2n+1}$. Then the integral parts of each of : $$ x = \frac{ka_{2n+2}}{a_{2n+3}},y= \frac{(k+a_{2n+1})a_{2n+2}}{a_{2n+3}} , z=\frac{(k+2a_{2n+1})a_{2n+2}}{a_{2n+3}} $$ have the same parity.

Proof : Note that $z-y = y-x = a_{2n} + \frac{2}{a_{2n+3}}$ using the Cassini four-index identity. Therefore, we have : $$ z-y = [z] -[y] + \{z\} - \{y\} = a_{2n} + \frac{2}{a_{2n+3}} \implies [z]-[y]-a_{2n} = \{y\}-\{z\} + \frac 2{a_{2n+3}} $$

and we also have by similar reasoning : $$ [y] - [x] - a_{2n} = \{x\} - \{y\} + \frac 2{a_{2n+3}} $$

The LHS is an integer, therefore the RHS is an integer in both of these equations. Note that the RHS can either be $0$ or $1$.

CLAIM : $\{z\} \geq \frac 2{a_{2n+3}}$ and $\{y\} \geq \frac 2{a_{2n+3}}$ for all $1 \leq k \leq a_{2n+1}$.

If this claim is true, then the RHS of both equations must equal zero, since they are at most $\{y\}$ and $\{x\}$ respectively (if the claims work).

Let's now prove this claim. To prove this claim, note that each of $y,z$ are non-integral. Therefore, if the fractional part of either $y$ or $z$ is less than $\frac 2{a_{2n+3}}$, then must in fact be equal to $\frac 1{a_{2n+3}}$, because $y$ and $z$ have denominator equal to $a_{2n+3}$.

Therefore, we need to look at the sequence $la_{2n+2} \mod a_{2n+3}$ for $1 \leq l \leq 2a_{2n+1}-1$ and prove that this sequence doesn't contain a $1$. It could contain anything else! Remarkably though, we actually KNOW when $la_{2n+2} \mod a_{2n+3} = 1$.

We use the three-variable Cassini identity : $a_{2n+4}a_{2n+2} - (a_{2n+3})^2 = -1$. Thus, we get that $a_{2n+4} a_{2n+2} \equiv -1 \pmod{a_{2n+3}}$. But now, $a_{2n+4} = 2a_{2n+3} + a_{2n+2} \equiv a_{2n+2} \pmod{a_{2n+3}}$. It follows that $$ (a_{2n+3}-a_{2n+2})a_{2n+2} \equiv 1 \pmod{a_{2n+3}} $$

therefore, the unique value of $k$ that should be avoided is $a_{2n+3} - a_{2n+2}$.

So all we need to show is that $1 \leq k \leq a_{2n+1}$ implies that $(k+a_{2n+1}) < (a_{2n+3}-a_{2n+2})$ and $k+2a_{2n+1} < (a_{2n+3}-a_{2n+2})$. If we show this, then $\{y\},\{z\}$ cannot be equal to $\frac 1{a_{2n+3}}$ and we are done.

These are equivalent to $2a_{2n+1} + a_{2n+2} < a_{2n+3}$ and $3a_{2n+1} + a_{2n+2} < a_{2n+3}$ respectively. The first follows from the second, and the second rearranges to $2a_{2n+1} < a_{2n+2}$ which is true.

So the claim has been proved. Therefore, $[z]-[y] = [y]-[x] = a_{2n}$, which is even. Thus, they all have the same parity!


Continuing the proof of the main theorem

In short, we are done with the reflection and "three-line" lemmas. But what are we going to get from this?

Well, we claim that we have proved that $$\sum_{i=1}^k (-1)^{[\frac{ia_{2n+2}}{a_{2n+3}}]} \geq 0 \text{ for all } 1 \leq k \leq 3a_{2n+1}$$. Indeed, we know that the sum is non-negative for $0 \leq k \leq a_{2n+1}$ by the induction hypothesis. Now, for $a_{2n+1} \leq k \leq 2a_{2n+1}$ we have $$ \sum_{i=1}^k (-1)^{[\frac{ia_{2n+2}}{a_{2n+3}}]} = s_{a_{2n+1}} + \sum_{i=a_{2n+1}+1}^k (-1)^{[\frac{ia_{2n+2}}{a_{2n+3}}]} \\ = 1 + \sum_{i=1}^{k-a_{2n+1}} (-1)^{[\frac{ia_{2n+2}}{a_{2n+3}}]} \geq 0 $$ (note that the above also gives $a_{2a_{2n+1}} = 2$ with $k = a_{2n+1}$) and finally for $2a_{2n+1} \leq k \leq 3a_{2n+1}$ : $$ \sum_{i=1}^k (-1)^{[\frac{ia_{2n+2}}{a_{2n+3}}]} = s_{2a_{2n+1}} + \sum_{i=2a_{2n+1}}^k (-1)^{[\frac{ia_{2n+2}}{a_{2n+3}}]} \\ = 2 + \sum_{i=1}^{k-2a_{2n+1}} (-1)^{[\frac{ia_{2n+2}}{a_{2n+3}}]} \geq 0 $$

But now, we've reached halfway till $a_{2n+3}$! That's because $6a_{2n+1} > a_{2n+3}$ (prove this using rearrangements and induction if needed : it's easy), and therefore $3a_{2n+1}$ is more than halfway up $a_{2n+3}$. Reflection beckons!

Applying the reflection property and summing over indices tells us now that $$ \sum_{i=1}^k (-1)^{[\frac{ia_{2n+2}}{a_{2n+3}}]} = \sum_{i=1}^{a_{2n+3} - 1 -k} (-1)^{[\frac{ia_{2n+2}}{a_{2n+3}}]} $$

for all $k \geq 1$. We know that the LHS is non-negative for $k=1,2,...,\frac{a_{2n+3}-1}{2}$ since $\frac{a_{2n+3}-1}{2} \leq 3a_{2n+1}$. From this statement we now get that the LHS is also non-negative for $a_{2n+3} - 1 - k =1,2,...,\frac{a_{2n+3}-1}{2}$ i.e. for $k = \frac{a_{2n+3}-1}{2} + 1 , ..., a_{2n+3} - 1$.

Using this and the reflection logic we used earlier, we know that $$\sum_{i=1}^{a_{2n+3}-1} (-1)^{\left[i \frac{a_{2n+2}}{a_{2n+3}}\right]} = 0$$. The term corresponding to $a_{2n+3}$ is just $\frac{a_{2n+3}a_{2n+2}}{a_{2n+3}} = a_{2n+2}$ which is even. So the sum $$ \sum_{i=1}^{a_{2n+3}} (-1)^{\left[i \frac{a_{2n+2}}{a_{2n+3}}\right]} = 1 $$

which means that the induction claim has been fully verified.


Completing the problem

With that, let $N$ be large enough, and consider $n$ so that $a_{2n+3} > N$ ($a_i$ is an increasing sequence of integers, so we know this is going to cross $N$ at some point). Then note that $s_N = \sum_{i=1}^N (-1)^{[i \frac{a_{2n+2}}{a_{2n+3}}]} \geq 0$ from the approximation lemma, and the main theorem. Thus, we are done!


Revision

  • Approximation of an irrational by continued fractions : the approximation lemma.

  • Usage of the recurrence relations and the Cassini-type identities for the purposes of induction.

  • Pattern finding using reverse relations and line identities and using the elements of the recurrence relations as markers to look for patterns.

  • Persistence : I used only paper to solve this. The $87$ term thing killed me : I'll post the lines here if anybody needs me to. I also spotted some other patterns and proved them by induction. None of these quite helped as much as the three line identity, though, so I'm sorry if I didn't show my failures.