Distance from $x^n$ to lesser polynomials

I am interested in the $L_1$ distance of $x^n$ to the $\mathbb R$-span of $\{1,x,\ldots,x^{n-1}\}$ over some interval. We can WLOG consider the interval $[0,1]$ (say) because scaling and shifting only affects the distance by a constant factor.


So far we have these basic results:

Theorem $x^n$ is not in the $\mathbb R$-span of $\{1,x,\ldots,x^{n-1}\}$

Proof Suppose it was: $x^n = \sum_{i=0}^{n-1} a_i x^i$. Differentiate $n$ times to get the contradiction $n! = 0$.

We can rephrase this theorem as saying that $\| x^n - p(x) \| > 0$ for all polynomials of degree $< n$.

Theorem There is no sequence of polynomials in the span, with limit $x^n$.

Proof We would have a sequence $p_n(x) = \sum_{i=0}^{n-1} a_{i,n} x^i$ where $x^n = \lim_{n \to \infty} p_n(x) = \sum_{i=0}^{n-1} (\lim_{n \to \infty} a_{i,n}) x^i$, but that is a contradiction.


Together with the previous result this implies there should be some positive $h(n)$ such that $\| x^n - p(x) \| > h$.

My question is about finding some $h$. How could we go about doing this? Are there relevant theorems?

My first idea was that we might use induction and find some way of saying: Since we can only get within $r$ of the derivative we can only get within $h(r)$ of the function itself.. but finding $h(r)$ seems just as hard as finding $h$ at all.


UPDATE I was right the first time after all.

I claim that the degree $n$ monic polynomial with minimal $L_1$ norm is $\prod_{k=1}^n (x-\cos^2(\pi k/(2n+2)))$.

By standard compactness arguments, there is such a minimal polynomial, although it is not clear whether or not it is unique. Let $f$ be a polynomial which achieves the minimum.

Lemma: The polynomial $f$ has $n$ distinct roots in the interval $(0,1)$.

Proof: Suppose to the contrary that $f$ had fewer roots. Let $\rho_1$, $\rho_2$, ..., $\rho_m$ be those roots of $f$ in $(0,1)$ that have odd multiplicity. Then $f$ has the same, or opposite, sign as $g(x) := \prod(x-\rho_i)$ everywhere on $[0,1]$. For $\epsilon$ sufficiently small, and the right choice of sign, $| f \pm \epsilon g(x) |_1 < |f|_1$. $\square$

Let the roots of $f$ be $0 < \rho_1 < \rho_2 < \cdots < \rho_n <1$, and formally define $\rho_0=0$ and $\rho_{n+1}=1$ for convenience.

As we learn in any calculus class, in order to minimize a function, we shold figure out how it changes when the input is perturbed. Let $g$ be any polynomial of degree $<n$. For small $\epsilon$, the implicit function theorem tells us that $f + \epsilon g$ has $n$ roots in $(0,1)$, given by smooth functions $\rho_i(\epsilon)$. (We are using the lemma to know that $f'(\rho_k) \neq 0$, so that we may apply the implicit function theorem.) We set $\rho_0(\epsilon)=0$ and $\rho_{n+1}(\epsilon) = 1$.

Then $$|f+\epsilon g|_1 = \sum_{k=0}^n (-1)^{n-k} \int_{\rho_k(\epsilon)}^{\rho_{k+1}(\epsilon)} f+\epsilon g$$ $$ = \sum_{k=0}^n (-1)^{n-k} \left( \int_{\rho_k}^{\rho_{k+1}} f+\epsilon g + \int_{\rho_k}^{\rho_k(\epsilon)} f+\epsilon g - \int_{\rho_{k+1}}^{\rho_{k+1}(\epsilon)} f+\epsilon g \right)$$ $$=|f|_1 + \epsilon \sum_{k=0}^n (-1)^{n-k} \int_{\rho_k}^{\rho_{k+1}} g + \mbox{error}$$ where the error comes from the latter two integrals.

Let's look at these integrals. Since $\rho_k$ is a smooth function, we have $\rho_k(\epsilon)= \rho_k + O(\epsilon)$, so the width of the interval is $O(\epsilon)$. Also, $f(\rho_k) =0$ and $f$ is smooth so $f(x) = O(\epsilon)$ for $x \in (\rho_k, \rho_k+O(\epsilon))$. Finally, $g$ is bounded, so $g=O(1)$. So our error terms are all of the form $$\int_{\rho_k}^{\rho_k + O(\epsilon)} \left( O(\epsilon) + \epsilon O(1) \right) = O(\epsilon^2).$$ So $$|f+\epsilon g|_1 = |f|_1 + \epsilon \sum_{k=0}^n (-1)^{n-k} \int_{\rho_k}^{\rho_{k+1}} g + O(\epsilon^2).$$

We see that the directional derivative of $| \ |_1$ in direction $g$ exists. So, if $f$ is to be a minimum of $| \ |_1$, then we must have
Criterion: For every polynomial $g$ of degree $< n$, we have $$\sum_{k=0}^n (-1)^{n-k} \int_{\rho_k}^{\rho_{k+1}} g=0$$


Uniqueness of solution: I will first show that there is a unique $n$-tuple $(\rho_1, \ldots, \rho_n) \in [0,1]^n$ obeying the criterion. Observe that the criterion is linear in $g$, so it is enough to check the criterion for the monomials of degree $<n$. To keep notation simple, I'll restrict myself to the case $n=2m$. It is convenient to rename $(\rho_1, \ldots, \rho_{2m})$ as $(\alpha_1, \beta_1, \alpha_2, \beta_2, \ldots, \beta_m)$.

Applying the Criterion with $g=x^{k-1}$ we see that, for $1 \leq k \leq n$, we have $$\frac{1}{k} \left( \alpha_1^k - (\beta_1^k-\alpha_1^k) + (\alpha_2^k-\beta_1^k) - \cdots +(1-\beta_m^k) \right)=0.$$ Equivalently $$2 \sum \frac{\alpha_i^k}{k} + \frac{1}{k} = 2 \sum \frac{\beta_i^k}{k} \ \mathrm{for} \ 1 \leq k \leq n.$$

Now comes the part I thought was clever. Organize the above $n$ equations into a generating function: $$\sum \frac{\alpha_i^k x^k}{k} + \frac{1}{2} \sum \frac{x^k}{k} = \sum \frac{\beta_i^k x^k}{k} + O(x^{n+1}).$$ This gives $$\sum \log \frac{1}{1-\alpha_i x} + \frac{1}{2} \sum \log \frac{1}{1-x} = \sum \log \frac{1}{1-\beta_i x} + O(x^{n+1})$$ or $$\prod (1-\alpha_i x) \sqrt{1-x} = \prod (1-\beta_i x) + O(x^{n+1}).$$ Set $\prod(1-\alpha_i x) = 1+a_1 x + a_2 x^2 + \cdots + a_m x^m$ and $\prod(1-\beta_i x) = 1+b_1 x + b_2 x^2 + \cdots + b_m x^m$. So $$\left( 1+a_1 x + a_2 x^2 + \cdots + a_m x^m \right) \sqrt{1-x} = \left( 1+b_1 x + b_2 x^2 + \cdots + b_m x^m \right) + O(x^{n+1}).$$

Writing our this power series an equating coefficients of $x^k$ for $k\leq n$, we get linear equations for the $a_i$ and $b_j$. In particular, the set of solutions is a linear space and it shouldn't be too hard (I didn't actually check this) to see that it is a zero dimensional space. So, subject to the linear algebra problem, the solution is unique.

The key practical point is that computers solve linear equations very quickly. So I had Mathematica go and solve these equations for me for several values of $n$. For $n=10$, I got $$a(x) = \frac{1}{1024} \left( 1024 - 2816 x + 2816 x^2 - 1232 x^3 + 220 x^4 - 11 x^5 \right)$$ $$b(x) = \frac{1}{1024} \left( 1024 - 2304 x + 1792 x^2 - 560 x^3 + 60 x^4 - x^5 \right).$$ So I typed $1024, -2816, 2816, -1232, 220, -11$ and $1024, -2304, 1792, -560, 60, -1$ into Sloane's encyclopedia and immediately learned that these were coefficients of Chebyshev polynomials of the first and second kind. It took a little while to get all the indexing issues right, but I eventually realized that this meant my solution was $$f(x) = \prod_{k=1}^n (x-\cos^2(\pi k/(2n+2))).$$


Once we know that the solution is unique, it is enough to check that $f$ obeys the Criterion. Let $h$ be the function $[0,1] \to \{ -1 , 1 \}$ which switches sign at the numbers $\cos^2(\pi k/(2n+2))$. We want to show:

Claim For any polynomial $g$ of degree $<n$, we have $$\int_0^1 h(x) g(x) dx=0.$$

The form of $h$ suggests putting $x=\cos^2 t$, so we want $$\int_0^{\pi/2} h(\cos^2 t) g(\cos^2 t) (2 \sin t \cos t) dt=0.$$ Now, $h(\cos^2 t)$ switches signs at $k \pi/(2n+2)$. Let $\sigma$ be the square wave function: $\sigma(u)$ is $\pm 1$ and switches signs at integer multiples of $\pi$. So we want $$\int_{0}^{\pi/2} \sigma( (2n+2) t) g((\cos (2t)+1)/2) \sin(2t) dt=0.$$ Setting $u=2 t$, and noticing that the integrand is an even function, we get $$\int_{-\pi}^{\pi} \sigma( (n+1) u) g((\cos u+1)/2) \sin u \ du=0.$$

Also, if $g$ is an polynomial of degree $<n$, then the polynomial $\hat{g}(w) := g((u+1)/2)$ is also has degree $<n$. So our goal is to show:

For any polynomial $\hat{g}$ of degree $<n$, we have $\int_{-\pi}^{\pi} \sigma((n+1) u) \hat{g}(\cos u) \sin u \ du=0$.

It is enough to check this for a basis of polynomials $\hat{g}_0$, ..., $\hat{g}_{n-1}$; we use the Chebyshev polynomials, so that $\hat{g}_m(\cos u) = \cos (mu)$. So our goal can be rewritten as showing that $$\int_{- \pi}^{\pi} \sigma((n+1) u) \cos (mu) \sin u \ du=0 \ \mathrm{for} \ 0 \leq m < n$$ or that $$\frac{1}{2} \int_{- \pi}^{\pi} \sigma((n+1) u) \left( \sin ((m+1) u) - \sin((m-1) u) \right) du \ \mathrm{for} \ 0 \leq m < n.$$

But $\sigma(v)$ has a well known Fourier series $\sin(v) + \frac{1}{3} \sin 3 v + \frac{1}{5} \sin (5v) + \cdots$, so $\sigma((n+1)u)$ has a Fourier series starting $\sin((n+1)u) + \cdots$. We see that $\sigma((n+1) u)$ is orthogonal to the functions $\sin (ru)$ for $r \leq n$, which immediately establishes the required claim. QED


To minimize $$ \int_0^1\left|x^n-a_{n-1}x^{n-1}-a_{n-2}x^{n-2}-\dots-a_0\right|\;\mathrm{d}x\tag{1} $$ we can differentiate with respect to each $a_k$ to get at the critical points $$ \int_0^1\operatorname{sgn}(x^n-a_{n-1}x^{n-1}-a_{n-2}x^{n-2}-\dots-a_0)\;x^k\tag{2}\;\mathrm{d}x=0 $$ for $k=0\dots n-1$.

The minimal $P(x)=x^n-a_{n-1}x^{n-1}-a_{n-2}x^{n-2}-\dots-a_0$ must have $n$ distinct roots in $(0,1)$. If not, then there is a polynomial $Q(x)$ of degree $< n$ that has the same sign as $P(x)$ for all $x\in[0,1]$. Then for some $\epsilon>0$, $\int_0^1|P(x)-\epsilon Q(x)|\mathrm{d}x<\int_0^1|P(x)|\mathrm{d}x$.

Let $0< x_1< x_2<\dots< x_n<1$ be the roots of $P(x)$. By $(2)$ we have that $$ \frac{1}{k+1}\left[(x_1^{k+1}-0)-(x_2^{k+1}-x_1^{k+1})+(x_3^{k+1}-x_2^{k+1})-\dots+(-1)^{n-1}(1-x_n^{k+1})\right]=0 $$ for each $k=0\dots n-1$, which is equivalent to $$ x_n^k-x_{n-1}^k+x_{n-2}^k-\dots+(-1)^{n-1}x_1^k=\tfrac12\tag{3} $$ for each $k=1\dots n$. Equation $(3)$ has roots $$ x_j=\sin^2\left(\frac{\pi}{2}\frac{j}{n+1}\right)\tag{4} $$ for $j=1\dots n$. That is, since $\displaystyle \operatorname{Re}\left(\frac{1}{e^{ix}+1}\right)=\frac12$, $$ \begin{align} &\sum_{j=1}^n(-1)^{n-j}\sin^{2k}\left(\frac{\pi}{2}\frac{j}{n+1}\right)\\ &=(-\tfrac14)^k\sum_{j=1}^n(-1)^{n-j}\sum_{m=0}^{2k}\binom{2k}{m}(-1)^m\exp\left(i\pi\frac{(k-m)j}{n+1}\right)\\ &=(-\tfrac14)^k(-1)^{n}\sum_{m=0}^{2k}\binom{2k}{m}(-1)^m\left(\frac{(-1)^{k-m+n}-1}{\exp\left(i\pi\frac{k-m}{n+1}\right)+1}-1\right)\\ &=(\tfrac14)^k\sum_{m=0}^{2k}\binom{2k}{m}(-1)^{k-m+n}\frac{(-1)^{k-m+n}-1}{\exp\left(i\pi\frac{k-m}{n+1}\right)+1}\\ &=(\tfrac14)^k\sum_{m=0}^{2k}\binom{2k}{m}\frac{1-(-1)^{k-m+n}}{\exp\left(i\pi\frac{k-m}{n+1}\right)+1}\\ &=(\tfrac14)^k\sum_{m=0}^{2k}\binom{2k}{m}\frac12\left(1-(-1)^{k-m+n}\right)\\ &=(\tfrac14)^k\sum_{m=0}^{2k}\binom{2k}{m}\frac12\\ &=\frac12\tag{5} \end{align} $$ Thus, the minimal monic polynomial is $$ P(x)=\prod_{j=1}^n\left(x-\sin^2\left(\frac{\pi}{2}\frac{j}{n+1}\right)\right)\tag{6} $$