Zero Polynomials: Help Me Get out of a Circular Argument

Solution 1:

The key thing it seems you're missing is that the factor theorem is a statement about formal polynomials, not just about the values of polynomial functions. That is, if $f(x)$ is a polynomial and $f(a)=0$, then there exists a polynomial $g(x)$ such that $$f(x)=(x-a)g(x)$$ and this equation means the two sides are equal as formal expressions, not just that they are equal for all values of $x$. By "equal as formal expressions" I mean that if $f(x)$ is the polynomial $\sum_{i=0}^n b_i x^i$ and $g(x)$ is the polynomial $\sum_{i=0}^{n-1} c_ix^i$, then $b_i=c_{i-1}-ac_i$ for $i=1,\dots,n$ (with the convention $c_n=0$, so that $b_n=c_{n-1}$) and $b_0=-ac_0$. That is, the coefficients of $f(x)$ are what you get by formally multiplying out $(x-a)g(x)$ and combining terms with the same power of $x$. Indeed, if you perform polynomial long division to find the coefficients of $g(x)$ (which is how you prove the factor theorem), you are exactly solving for numbers $c_i$ which make the equations $b_i=c_{i-1}-ac_i$ true.
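To make this concrete, here is a minimal sketch in Python (my own illustration, not part of the argument) of the long division by $(x-a)$, with coefficients stored lowest-degree-first; it checks the coefficient identities above on a small example:

```python
def divide_by_linear(b, a):
    """Divide f(x) = sum b[i] x^i by (x - a); return (quotient coeffs, remainder)."""
    n = len(b) - 1
    c = [0] * n                    # coefficients c_0, ..., c_{n-1} of g(x)
    c[n - 1] = b[n]                # leading coefficient: c_{n-1} = b_n
    for i in range(n - 1, 0, -1):
        c[i - 1] = b[i] + a * c[i]   # descending recursion
    r = b[0] + a * c[0]            # remainder; equals f(a), so 0 iff a is a root
    return c, r

# Example: f(x) = x^2 - 5x + 6 = (x - 2)(x - 3), so a = 2 is a root.
b = [6, -5, 1]
c, r = divide_by_linear(b, 2)
assert r == 0 and c == [-3, 1]     # g(x) = x - 3
# The factor-theorem identities, as equalities of coefficients:
cext = c + [0]                     # convention: c_n = 0
assert b[0] == -2 * c[0]
assert all(b[i] == cext[i - 1] - 2 * cext[i] for i in range(1, len(b)))
```

The recursion is just synthetic division (Horner's scheme), and the remainder $r$ it produces equals $f(a)$.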

Thus when you iterate this and get that $$f(x)=(x-a_1)\cdots(x-a_n)h(x)$$ where $h(x)$ has degree $0$, this is also an equality of formal polynomials: the coefficients of $f(x)$ are what you get by expanding out the product on the right-hand side. Since $f(a_{n+1})=0$ and each factor $a_{n+1}-a_i$ is nonzero (the $a_i$ are distinct), we must have $h(a_{n+1})=0$, which means $h(x)$ is the zero polynomial: all its coefficients are $0$. This means that when you expand out the right-hand side, all the coefficients are $0$, so all the coefficients of $f(x)$ are $0$.
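Continuing the same illustrative sketch: if you strip the known roots off one at a time and then multiply the linear factors back out, you recover the coefficients of $f(x)$ exactly, which is the "equality of formal polynomials" being used here. (The division routine is repeated so the snippet runs on its own.)

```python
def divide_by_linear(b, a):
    """Divide sum b[i] x^i by (x - a); return (quotient coeffs, remainder)."""
    n = len(b) - 1
    c = [0] * n
    c[n - 1] = b[n]
    for i in range(n - 1, 0, -1):
        c[i - 1] = b[i] + a * c[i]
    return c, b[0] + a * c[0]

def multiply_by_linear(c, a):
    """Coefficients of (x - a) * g(x), for g(x) with coefficients c."""
    out = [0] * (len(c) + 1)
    for i, ci in enumerate(c):
        out[i + 1] += ci           # contribution of x * g(x)
        out[i] -= a * ci           # contribution of -a * g(x)
    return out

# f(x) = (x - 1)(x - 2)(x - 4) = x^3 - 7x^2 + 14x - 8
f, roots = [-8, 14, -7, 1], [1, 2, 4]

h = f
for a in roots:                    # strip off the roots one at a time
    h, r = divide_by_linear(h, a)
    assert r == 0
assert h == [1]                    # the degree-0 quotient h(x)

g = h
for a in reversed(roots):          # multiply the factors back out
    g = multiply_by_linear(g, a)
assert g == f                      # coefficient-by-coefficient equality
```

If $f(x)$ also vanished at a fourth point, the degree-$0$ quotient $h(x)$ would have to be $0$ rather than $1$, and multiplying back out would then make every coefficient of $f(x)$ equal to $0$.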

Alternatively (but essentially equivalently), you can prove by induction on degree that any polynomial with infinitely many zeroes must be the zero polynomial. (The base case, a constant with a zero, is immediate.) Given such a polynomial $f(x)$, let $a$ be one of the zeroes, and write $f(x)=(x-a)g(x)$ (an equation of formal polynomials!). For any other zero $b$ of $f(x)$ we have $0=f(b)=(b-a)g(b)$ with $b-a\neq 0$, so $g(b)=0$. Thus $g(x)$ has lower degree but also has infinitely many zeroes, so all its coefficients must be zero by induction on degree, and so all the coefficients of $f(x)$ are zero as well.
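This induction translates directly into a short recursive procedure (again just a sketch with made-up names): strip one zero off, recurse on the quotient, and bottom out at a constant.

```python
def vanishes_identically(b, points):
    """b: coefficients, lowest degree first; points: len(b) distinct numbers.
    Returns True exactly when the polynomial vanishes at every given point,
    which (as the induction shows) forces all coefficients to be 0."""
    if len(b) == 1:
        return b[0] == 0                     # base case: a constant with a zero
    a, n = points[0], len(b) - 1
    c = [0] * n                              # quotient of dividing by (x - a)
    c[n - 1] = b[n]
    for i in range(n - 1, 0, -1):
        c[i - 1] = b[i] + a * c[i]
    if b[0] + a * c[0] != 0:                 # remainder equals f(a)
        return False
    return vanishes_identically(c, points[1:])   # induction hypothesis

assert vanishes_identically([0, 0, 0], [1, 2, 3])       # the zero polynomial
assert not vanishes_identically([2, -3, 1], [1, 2, 5])  # (x-1)(x-2) is not 0 at 5
```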


To remove any doubt, here's the complete argument, without any mention of the word "polynomial" to avoid any confusion.

Theorem: Let $(b_0,b_1,\dots,b_n)$ be a finite sequence of real numbers ($n\geq 0$). Suppose there exist $n+1$ distinct numbers $x\in\mathbb{R}$ such that $\sum_{i=0}^n b_ix^i=0$. Then $b_i=0$ for all $i$.

Proof: We use induction on $n$. The case $n=0$ is trivial, since $\sum_{i=0}^nb_ix^i$ is just $b_0$ for all $x$, and so if there exists a value of $x$ for which this is $0$, then $b_0=0$.

Now suppose the result is true for $n-1$, and let $(b_0,\dots,b_n)$ be such that $\sum_{i=0}^nb_ix^i=0$ for $x=a_0,\dots,a_n$, where the $a_i$ are distinct. Define a sequence $(c_0,\dots,c_{n-1})$ by descending recursion as follows. Start with $c_{n-1}=b_n$. Given $c_i$, define $c_{i-1}=b_i+a_0c_i$. Let $r=b_0+a_0c_0$. [If this seems mysterious, all we are doing here is long division of $\sum_{i=0}^n b_ix^i$ by $x-a_0$: the $c_i$ are the coefficients of the quotient and $r$ is the remainder.]

Now observe that for any $x\in\mathbb{R}$, $$\sum_{i=0}^nb_ix^i=(x-a_0)\sum_{i=0}^{n-1}c_ix^i+r.$$ Indeed, if you expand out the right-hand side using the definitions of $c_i$ and $r$, everything telescopes and you are left with exactly the terms $b_ix^i$ for $i=0,\dots,n$. Now if we set $x=a_0$, the left-hand side is $0$ and the right-hand side is $r$, so $r=0$. Moreover, if we set $x=a_j$ for $j>0$, the left-hand side is still $0$; since $r=0$ and $a_j-a_0\neq 0$ (the $a_j$ are distinct), the only way the right-hand side can be $0$ is if $\sum_{i=0}^{n-1}c_ia_j^i=0$.
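For anyone who wants to see the cancellation without grinding through the algebra, here is a quick numeric sanity check in Python (a sketch; the random inputs are made up): the identity holds for arbitrary $b_i$ and arbitrary $a_0$, root or not.

```python
import random

def divide(b, a0):
    """The descending recursion from the proof; returns (c, r)."""
    n = len(b) - 1
    c = [0.0] * n
    c[n - 1] = b[n]                      # c_{n-1} = b_n
    for i in range(n - 1, 0, -1):
        c[i - 1] = b[i] + a0 * c[i]      # c_{i-1} = b_i + a_0 * c_i
    return c, b[0] + a0 * c[0]           # r = b_0 + a_0 * c_0

b = [random.uniform(-5, 5) for _ in range(6)]   # random degree-5 coefficients
a0 = random.uniform(-5, 5)
c, r = divide(b, a0)
for _ in range(100):                     # compare both sides at random points x
    x = random.uniform(-5, 5)
    lhs = sum(bi * x**i for i, bi in enumerate(b))
    rhs = (x - a0) * sum(ci * x**i for i, ci in enumerate(c)) + r
    assert abs(lhs - rhs) <= 1e-8 * (1 + abs(lhs))
```

With exact (integer or rational) arithmetic one can instead compare the two sides coefficient by coefficient, which is the formal statement being proved.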

Thus the sequence $(c_0,\dots,c_{n-1})$ has the property that $\sum_{i=0}^{n-1}c_ix^i=0$ for $x=a_1,\dots,a_n$. By the induction hypothesis, this means $c_i=0$ for all $i$.

Since $b_n=c_{n-1}$, this means $b_n=0$. For $0< i<n$, we have $b_i=c_{i-1}-a_0c_i$, so $b_i=0$. Finally, we have $b_0=r-a_0c_0=0$ since $r=0$ and $c_0=0$. Thus $b_i=0$ for all $i$, as desired.

(As you alluded to, there are also many analytic ways to prove this result. But the proof above is purely algebraic, and in fact works with any integral domain in place of $\mathbb{R}$.)

Solution 2:

A polynomial function $p(x)=\sum_{j=0}^nA_jx^j$ with $\deg(p)=n>0$ must have $A_n\ne 0$. We will show that then $p(x)$ cannot be $0$ for all $x\in \mathbb R.$

For $x\ne 0$ we have $$p(x)=A_nx^n\left(1+\sum_{j=0}^{n-1}T_j(x)\right)$$ where $T_j(x)=A_n^{-1}A_jx^{-(n-j)}.$

Now each $|T_j(x)|$ (for $j=0,\dots,n-1$) can be made as small as desired by taking $|x|$ large enough. So if $x\ne 0$ and $|x|$ is large enough that each $|T_j(x)|<\frac{1}{2n},$ then $\sum_{j=0}^{n-1}|T_j(x)|<\frac12$ and $$|p(x)|\geq |A_n x^n|\cdot \left(1-\sum_{j=0}^{n-1} |T_j(x)|\right)\geq \frac {1}{2}|A_nx^n|>0.$$
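To illustrate the bound numerically (my own made-up example, not from the answer): take $p(x)=2x^3-5x^2+x-7$, so $n=3$ and $\frac{1}{2n}=\frac16$. Once $|x|$ clears the threshold that makes every $|T_j(x)|<\frac16$, the bound $|p(x)|\ge\frac12|A_nx^n|>0$ holds:

```python
A = [-7.0, 1.0, -5.0, 2.0]      # p(x) = 2x^3 - 5x^2 + x - 7, lowest degree first
n = len(A) - 1                  # n = 3, leading coefficient A[n] = 2

def T(j, x):
    """T_j(x) = A_n^{-1} * A_j * x^{-(n-j)}."""
    return A[j] / A[n] * x ** (-(n - j))

for x in [20.0, -25.0, 100.0, -1000.0]:
    assert all(abs(T(j, x)) < 1 / (2 * n) for j in range(n))  # |x| large enough
    p = sum(Aj * x**j for j, Aj in enumerate(A))
    assert abs(p) >= 0.5 * abs(A[n] * x**n) > 0               # the bound holds
```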

Remarks. (1) I used the notation $T_j(x)$ to save on typing. (2) We must use some property of $\mathbb R$ beyond just being a field. For example, in the finite field $\{0,1\}$ the function $p(x)=x^2+x$ is always $0$, but the coefficients are not all $0.$
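And the counterexample in (2) takes one line to verify:

```python
# Over the field {0, 1} (arithmetic mod 2), p(x) = x^2 + x is 0 at both elements.
assert all((x * x + x) % 2 == 0 for x in (0, 1))
```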