Two-variable polynomials over $\mathbb{Q}$ with finitely many roots in $\mathbb{R}^2$

I found this interesting exercise (not homework assignment) in real algebraic geometry:

Describe an algorithm which decides whether a given polynomial in $\mathbb{Q}[X, Y]$ has infinitely or finitely many roots in $\mathbb{R}^2$.

There's not much I've found already. It's easy to construct a polynomial with a given number of roots; for example, $$ X^2 + \prod_{k=1}^n (Y - k)^2 $$ has exactly $n$ roots in $\mathbb{R}^2$. The idea used is that any system of polynomials in $\mathbb{Q}[X, Y]$ having finitely many solutions (in this case $\lbrace X = 0, \prod_{k=1}^n (Y - k) = 0 \rbrace$) can be converted, using sums of squares, into one polynomial in $\mathbb{Q}[X, Y]$ having exactly those solutions as roots. However, not every polynomial in $\mathbb{Q}[X, Y]$ with finitely many roots can be written as a sum of squares (or as minus a sum of squares). For example, it is well-known that the Motzkin polynomial $F = X^4Y^2 + X^2Y^4 - 3X^2Y^2 + 1$ cannot be written as a sum of squares in $\mathbb{R}[X, Y]$, but on the other hand the polynomial $$ (1 + X^2)F = (1 - X^2Y^2)^2 + X^2(1 - Y^2)^2 + X^2Y^2(1 - X^2)^2 $$ has exactly the same roots in $\mathbb{R}^2$ as $F$, namely $(1, 1), (1, -1), (-1, 1)$ and $(-1, -1)$. The above identity also shows that $F$ takes only nonnegative values.
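As a sanity check, the identity and the four roots can be verified with exact rational arithmetic. This is just a quick sketch in Python; the grid of test points is my own ad hoc choice:

```python
from fractions import Fraction
from itertools import product

def motzkin(x, y):
    # F = X^4*Y^2 + X^2*Y^4 - 3*X^2*Y^2 + 1
    return x**4 * y**2 + x**2 * y**4 - 3 * x**2 * y**2 + 1

def sos_form(x, y):
    # the three-squares expression for (1 + X^2) * F
    return ((1 - x**2 * y**2)**2
            + x**2 * (1 - y**2)**2
            + x**2 * y**2 * (1 - x**2)**2)

# Exact agreement on a grid of rational points (a sanity check, not a
# proof, though agreement at enough points pins down a polynomial of
# this degree).
pts = [Fraction(n, 3) for n in range(-6, 7)]
assert all((1 + x**2) * motzkin(x, y) == sos_form(x, y)
           for x, y in product(pts, repeat=2))

# The four roots of F:
assert all(motzkin(x, y) == 0
           for x, y in [(1, 1), (1, -1), (-1, 1), (-1, -1)])
print("identity and roots check out")
```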

In fact, using the intermediate value theorem, one can easily see that if a polynomial in $\mathbb{R}[X, Y]$ takes both strictly negative and strictly positive values, it must have infinitely many zeroes in $\mathbb{R}^2$. The converse fails, though: some polynomials in $\mathbb{Q}[X, Y]$ never take negative values and still have infinitely many zeroes (for example, $X^2$).

Any thoughts on this? Both hints and full answers are welcome.


Solution 1:

As a model-theorist, here's how I would think about this problem. I'm sure there are more direct ways of solving it, but I'll describe a "logical" approach.

First, recall that the theory of real closed fields is decidable. This means that for any sentence $\varphi$ in the first-order language of ordered fields, there is an algorithm (albeit an incredibly inefficient algorithm) to determine whether $\varphi$ is true or false in $\mathbb{R}$. You can view this as a special case of effective quantifier elimination for the theory RCF of real closed fields: There is an algorithm which finds, for any first-order formula $\varphi(x_1,\dots,x_n)$, an equivalent quantifier-free formula $\psi(x_1,\dots,x_n)$. A sentence is just a formula with no free variables, and quantifier-free sentences are easily decidable.

OK, but the statement "there exist infinitely many $(x,y)$ such that $p(x,y) = 0$" isn't expressible as a first-order sentence - first-order logic doesn't have the quantifier "there exist infinitely many". So let's break down what this means.

Note that if $p$ has infinitely many roots, then either (1) there is some $a$ such that there are infinitely many $b$ with $p(a,b) = 0$, or (2) there are infinitely many $a$ such that there is some $b$ with $p(a,b) = 0$. So our condition is equivalent to $$(\exists x\,\exists^\infty y\,p(x,y) = 0)\lor (\exists^\infty x\,\exists y\,p(x,y) = 0),$$ where $\exists^\infty$ stands for "there exist infinitely many".

Looking at case (1) first, note that for any fixed $x = a$, $p(a,y)$ is a polynomial in $y$, and $\exists^\infty y\,p(a,y) = 0$ if and only if $p(a,y)$ is the zero polynomial. Writing $p(x,y) = \sum_{i=0}^d q_i(x)y^i$, we see that $p(a,y)$ is the zero polynomial if and only if $a$ is a common root of the polynomials $\{q_0(x),\dots,q_d(x)\}$. Thus, $\exists x\, \exists^\infty y\,p(x,y) = 0$ is equivalent to $\exists x\, \bigwedge_{i=0}^d q_i(x) = 0$, and now we can apply the decision procedure for RCF to answer this question.
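Concretely, case (1) comes down to a gcd computation in $\mathbb{Q}[x]$: the $q_i$ have a common root exactly when their gcd does. Here is a minimal Python sketch with exact rationals; the example $p(x,y) = (x^2-2)(y^2+y+1)$ and the final sign-change test are my own illustration (a sign change suffices for this example, but a complete real-root test would use Sturm's theorem):

```python
from fractions import Fraction

def trim(p):
    # drop trailing zero coefficients (lists are low degree first)
    while p and p[-1] == 0:
        p.pop()
    return p

def polyrem(a, b):
    # remainder of a modulo b over the rationals, b nonzero
    a = trim(list(a))
    while len(a) >= len(b):
        c = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, bi in enumerate(b):
            a[shift + i] -= c * bi
        a = trim(a)
    return a

def polygcd(a, b):
    a, b = trim(list(a)), trim(list(b))
    while b:
        a, b = b, polyrem(a, b)
    if a:  # normalise to a monic polynomial
        lead = a[-1]
        a = [ai / lead for ai in a]
    return a

def evalp(p, x):
    return sum(c * x**i for i, c in enumerate(p))

# p(x, y) = (x^2 - 2)(y^2 + y + 1): the coefficients of y^0, y^1, y^2
# are all q(x) = x^2 - 2.
q = [[Fraction(-2), Fraction(0), Fraction(1)]] * 3

g = q[0]
for qi in q[1:]:
    g = polygcd(g, qi)
print(g)  # x^2 - 2, i.e. [Fraction(-2, 1), Fraction(0, 1), Fraction(1, 1)]

# g changes sign on [0, 2], so it has a real root (x = sqrt(2)), and
# p vanishes on the whole line x = sqrt(2): infinitely many roots.
assert evalp(g, 0) < 0 < evalp(g, 2)
```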

Turning to case (2), let $\varphi(x)$ be the formula $\exists y\, p(x,y) = 0$. It is a consequence of quantifier elimination that any formula in one free variable defines a finite union of points and intervals (this is what it means to say that RCF is o-minimal), and a finite union of points and intervals is infinite if and only if it contains a nonempty open interval. Thus, $\exists^\infty x\, \varphi(x)$ is equivalent to $$\exists z_1\, \exists z_2\, ((z_1<z_2)\land (\forall w\, (z_1<w<z_2)\rightarrow \varphi(w))),$$ and now we can apply the decision procedure for RCF to answer this question.

Having given an algorithm to decide whether each of the cases is true, we're done.

You might want to dig deeper and see if you can replace the two applications of the decision-algorithm black box with more concrete (and efficient!) algorithms. In part (1) we just asked whether a system of polynomials in one variable has a common root; in one variable this amounts to computing the gcd of the $q_i$ by the Euclidean algorithm and testing whether the gcd has a real root. The application in part (2) was a bit more complicated, but it can be broken down as follows. First, we could apply quantifier elimination to $\varphi(x)$ (which was $\exists y\, p(x,y) = 0$). This amounts to coming up with a quantifier-free condition on $a$ which is equivalent to the polynomial $p(a,y)$ having a root, which can be done by working carefully through an application of Sturm's theorem to $p(a,y)$, keeping track of how properties of $a$ affect the answer. Now letting $\psi(x)$ be the quantifier-free equivalent of $\varphi(x)$, we can put $\psi(x)$ in disjunctive normal form as $\bigvee_{k=1}^m \psi_k(x)$ and ask whether any of the $\psi_k(x)$ defines a set containing a nonempty open interval. Each of the $\psi_k$ is a conjunction of atomic and negated atomic conditions, i.e. a system of polynomial equations and inequalities of the form $q_i(x) = 0$, $q_i(x) < 0$, or $q_i(x) > 0$. You can test whether the solution set of such a system contains a nonempty open interval by checking whether any of the polynomials involved share roots, and then approximating the roots sufficiently precisely.
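The Sturm-sequence computation mentioned above fits in a few lines. The following is a plain-Python implementation of Sturm's theorem for counting distinct real roots in an interval $(a, b]$, assuming the polynomial does not vanish at the endpoints; the example polynomial is my own:

```python
from fractions import Fraction

def trim(p):
    while p and p[-1] == 0:
        p.pop()
    return p

def polyrem(a, b):
    # remainder of a modulo b over the rationals (low degree first)
    a = trim(list(a))
    while len(a) >= len(b):
        c = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, bi in enumerate(b):
            a[shift + i] -= c * bi
        a = trim(a)
    return a

def deriv(p):
    return trim([i * c for i, c in enumerate(p)][1:])

def evalp(p, x):
    return sum(c * x**i for i, c in enumerate(p))

def sturm_chain(p):
    # p_0 = p, p_1 = p', p_{i+1} = -rem(p_{i-1}, p_i)
    chain = [trim(list(p)), deriv(p)]
    while chain[-1]:
        chain.append([-c for c in polyrem(chain[-2], chain[-1])])
    chain.pop()  # drop the final zero polynomial
    return chain

def sign_changes(chain, x):
    signs = [evalp(q, x) for q in chain]
    signs = [s for s in signs if s != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s * t < 0)

def count_real_roots(p, a, b):
    # number of distinct real roots in (a, b], with p(a) != 0 != p(b)
    chain = sturm_chain(p)
    return sign_changes(chain, a) - sign_changes(chain, b)

# x^3 - x = x(x - 1)(x + 1) has three real roots in (-2, 2]
p = [Fraction(0), Fraction(-1), Fraction(0), Fraction(1)]
print(count_real_roots(p, Fraction(-2), Fraction(2)))  # 3
```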


Bonus note: Despite the fact that first-order logic doesn't have the quantifier $\exists^\infty$ ("there exist infinitely many"), one can still talk about "elimination of the quantifier $\exists^\infty$". A theory $T$ eliminates $\exists^\infty$ if for every formula $\varphi(x_1,\dots,x_n,y)$, there is a formula $\psi(x_1,\dots,x_n)$ such that for any tuple $\overline{a}$ from a model $M$ of $T$, $M\models \psi(a_1,\dots,a_n)$ if and only if there are infinitely many $b$ such that $M\models \varphi(a_1,\dots,a_n,b)$. This elimination is effective if there's an algorithm which produces $\psi$ from $\varphi$.

Now RCF eliminates $\exists^\infty$ (in fact, every o-minimal theory does, see the Finiteness Lemma 3.1.7 in Tame topology and o-minimal structures by van den Dries). And I'm confident that it does so effectively, which would answer the question directly. But I don't know a reference off the top of my head, and writing down the algorithm would involve similar reasoning to what I wrote above, but in much greater generality.

Solution 2:

This question has been answered already, but it is possible to give more explicit algorithms. For $F\in\mathbb Q[X,Y]$, I will denote the partial derivatives by $F_X,F_Y$. Using the Euclidean algorithm (or similar) to factor out the greatest common divisor of $F,F_X,F_Y$, we can suppose that $F$ is square-free in $\mathbb Q[X,Y]$.

Lemma Suppose that $F\in\mathbb{Q}[X,Y]$ is square-free and has degree $d$. The following are equivalent.

  1. $F$ has finitely many roots.
  2. $F$ has at most $d(d-1)$ roots.
  3. Either $F\ge0$ everywhere or $F\le0$ everywhere.

By the second statement, it is enough to exhibit $d^2-d+1$ distinct roots of $F$ to conclude that it has infinitely many. By the third condition, we can alternatively try to show that $F$ takes both strictly positive and strictly negative values (it was noted in the question that this is a sufficient condition but, when $F$ is square-free, it is also necessary). So the following first-order statement is equivalent to $F$ having infinitely many roots $$ (\exists x\exists y\,F(x,y) > 0)\wedge(\exists x\exists y\,F(x,y) < 0).\qquad{\rm(1)} $$ The lemma is easy enough to prove.

1 implies 3: As noted in the question, this follows from the intermediate value theorem.

3 implies 2: At each root of $F$ we have $F=F_X=F_Y=0$, since otherwise $F$ would attain both strictly positive and strictly negative values in a neighbourhood of the root. For each irreducible factor $f$ of $F$, the square-free condition implies that $f$ does not divide both $F_X$ and $F_Y$. Bézout's theorem (applied to $f$ and whichever of $F_X,F_Y$ is not divisible by $f$) then states that $f=F_X=F_Y=0$ can have no more than ${\rm deg}(f)(d-1)$ solutions. Adding together the bounds over the irreducible factors of $F$ gives the bound $d(d-1)$.

2 implies 1: Obvious.

The simplest algorithm I can think of to determine the truth of (1) is simply to evaluate $F$ at a dense enough set of points and check whether it attains both positive and negative values. A suitable set of points can be determined as follows. Write $$ F(X,Y)=\sum_{k=0}^nc_kY^k=g(Y)\in\mathbb Q[X][Y], $$ where $c_k\in\mathbb Q[X]$. Viewing $g$ as a polynomial in the indeterminate $Y$ with coefficients in $\mathbb Q[X]$, compute its discriminant $D(X)\in\mathbb Q[X]$. The square-free condition implies that $D$ is not identically zero, so it has finitely many roots. As the number of real roots of $F(x,Y)$ (with respect to $Y$, for each fixed $x$) does not change as $x$ varies across each open interval in the complement of the set of zeros of $D$, we just need to check $F(x,Y)$ for at least one value of $x$ in each of these intervals.
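A hedged sketch of the discriminant computation: rather than computing $D(X)$ symbolically, the code below evaluates it at a chosen rational $x$ via the resultant of $F(x,Y)$ and $F_Y(x,Y)$ (Sylvester matrix determinant, exact rational arithmetic). The example $F = Y^2 - X$, with $D(X) = 4X$, is my own; a full implementation would recover $D(X)$ as a polynomial, e.g. by interpolation from enough evaluations.

```python
from fractions import Fraction

def det(m):
    # determinant by exact Gaussian elimination over the rationals
    m = [row[:] for row in m]
    n = len(m)
    d = Fraction(1)
    for col in range(n):
        piv = next((r for r in range(col, n) if m[r][col] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != col:
            m[col], m[piv] = m[piv], m[col]
            d = -d
        d *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return d

def resultant(f, g):
    # f, g: coefficient lists, highest degree first, nonzero leading terms
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = [[Fraction(0)] * i + [Fraction(c) for c in f]
            + [Fraction(0)] * (size - i - m - 1) for i in range(n)]
    rows += [[Fraction(0)] * i + [Fraction(c) for c in g]
             + [Fraction(0)] * (size - i - n - 1) for i in range(m)]
    return det(rows)

def discriminant_at(y_coeffs, x):
    # y_coeffs: the c_k as Python functions of x, highest degree in Y first
    f = [Fraction(c(x)) for c in y_coeffs]
    n = len(f) - 1
    f_prime = [(n - i) * f[i] for i in range(n)]  # d/dY, highest first
    sign = (-1) ** (n * (n - 1) // 2)
    return sign * resultant(f, f_prime) / f[0]

# F(X, Y) = Y^2 - X, so c_2 = 1, c_1 = 0, c_0 = -X, and D(X) = 4X
coeffs = [lambda x: 1, lambda x: 0, lambda x: -x]
print(discriminant_at(coeffs, Fraction(3)))  # 12, i.e. D(3) = 4*3
print(discriminant_at(coeffs, Fraction(0)))  # 0: Y^2 has a double root
```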

Finding an upper bound $K$ on the (absolute values of) zeros of $D$ in terms of its coefficients is standard. We can also find a lower bound on the gaps between the zeros of $D$. First, factoring out any repeated roots, let $D_{SF}(X)$ be square-free with the same zeros as $D$. Apply the extended Euclidean algorithm to write $p(X)D_{SF}(X)+q(X)D_{SF}^\prime(X)=1$. Between any two zeros of $D$ there will be a point $x$ with $D_{SF}^\prime(x)=0$, so $p(x)D_{SF}(x)=1$. Letting $A,B$ be upper bounds of $\lvert p\rvert,\lvert D_{SF}^\prime\rvert$ respectively on the interval $[-K,K]$, we get $\lvert D_{SF}(x)\rvert\ge1/A$. Since $\lvert D_{SF}^\prime\rvert\le B$, the mean value theorem shows that $x$ is a distance at least $1/(AB)$ from each root of $D$. Hence, the roots of $D$ are separated by at least $2/(AB)$.
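For the bound $K$, the classical Cauchy bound works: every real root $x$ of $a_nX^n+\dots+a_0$ satisfies $\lvert x\rvert\le 1+\max_i\lvert a_i/a_n\rvert$. A minimal sketch (the example polynomial is my own):

```python
from fractions import Fraction

def cauchy_bound(coeffs):
    # coeffs: highest degree first; every real root x of the polynomial
    # satisfies |x| <= 1 + max_i |a_i / a_n|
    lead = Fraction(coeffs[0])
    return 1 + max(abs(Fraction(c) / lead) for c in coeffs[1:])

# D(X) = X^2 - 3X + 2 = (X - 1)(X - 2): roots 1 and 2, bound 4
print(cauchy_bound([1, -3, 2]))  # 4
```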

So, choosing a set of points for $x$ covering the interval $[-K,K]$ and separated by less than $2/(AB)$, we are guaranteed to select points between each consecutive pair of roots of $D$. Next, for each such $x$, we can use the same method to find a grid of values for $y$ which includes points between each pair of zeros of $F(x,Y)$.

Evaluating $F(x,y)$ at the points described is enough to determine whether $F$ attains both positive and negative values. In practice, for efficiency, it would make sense to approximately locate the roots of $D(X)$ and $F(x,Y)$ to reduce the number of gridpoints.
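Putting the pieces together, the sign test itself is just a grid evaluation. In this sketch the grid over $[-2,2]$ is an ad hoc choice rather than one derived from the bounds $K$ and $2/(AB)$, and both example polynomials are my own:

```python
from fractions import Fraction

def attains_both_signs(F, xs, ys):
    # evaluate F on the grid and look for both a positive and a negative value
    vals = [F(x, y) for x in xs for y in ys]
    return any(v > 0 for v in vals) and any(v < 0 for v in vals)

grid = [Fraction(k, 4) for k in range(-8, 9)]  # [-2, 2] in steps of 1/4

circle = lambda x, y: x**2 + y**2 - 1
motzkin = lambda x, y: x**4*y**2 + x**2*y**4 - 3*x**2*y**2 + 1

# x^2 + y^2 - 1 takes both signs, so it has infinitely many roots
print(attains_both_signs(circle, grid, grid))   # True
# the Motzkin polynomial is nonnegative, so no grid finds a sign change
print(attains_both_signs(motzkin, grid, grid))  # False
```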

The algorithm just described is theoretically quite simple, but does not work in the general theory of real-closed fields. The number of steps is not uniformly bounded by the degree of $F$, so cannot be represented by a single quantifier-free statement in the first order theory of the reals. It is possible to obtain an algorithm by directly eliminating the quantifiers in (1), but it is a bit more difficult to describe.