Is $g(x)= \frac{x_1^2}{r_1^2}+\frac{x_2^2}{r_2^2}- c$ a unique solution of $E[g(Z)\mid M=\mu]=0, \forall \mu \in \text{ellipse } C$ for $Z$ Gaussian

Let $Z \in \mathbb{R}^2$ be a Gaussian vector with mean $M$ and independent unit-variance components (i.e., $Z-M$ is a standard i.i.d. Gaussian vector), and let $P_{Z\mid M}$ denote its conditional distribution.

Let $g: \mathbb{R}^2 \to \mathbb{R}$ and consider the following equation: $$ E[g(Z)\mid M=\mu]=0, \forall \mu \in C, $$ where $C=\{\mu: \frac{\mu_1^2}{r_1^2}+\frac{\mu_2^2}{r_2^2}=1 \}$ for some given $r_1,r_2 > 0$. That is, $C$ is an ellipse.

It is not difficult to verify (see this question, see also Edit 3) that a solution to this equation is given by $$ g(x)= \frac{x_1^2}{r_1^2}+\frac{x_2^2}{r_2^2}- c, $$ where $c=\frac{1}{r_1^2}+\frac{1}{r_2^2}+1$. In fact any function $g_a(x)= a g(x)$ for any $a \in \mathbb{R}$ is a solution.

Question: Is $g$ a unique solution up to a multiplicative constant?

Edit: If we need to make an assumption on the class of allowed functions $g$, let us assume that the $g$'s are bounded by a quadratic (i.e., for every $g$ there exist $a$ and $b$ such that $g(x) \le a \|x \|^2 +b$).

Edit 2: If you want to avoid expectation notation, everything can alternatively be written as $$ \iint g(z)\frac{1}{2 \pi} e^{-\frac{\|z-m\|^2}{2}} \, {\rm d} z=0, \quad m\in C. $$ From here, one can see that this question is about a convolution.
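To make this convolution form concrete, here is a minimal numerical sketch of mine (the values of $r_1,r_2$ are arbitrary example choices, and numpy/scipy are assumed) that evaluates the integral at a few points $m$ on the ellipse:

```python
import numpy as np
from scipy.integrate import dblquad

r1, r2 = 2.0, 3.0                      # example semi-axes (assumption)
c = 1/r1**2 + 1/r2**2 + 1

def conv_at(m1, m2):
    # (1/2pi) * iint g(z) exp(-|z-m|^2/2) dz, with g as in the question
    f = lambda z2, z1: ((z1**2/r1**2 + z2**2/r2**2 - c)
                        * np.exp(-((z1 - m1)**2 + (z2 - m2)**2)/2) / (2*np.pi))
    val, _ = dblquad(f, -np.inf, np.inf, -np.inf, np.inf)
    return val

for theta in np.linspace(0, np.pi, 4):
    m1, m2 = r1*np.cos(theta), r2*np.sin(theta)   # a point on C
    print(round(conv_at(m1, m2), 8))              # ≈ 0 in every case
```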

Edit 3: To see that $g(x)$ is a solution we use that the second moment of a Gaussian is given by $E[Z_i^2\mid M_i=\mu_i]=1+\mu_i^2$, which leads to \begin{align} E\left[\frac{Z_1^2}{r_1^2}+\frac{Z_2^2}{r_2^2}- c\mid M=\mu \right]&= \frac{E[Z_1^2\mid M_1=\mu_1]}{r_1^2}+\frac{ E[Z_2^2\mid M_2=\mu_2]}{r_2^2}-c\\[6pt] &=\frac{1+\mu_1^2}{r_1^2}+\frac{ 1+\mu_2^2}{r_2^2}-c\\[6pt] &=\frac{1}{r_1^2}+\frac 1 {r_2^2}+1-c=0, \end{align} where in the last step we used that $\mu$ is on the ellipse.
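The same computation can be checked by quick Monte Carlo (again a sketch of mine, with example values of $r_1,r_2$ and of the point $\mu$ on the ellipse):

```python
import numpy as np

rng = np.random.default_rng(0)
r1, r2 = 2.0, 3.0                      # example semi-axes (assumption)
c = 1/r1**2 + 1/r2**2 + 1

theta = 0.7                            # arbitrary point on the ellipse C
mu = np.array([r1*np.cos(theta), r2*np.sin(theta)])
Z = mu + rng.standard_normal((10**6, 2))   # Z | M = mu  ~  N(mu, I_2)

g = Z[:, 0]**2/r1**2 + Z[:, 1]**2/r2**2 - c
print(g.mean())                        # ≈ 0 up to Monte Carlo error
```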

Edit 4: The comment below shows that the solution is not unique when the ellipse is a circle. However, I would still like to know the answer for a general ellipse.


Solution 1:

That was a very natural question to me, and I really enjoyed the journey of exploring it! Let me give three different answers, which for me correspond to three different levels of understanding of the problem.

(Major edit on June 18th, 2021)

Three approaches

The three answers are based respectively on

  1. functions with circular symmetry: this approach gives solutions for the case of a circle;
  2. rescaling quasi-symmetry of gaussians: this approach produces new solutions from known ones, and it works for general ellipses;
  3. inverse of the Gaussian convolution: this approach, which I prefer to carry out using Hermite polynomials, in principle works for very general subsets, not only for ellipses.

Here I provide, briefly, the three solutions. More details on each of them are postponed to the end of this post.


Solution 1: The case of a circle and circular symmetry

Here I discuss the case where the ellipse is a circle of radius $R$ (i.e., $r_1=r_2=R$ in the question's notation).

Proposition 1. Let $g_1,g_2$ be two integrable functions on $\mathbb R^2$ with circular symmetry. Let $W$ be a Gaussian vector with independent unit-variance components and mean $M=(R,0)$. Let $E_i$ be the expectation of $g_i(W)$. Then $g(x) = E_1 g_2(x) - E_2 g_1(x)$ is a solution. (To get a solution that is not identically zero, take $g_1,g_2$ non-proportional with $(E_1,E_2)\neq(0,0)$.)

Here is the idea: if the ellipse is a circle, then, since the standard gaussian distribution looks the same from every direction, the problem has rotational symmetry. It is then natural to search for solutions that are themselves rotationally symmetric.

Remark 1: In Proposition 1 we already see, at least in the case of a circle, that the conditions of the problem (namely, those of vanishing expectation when the mean varies along an ellipse) give a system of equations that is not restrictive enough to cut out a one-dimensional set of solutions.
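To make Proposition 1 concrete, here is a minimal Monte Carlo sketch; the choices $g_1(x)=e^{-|x|^2}$, $g_2(x)=|x|^2$ and the radius $R$ are example assumptions of mine:

```python
import numpy as np

rng = np.random.default_rng(0)
R, n = 2.0, 10**6                      # circle radius and sample size (assumptions)

g1 = lambda x: np.exp(-np.sum(x**2, axis=-1))   # circularly symmetric
g2 = lambda x: np.sum(x**2, axis=-1)            # circularly symmetric

W = np.array([R, 0.0]) + rng.standard_normal((n, 2))
E1, E2 = g1(W).mean(), g2(W).mean()    # Monte Carlo estimates of E_1, E_2

# g = E1*g2 - E2*g1 should average to ~0 for a mean anywhere on the circle
for theta in (0.0, 1.0, 2.5):
    m = np.array([R*np.cos(theta), R*np.sin(theta)])
    Z = m + rng.standard_normal((n, 2))
    print(theta, (E1*g2(Z) - E2*g1(Z)).mean())  # ≈ 0 up to Monte Carlo error
```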


Solution 2: A family of solutions for the general case of an ellipse

Let now $r_1,r_2>0$ be arbitrary, so we consider the case of an ellipse of equation $\frac {x_1^2}{r_1} + \frac {x_2^2}{r_2} =1$ (note that here $r_i$ plays the role of $r_i^2$ in the question's notation).

We have already seen (in the question itself) that the function $$g_r(x)= \left(\frac {x_1^2}{r_1} + \frac {x_2^2}{r_2}\right)- \left(\frac 1 {r_1} + \frac 1 {r_2} +1\right)$$ is a solution to the problem. We now construct a one-parameter family of solutions.

Proposition 2: For every $t>-1/2$, the function $$ g_{r,t}(x) = \left ((1+2t)^2 \left(\frac {x_1^2}{r_1} + \frac {x_2^2}{r_2}\right)- (1+2t)\left(\frac {1}{r_1} + \frac {1}{r_2}\right) -1\right) e^{-t|x|^2 }$$ is a solution to the problem.

Here is the idea: Gaussians behave well with respect to multiplication by other gaussians. Moreover, if a function can be written as $g(x)=h(x)e^{-t|x|^2}$ for some reasonably growing factor $h(x)$, then it automatically behaves mildly at infinity, as required at some point in the question. It therefore makes sense to look for solutions of this form. In this particular case, it turns out that the factor $h(x)$ can be derived from a solution of the problem associated with a smaller ellipse.

Remark 2: With Proposition 2, we see that there are infinitely many solutions (those with $t\ge 0$) that are bounded by a quadratic. In fact, there are many solutions that tend to zero at infinity: those corresponding to parameters $t>0$.
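As a sanity check on Proposition 2, here is a small Monte Carlo sketch of mine (the values of $r_1, r_2, t$ are example choices, in this answer's notation where the ellipse is $f_r(x)=1$):

```python
import numpy as np

rng = np.random.default_rng(0)
r1, r2, t = 4.0, 9.0, 0.3              # example values (assumptions)
s = 1 + 2*t

def g_rt(x):
    f = x[:, 0]**2/r1 + x[:, 1]**2/r2
    return (s**2*f - s*(1/r1 + 1/r2) - 1) * np.exp(-t*np.sum(x**2, axis=1))

for theta in (0.4, 1.3, 3.0):
    # a point on the ellipse x1^2/r1 + x2^2/r2 = 1
    m = np.array([np.sqrt(r1)*np.cos(theta), np.sqrt(r2)*np.sin(theta)])
    Z = m + rng.standard_normal((10**6, 2))
    print(theta, g_rt(Z).mean())       # ≈ 0 up to Monte Carlo error
```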


Solution 3: Inverse of the Gaussian convolution

Given an integrable function $h:\mathbb R^2\to\mathbb{C}$ we define, for every $m\in\mathbb R^2$, the shifted gaussian average operator $$ E_m[h] = \frac 1 {2\pi}\iint h(x) e^{-|x-m|^2/2} dx_1 dx_2 .$$

We also define, for every $x\in\mathbb R^2$, the dual operator $\hat E_x[\cdot]$ given by $$ \hat E_x[h] = \frac 1 {2\pi}\iint h(im) e^{-|m+ix|^2/2} dm_1 dm_2 .$$

Proposition 3: Let $f\in\mathbb C[m_1,m_2]$ be a polynomial function that vanishes identically on the ellipse. Then the function $g(x)=\hat E_x[f]$ is a solution to the problem.

Here is the idea: The operator $E_m[\cdot]$ corresponds to convolution with a standard gaussian distribution centered at $m\in \mathbb R^2$. The dual operator $\hat E_x[\cdot]$ is basically an inverse operation that "unwinds" the convolution.

Remark 3.1: The functions in Proposition 3 do not necessarily have to be polynomials, but I decided to restrict to this case, which is actually very natural to me.

Remark 3.2: I preferred to treat this case using the theory of Hermite polynomials (more details below, in the appropriate section), and I hope I have not made any mistakes. The main advantage is that it makes the theory practical for computations (see the next section). Another advantage is that the proofs are derived directly from known properties of Hermite polynomials.


Examples for solution 3: Hermite polynomials

I strongly believe that abstract solutions should come with practical examples, whenever possible.

Example 1: Let $f(m) = m_1^2 /r_1 + m_2^2/r_2 - 1$. This is a function that vanishes identically on the ellipse. Now let $g(x) = \hat E_x [f]$.

At the end of the post, I will show that the operator $\hat E_x[\cdot]$ can be computed explicitly as follows on monomials $$\hat E_x[m_1^pm_2^q] = He_p(x_1) He_q(x_2), $$ where the $He_n(t)$ denote the (probabilist's) Hermite polynomials $$ He_0(t) = 1 ,$$ $$ He_1(t) = t ,$$ $$ He_2(t) = t^2 -1, $$ $$ He_3(t) = t^3 - 3t, $$ $$ He_4(t) = t^4 - 6t^2 + 3, $$ etc.

Therefore we have that $$ g(x) = He_2(x_1) / r_1 + He_2(x_2)/ r_2 - 1 $$ is the function already found by Boby in the question.
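Numerically, the operator on monomials can be implemented with numpy's probabilist's Hermite module; here is a sketch of mine checking Example 1 by Monte Carlo (the values of $r_1,r_2,\theta$ are example choices):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval   # probabilist's He_n

# hat_E on the monomial m1^p m2^q gives He_p(x1)*He_q(x2),
# evaluated here at sample points x of shape (n, 2)
def hatE_monomial(p, q, x):
    cp = np.zeros(p + 1); cp[p] = 1.0   # coefficient vector selecting He_p
    cq = np.zeros(q + 1); cq[q] = 1.0
    return hermeval(x[:, 0], cp) * hermeval(x[:, 1], cq)

rng = np.random.default_rng(0)
r1, r2 = 4.0, 9.0                       # example values (this answer's notation)
theta = 1.1                             # arbitrary point on the ellipse f_r = 1
m = np.array([np.sqrt(r1)*np.cos(theta), np.sqrt(r2)*np.sin(theta)])
Z = m + rng.standard_normal((10**6, 2))

# g = hat_E_x[f] for f(m) = m1^2/r1 + m2^2/r2 - 1 (Example 1)
g = hatE_monomial(2, 0, Z)/r1 + hatE_monomial(0, 2, Z)/r2 - 1
print(g.mean())                         # ≈ 0 up to Monte Carlo error
```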

Example 2: Let $f(m) = \left( m_1^2 /r_1 + m_2^2/r_2 - 1\right)^2$. This function vanishes on the ellipse identically.

We have $$ f(m) = \frac {m_1^4} {r_1^2} + \frac {m_2^4} {r_2^2} + \frac {2m_1^2 m_2^2} {r_1r_2} - \frac {2m_1^2} {r_1} - \frac {2m_2^2} {r_2} + 1.$$ Therefore if we let $g(x) = \hat E_x[f]$, we have that $$ g(x) = \frac {He_4(x_1)} {r_1^2} + \frac {He_4(x_2)} {r_2^2} + \frac {2He_2(x_1)He_2(x_2)} {r_1r_2} - \frac {2He_2(x_1)} {r_1} - \frac {2He_2(x_2)} {r_2} + 1$$ is another solution to the original problem.
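The same kind of check works for this quartic solution; a self-contained sketch of mine (example parameter values again):

```python
import numpy as np

rng = np.random.default_rng(0)
r1, r2 = 4.0, 9.0                      # example values (this answer's notation)
He2 = lambda t: t**2 - 1               # probabilist's Hermite polynomials
He4 = lambda t: t**4 - 6*t**2 + 3

def g(x1, x2):
    return (He4(x1)/r1**2 + He4(x2)/r2**2 + 2*He2(x1)*He2(x2)/(r1*r2)
            - 2*He2(x1)/r1 - 2*He2(x2)/r2 + 1)

for theta in (0.2, 2.0):
    # a point on the ellipse x1^2/r1 + x2^2/r2 = 1
    m1, m2 = np.sqrt(r1)*np.cos(theta), np.sqrt(r2)*np.sin(theta)
    Z = rng.standard_normal((10**6, 2)) + np.array([m1, m2])
    print(theta, g(Z[:, 0], Z[:, 1]).mean())   # ≈ 0 up to Monte Carlo error
```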


Proof of Proposition 1

Since the two functions $g_1, g_2$ have circular symmetry, if we compute their averages with respect to a Gaussian centered at any point of the circle, we get the same results $E_1$ and $E_2$. To see this explicitly, one can write down the change of variables given by the rotation of the plane that sends a point $m$ of the circle to $(R,0)$. Now, since the averages of the $g_i$ are $E_i$, the expectation of the combination $g=E_1 g_2 - E_2 g_1$ is simply $E_1E_2-E_2E_1=0$. This shows that $g$ is a solution to the problem posed. The key point here is that the expectation of a function with circular symmetry does not change if we rotate the probability measure.


Proof of Proposition 2

Let $f_r(x) = \frac {x_1^2}{r_1} + \frac {x_2^2}{r_2} $ and $ c_r= \frac 1 {r_1} + \frac 1 {r_2} +1 $.

We make the $u$-substitution $$(u_1,u_2) = (\sqrt {1+2t} x_1,\sqrt {1+2t} x_2) ,$$ so that $e^{-t|x|^2}e^{-|x|^2/2} = e^{-|u|^2/2} $.

Then, given $m=(m_1,m_2)$, the expression $e^{-t|x|^2 }e^{-|x-m|^2/2}$ is equal to: $$ = e^{-\frac 1 2 \left((1+2t)(x_1^2+x_2^2) -2m_1x_1-2m_2x_2 + m_1^2+m_2^2\right)} $$ $$ = e^{-\frac {1} {2} |u-\frac m {\sqrt{1+2t}}|^2 } e^{ -\frac {1} {2} |m|^2+\frac {1} {2} \frac{|m|^2}{1+2t}} .$$
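As a side check, the exponent identity in this completing-the-square step can be confirmed symbolically; a minimal sympy sketch (symbols and setup are mine):

```python
import sympy as sp

x1, x2, m1, m2 = sp.symbols('x1 x2 m1 m2', real=True)
t = sp.symbols('t', positive=True)     # t > -1/2 suffices; positive for simplicity
s = 1 + 2*t
u1, u2 = sp.sqrt(s)*x1, sp.sqrt(s)*x2  # the u-substitution

# exponent of e^{-t|x|^2} e^{-|x-m|^2/2}
lhs = -t*(x1**2 + x2**2) - ((x1 - m1)**2 + (x2 - m2)**2)/2
# exponent of the completed-square form above
rhs = (-((u1 - m1/sp.sqrt(s))**2 + (u2 - m2/sp.sqrt(s))**2)/2
       - (m1**2 + m2**2)/2 + (m1**2 + m2**2)/(2*s))
print(sp.simplify(lhs - rhs))          # prints 0
```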

Now, let $$d_{r,t}=(1+2t)\left(\frac {1}{r_1} + \frac {1}{r_2}\right) +1.$$ The convolution of the gaussian with $g_{r,t}$ is $$ \frac 1{2\pi} \iint g_{r,t}(x) e^{-|x-m|^2/2} dx$$ $$ = \frac 1{2\pi} \iint \left( (1+2t)^2f_r(x) -d_{r,t}\right) e^{-t|x|^2}e^{-|x-m|^2/2} dx $$ $$ = \frac 1 {2\pi} \frac 1 {1+2t} e^{-\frac t {1+2t} |m|^2} \iint \left( (1+2t)^2f_r(u/\sqrt{1+2t}) - d_{r,t}\right) e^{-\frac 1 2 \left|u-\frac m{\sqrt {1+2t}}\right|^2} du $$ $$ = \frac 1 {2\pi} \frac 1 {1+2t} e^{-\frac t {1+2t} |m|^2} \iint \left( (1+2t)f_r(u) - d_{r,t}\right) e^{-\frac 1 2 \left|u-\frac m{\sqrt {1+2t}}\right|^2} du .$$

Note that $m/\sqrt{1+2t}$ belongs to the ellipse of equation $f_{r/(1+2t)} (x)= 1$. Moreover, we have $$(1+2t)f_r(u) - d_{r,t} = f_{r/(1+2t)}(u) - c_{r/(1+2t)}.$$ In fact, we designed the function $g_{r,t}$ precisely to get to this point.
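This identity is elementary algebra; for the record, a one-line sympy check (symbols mine):

```python
import sympy as sp

u1, u2, r1, r2, t = sp.symbols('u1 u2 r1 r2 t', positive=True)
s = 1 + 2*t
f = lambda a, b: u1**2/a + u2**2/b     # f_r(u) for r = (a, b)
c = lambda a, b: 1/a + 1/b + 1         # c_r
d = s*(1/r1 + 1/r2) + 1                # d_{r,t}
print(sp.simplify(s*f(r1, r2) - d - (f(r1/s, r2/s) - c(r1/s, r2/s))))  # 0
```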

Finally, we observe that the double integral in the displayed equation is equal to zero, because it is an instance of the solution originally given by Boby in the question, for the shrunken ellipse of equation $f_{r/(1+2t)} (x)= 1$.


Proof of Proposition 3

First, let me recall that the normal moments in one variable are given by "dual Hermite polynomials" (as mentioned in an answer to this math.SE question): $$ E_m[x_1] = m_1, \quad \quad E_m[x_2] = m_2,$$ $$ E_m[x_1^2] = m_1^2+1, \quad \quad E_m[x_2^2] = m_2^2+1,$$ $$ E_m[x_1^3] = m_1^3+3m_1, \quad \quad E_m[x_2^3] = m_2^3+3m_2,$$ $$ E_m[x_1^4] = m_1^4+6m_1^2+3, \quad \quad E_m[x_2^4] = m_2^4+6m_2^2 +3.$$

Compare with the first few (probabilist's) Hermite polynomials $$ He_0(t) = 1 ,$$ $$ He_1(t) = t ,$$ $$ He_2(t) = t^2 -1, $$ $$ He_3(t) = t^3 - 3t, $$ $$ He_4(t) = t^4 - 6t^2 + 3 .$$

For the record Hermite polynomials are explicitly given by the formula (here the signs alternate) $$He_n(t) = n! \sum_{k=0} ^{\lfloor n/2\rfloor} \frac {(-1)^k t^{n-2k}}{2^kk!(n-2k)!},$$ and the mixed moments of the shifted standard gaussian are $$ E_m[x_1^p x_2^q] = i^{-p-q} He_p(im_1) He_q(im_2),$$ where $i=\sqrt {-1}$ is a choice of imaginary unit.

On the other hand, the operator $\hat E_x[\cdot]$ is given on monomials by $$\hat E_x[m_1^pm_2^q] = E_{-ix}[(im_1)^p(im_2)^q] = He_p(x_1) He_q(x_2).$$

Finally, we have the following property of Hermite polynomials (compare with Remark 4 below): $$ E_m[He_n(x_1)] = m_1^n, \quad \quad E_m[He_n(x_2)] = m_2^n . $$

By linearity, this implies that $E_m$ and $\hat E_x$ are inverse operators, in the following sense: $$ E_m [\hat E_x [f(m_1,m_2)]] = f(m_1,m_2) $$ for every polynomial function $f$.
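As a numerical sanity check of the key property $E_m[He_n(x_1)] = m_1^n$, here is a quick Monte Carlo sketch of mine (example values of $m_1$ and $n$):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval   # probabilist's He_n

rng = np.random.default_rng(0)
m1, n = 0.8, 3                         # example values (assumptions)
c = np.zeros(n + 1); c[n] = 1.0        # coefficient vector selecting He_n
X = m1 + rng.standard_normal(10**7)    # X ~ N(m1, 1)
print(hermeval(X, c).mean(), m1**n)    # both ≈ 0.512
```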

Remark 4: For the record, the Hermite polynomials form a basis of the set of polynomials. This is explicitly given by the dual summation formula (the signs are all positive here!) $$ t^n = n! \sum_{k=0} ^{\lfloor n/2\rfloor} \frac { He_{n-2k}(t)}{2^kk!(n-2k)!}.$$
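A quick numerical check of this dual summation formula (the values of $t$ and $n$ are example choices of mine):

```python
import numpy as np
from math import factorial
from numpy.polynomial.hermite_e import hermeval

def He(k, t):
    c = np.zeros(k + 1); c[k] = 1.0    # coefficient vector selecting He_k
    return hermeval(t, c)

t, n = 1.7, 6                          # example values (assumptions)
total = sum(He(n - 2*k, t) / (2**k * factorial(k) * factorial(n - 2*k))
            for k in range(n//2 + 1))
print(factorial(n) * total, t**n)      # the two values agree
```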

Remark 5: To treat more general functions, one may use the following facts:

  1. The shifted gaussians are generating functions of the Hermite functions: $$e^{-(x-y)^2/2} = e^{-x^2/2} \sum_{k=0} ^\infty He_k(x) \frac {y^k}{k!};$$

  2. The Hermite polynomials satisfy the following orthogonality relation: $$ \int_{-\infty} ^\infty He_k(t)\, He_n(t)\, \frac {e^{-t^2/2}}{n!\sqrt {2\pi}}\, dt = \delta_{k,n},$$

and so $$ \int_{-\infty} ^\infty He_n (x) \frac {e^{-(x-y)^2/2}}{\sqrt {2\pi}} dx = y^n.$$
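The last identity is easy to confirm by quadrature; a minimal sketch of mine (example $n$ and $y$):

```python
import numpy as np
from scipy.integrate import quad
from numpy.polynomial.hermite_e import hermeval

n, y = 4, 0.6                          # example values (assumptions)
c = np.zeros(n + 1); c[n] = 1.0        # coefficient vector selecting He_n
integrand = lambda x: hermeval(x, c) * np.exp(-(x - y)**2 / 2) / np.sqrt(2*np.pi)
val, _ = quad(integrand, -np.inf, np.inf)
print(val, y**n)                       # both ≈ 0.1296
```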

Solution 2:

This will not answer the question as stated, but whoever asks such a question may be interested in what I say here.

First, a random variable with a "Gaussian" or "normal" distribution need not have variance $1,$ so I would have stated what the variance is.

In standard terminology, the question is asking which unbiased estimators of zero this parameterized family of probability distributions has (where the ellipse is the parameter space).

A family of probability distributions that admits no nonzero unbiased estimators of zero is called "complete", so the question is how far this family is from being complete.

The family of all univariate Gaussian distributions, whose parameter space may be taken to be the half-plane $\{(\mu,\sigma) : \mu\in\mathbb R,\, \sigma>0\},$ is complete, and that fact is essentially the same thing as the one-to-one nature of the two-sided Laplace transform.

A corollary of the completeness of this family of all Gaussian distributions is that, with an i.i.d. sample, the sample mean has a smaller variance than any other unbiased estimator of the population mean. For example, think about how to prove that the sample median has a larger variance than the sample mean: with this result, you get that without computing the variance of the sample median.
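To illustrate the corollary, here is a small simulation sketch (the sample size, replication count and parameter values are arbitrary choices of mine) comparing the two variances for Gaussian data:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 101, 20_000                  # sample size and replications (assumptions)
samples = rng.normal(loc=1.0, scale=1.0, size=(reps, n))

print("var(sample mean):  ", samples.mean(axis=1).var())       # ≈ 1/n ≈ 0.0099
print("var(sample median):", np.median(samples, axis=1).var()) # ≈ pi/(2n) ≈ 0.0155
```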