I have the following system of equations where the $m$'s are known but $a, b, c, x, y, z$ are unknown. How does one go about solving this system? All the usual linear algebra tricks I know don't apply and I don't want to do it through tedious substitutions.

$$\begin{aligned} a + b + c &= m_0 \\ ax + by + cz &= m_1 \\ ax^2 + by^2 + cz^2 &= m_2 \\ ax^3 + by^3 + cz^3 &= m_3 \\ ax^4 + by^4 + cz^4 &= m_4 \\ ax^5 + by^5 + cz^5 &= m_5 \end{aligned}$$


Solution 1:

Let $\displaystyle x,y,z$ be roots of $\displaystyle t^3 + pt^2 + qt + r = 0$.

i.e.

$\displaystyle x^3 + px^2 + qx + r = 0$      ----- (1)

$\displaystyle y^3 + py^2 + qy + r = 0$    ----- (2)

$\displaystyle z^3 + pz^2 + qz + r = 0$    ----- (3)

Multiplying (1) by $\displaystyle a$, (2) by $\displaystyle b$, and (3) by $\displaystyle c$ and adding gives

$\displaystyle m_3 + p m_2 + q m_1 + rm_0 = 0$    ----- (4)

Multiplying (1) by $\displaystyle ax$, (2) by $\displaystyle by$, and (3) by $\displaystyle cz$ and adding gives

$\displaystyle m_4 + pm_3 + qm_2 + rm_1 = 0$    ----- (5)

Multiplying (1) by $\displaystyle ax^2$, (2) by $\displaystyle by^2$, and (3) by $\displaystyle cz^2$ and adding gives

$\displaystyle m_5 + pm_4 + qm_3 + rm_2 = 0$    ----- (6)

Now (4), (5), (6) form a set of three linear equations in the three unknowns $\displaystyle p,q,r$ and can be solved easily.

This gives us the cubic that $\displaystyle x,y,z$ satisfy (since we now know what $\displaystyle p,q,r$ are), which can be solved using Cardano's method, or more simply by the trigonometric and hyperbolic methods.

Once you know $\displaystyle x,y,z$ you can solve for $\displaystyle a,b,c$, as those become just linear equations.
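The recipe above is easy to put into code. Here is a minimal numeric sketch in Python (the helper names `solve3` and `cubic_roots` are mine, not from the answer), run on the example moments $m_j = j!$, for which the cubic has three distinct real roots:

```python
from math import acos, cos, factorial, pi, sqrt

def solve3(M, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    A = [row[:] + [v] for row, v in zip(M, rhs)]
    for i in range(3):
        piv = max(range(i, 3), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        for r in range(i + 1, 3):
            f = A[r][i] / A[i][i]
            A[r] = [u - f * w for u, w in zip(A[r], A[i])]
    sol = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        sol[i] = (A[i][3] - sum(A[i][j] * sol[j] for j in range(i + 1, 3))) / A[i][i]
    return sol

def cubic_roots(p, q, r):
    """Real roots of t^3 + p t^2 + q t + r = 0 (three-distinct-real-roots case),
    by the trigonometric method."""
    P = q - p * p / 3.0                        # depress via t = s - p/3:
    Q = 2.0 * p**3 / 27.0 - p * q / 3.0 + r    # s^3 + P s + Q = 0
    amp = 2.0 * sqrt(-P / 3.0)
    theta = acos(3.0 * Q / (P * amp)) / 3.0
    return [amp * cos(theta - 2.0 * pi * k / 3.0) - p / 3.0 for k in range(3)]

m = [factorial(j) for j in range(6)]           # example moments m_j = j!

# Equations (4)-(6): three linear equations for p, q, r.
p, q, r = solve3([[m[2], m[1], m[0]],
                  [m[3], m[2], m[1]],
                  [m[4], m[3], m[2]]], [-m[3], -m[4], -m[5]])

x, y, z = cubic_roots(p, q, r)                 # roots of t^3 + p t^2 + q t + r

# With x, y, z in hand, the first three equations are linear in a, b, c.
a, b, c = solve3([[1.0, 1.0, 1.0], [x, y, z], [x*x, y*y, z*z]], m[:3])

# All six original equations are now satisfied.
for j in range(6):
    assert abs(a * x**j + b * y**j + c * z**j - m[j]) < 1e-7
```

For these moments the cubic comes out as $t^3 - 9t^2 + 18t - 6 = 0$, and the roots and coefficients recover all six moment equations to machine precision.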

Solution 2:

It seems that I am a wee bit late to this particular party, but I thought I'd show yet another way to resolve the OP's algebraic system, which I'll recast here in different notation:

$$\begin{aligned} w_1+w_2+w_3&=m_0\\ w_1 x_1+w_2 x_2+w_3 x_3&=m_1\\ w_1 x_1^2+w_2 x_2^2+w_3 x_3^2&=m_2\\ w_1 x_1^3+w_2 x_2^3+w_3 x_3^3&=m_3\\ w_1 x_1^4+w_2 x_2^4+w_3 x_3^4&=m_4\\ w_1 x_1^5+w_2 x_2^5+w_3 x_3^5&=m_5 \end{aligned}$$

and the problem is to find the $x_i$ and the $w_i$ satisfying the system of equations.

The key here is to recognize that this is exactly the problem of recovering the nodes $x_i$ and weights $w_i$ of an $n$-point Gaussian quadrature rule with some weight function $w(u)$ and some support interval $[a,b]$, given the moments $m_j=\int_a^b w(u)u^j \mathrm du,\quad j=0\dots2n-1$. Recall that $n$-point rules are designed to exactly integrate functions of the form $w(u)p(u)$, where $p(u)$ is a polynomial of degree at most $2n-1$, and this is the reason why we have $2n$ equations.

As is well known, the nodes and the weights of a Gaussian quadrature can be obtained if we know the orthogonal polynomials $P_k(u)$ associated with the weight function $w(u)$. This is due to the fact that the nodes of an $n$-point Gaussian rule are the roots of the orthogonal polynomial $P_n(u)$. The first phase of the problem, now, is determining the set of orthogonal polynomials from the moments.

Luckily, in 1859(!), Chebyshev obtained a method for determining the recursion coefficients $a_k$, $b_k$ for the orthogonal polynomial recurrence

$$P_{k+1}(u)=(u-a_k)P_k(u)-b_k P_{k-1}(u)\quad P_{-1}(u)=0,P_0(u)=1$$

when given the $m_j$. Chebyshev's algorithm goes as follows: initialize the quantities

$$\sigma_{-1,l}=0,\quad l=1,\dots,2n-2$$ $$\sigma_{0,l}=m_l,\quad l=0,\dots,2n-1$$ $$a_0=\frac{m_1}{m_0}$$ $$b_0=m_0$$

and then perform the recursion

$$\sigma_{k,l}=\sigma_{k-1,l+1}-a_{k-1}\sigma_{k-1,l}-b_{k-1}\sigma_{k-2,l}$$

for $l=k,\dots,2n-k-1$ and $k=1,\dots,n-1$, from which the recursion coefficients for $k=1,\dots,n-1$ are given by

$$a_k=\frac{\sigma_{k,k+1}}{\sigma_{k,k}}-\frac{\sigma_{k-1,k}}{\sigma_{k-1,k-1}}$$ $$b_k=\frac{\sigma_{k,k}}{\sigma_{k-1,k-1}}$$

I'll skip the details of how the algorithm was obtained, and will instead tell you to look at this paper by Walter Gautschi where he discusses these things.
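As a sanity check on the recursion, here is a direct Python transcription (a sketch; the function name `cheb_coeffs` is mine). Unlike the two-vector optimization in the Mathematica code below, it keeps whole rows of $\sigma$'s for clarity:

```python
from math import factorial

def cheb_coeffs(m):
    """Recursion coefficients a_k, b_k (k = 0..n-1) from moments m_0..m_{2n-1},
    via Chebyshev's algorithm as stated above."""
    n = len(m) // 2
    a = [m[1] / m[0]]                 # a_0 = m_1 / m_0
    b = [m[0]]                        # b_0 = m_0
    prev = [0.0] * (2 * n)            # sigma_{-1, l} = 0
    curr = list(m)                    # sigma_{ 0, l} = m_l
    for k in range(1, n):
        nxt = [0.0] * (2 * n)
        for l in range(k, 2 * n - k):
            nxt[l] = curr[l + 1] - a[k - 1] * curr[l] - b[k - 1] * prev[l]
        a.append(nxt[k + 1] / nxt[k] - curr[k] / curr[k - 1])
        b.append(nxt[k] / curr[k - 1])
        prev, curr = curr, nxt
    return a, b

a, b = cheb_coeffs([factorial(j) for j in range(6)])
print(a, b)   # for m_j = j! this gives a = [1.0, 3.0, 5.0], b = [1, 1.0, 4.0]
```

This reproduces the `{{1., 3., 5.}, {1., 1., 4.}}` obtained from `chebAlgo` in the example further down.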

Once the $a_k$ and $b_k$ have been obtained, solving the original set of equations can be done through the Golub-Welsch algorithm: essentially, one solves a symmetric tridiagonal eigenproblem where the $a_k$ are the diagonal entries and the $\sqrt{b_k}$ are the off-diagonal entries (the characteristic polynomial of this symmetric tridiagonal matrix is $P_n(x)$). The $x_i$ are the eigenvalues of this matrix, and the $w_i$ are obtained by multiplying the squares of the first components of the normalized eigenvectors by $m_0$.

I have been wholly theoretical up to this point, and you and most other people would rather have code to play with. I thus offer up the following Mathematica implementation of the theory discussed earlier:

(* Chebyshev's algorithm *)

chebAlgo[mom_?VectorQ, prec_: MachinePrecision] := 
 Module[{n = Quotient[Length[mom], 2], si = mom, ak, bk, np, sp, s, v},
  np = Precision[mom]; If[np === Infinity, np = prec];
  ak[1] = mom[[2]]/First[mom]; bk[1] = First[mom];
  sp = PadRight[{First[mom]}, 2 n - 1];
  Do[
   sp[[k - 1]] = si[[k - 1]];
   Do[
    v = sp[[j]];
    sp[[j]] = s = si[[j]];
    si[[j]] = si[[j + 1]] - ak[k - 1] s - bk[k - 1] v;
    , {j, k, 2 n - k + 1}];
   ak[k] = si[[k + 1]]/si[[k]] - sp[[k]]/sp[[k - 1]];
   bk[k] = si[[k]]/sp[[k - 1]];
   , {k, 2, n}];
  N[{Table[ak[k], {k, n}], Table[bk[k], {k, n}]}, np]
  ]

(* Golub-Welsch algorithm *)

golubWelsch[d_?VectorQ, e_?VectorQ] := 
 Transpose[
  MapAt[(First[e] Map[First, #]^2) &, 
   Eigensystem[
    SparseArray[{Band[{1, 1}] -> d, Band[{1, 2}] -> Sqrt[Rest[e]], 
      Band[{2, 1}] -> Sqrt[Rest[e]]}, {Length[d], Length[d]}]], {2}]]

(I note that the implementation here of Chebyshev's algorithm was optimized to use two vectors instead of a two-dimensional array.)

Let's try an example. Let $m_j=j!$ and take the system given earlier ($n=3$):

{d, e} = chebAlgo[Range[0, 5]!]
{{1., 3., 5.}, {1., 1., 4.}}

xw = golubWelsch[d, e]
{{6.2899450829374794, 0.010389256501586145}, {2.2942803602790467, 0.27851773356923976}, {0.4157745567834814, 0.7110930099291743}}

We have here the equivalence xw[[i, 1]]$=x_i$ and xw[[i, 2]]$=w_i$; let's see if the original set of equations is satisfied:

Chop[Table[Sum[xw[[j, 2]] xw[[j, 1]]^i, {j, 3}] - i!, {i, 0, 5}]]
{0, 0, 0, 0, 0, 0}

and they are.

(This example corresponds to generating the three-point Gauss-Laguerre rule.)
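The Golub-Welsch step can be cross-checked outside Mathematica. The following sketch assumes NumPy is available and hard-codes the recursion coefficients `d`, `e` computed above for $m_j=j!$:

```python
import numpy as np
from math import factorial

# Recursion coefficients from Chebyshev's algorithm for m_j = j!
d = np.array([1.0, 3.0, 5.0])
e = np.array([1.0, 1.0, 4.0])

# Symmetric tridiagonal Jacobi matrix: a_k on the diagonal, sqrt(b_k) off it.
J = np.diag(d) + np.diag(np.sqrt(e[1:]), 1) + np.diag(np.sqrt(e[1:]), -1)
nodes, vecs = np.linalg.eigh(J)           # eigenvalues in ascending order
weights = e[0] * vecs[0, :] ** 2          # m_0 times squared first components

# All six moment equations of the original system are satisfied.
for j in range(6):
    assert abs(np.sum(weights * nodes ** j) - factorial(j)) < 1e-8
print(nodes)    # ~ [0.4158, 2.2943, 6.2899], the Gauss-Laguerre nodes
```

The nodes and weights agree with the `golubWelsch` output above, up to ordering.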


As a final aside, the solution given by Aryabhata is an acceptable way of generating Gaussian quadrature rules from moments, though it will require $O(n^3)$ effort in solving the linear equations, as opposed to the $O(n^2)$ effort required for the combination of Chebyshev and Golub-Welsch. Hildebrand discusses this approach in his book *Introduction to Numerical Analysis*.

Here is Aryabhata's proposal in Mathematica code (after having done the elimination of the appropriate variables in the background):

gaussPolyGen[mom_?VectorQ, t_] := 
 Module[{n = Quotient[Length[mom], 2]}, 
  Expand[Fold[(#1 t + #2) &, 1, Reverse[LinearSolve[
   Apply[HankelMatrix, Partition[mom, n, n-1]], -Take[mom, -n]]]]]]

and compare, using the earlier example:

gaussPolyGen[Range[0, 5]!, t]
-6 + 18*t - 9*t^2 + t^3

% == -6 LaguerreL[3, t] // Simplify
True

Having found the roots of the polynomial generated by gaussPolyGen[], one merely solves an appropriate Vandermonde linear system to get the weights.

nodes = t /. NSolve[gaussPolyGen[Range[0, 5]!, t], t, 20]
{0.4157745567834791, 2.294280360279042, 6.2899450829374794}

weights = LinearAlgebra`VandermondeSolve[nodes, Range[0, 2]!]
{0.711093009929173, 0.27851773356924087, 0.010389256501586128}

The results here and from the previous method are comparable.

Solution 3:

Look at the first three equations as a system of linear equations determining $a$, $b$ and $c$ in terms of $x$, $y$, $z$, $m_0$, $m_1$, $m_2$; the matrix of coefficients is a Vandermonde matrix for the triple $x$, $y$, $z$, so it is not hard to use Cramer's rule to solve it when no two of them are equal.

Now replace the values you got for $a$, $b$ and $c$ in the second group of three equations. Magically, after simplifying and canceling things common to numerators and denominators, it is a system of polynomial equations in $x$, $y$ and $z$. Moreover, the three left hand sides are symmetric polynomials in $x$, $y$ and $z$, so after expressing those left hand sides in terms of the elementary symmetric polynomials on $x$, $y$ and $z$, you end up with a not too difficult polynomial system.

Solve it.

Now you have the values of $x+y+z$, $xy+xz+yz$ and $xyz$. Solving a cubic equation now gives you $x$, $y$ and $z$ and, from the first part, also $a$, $b$ and $c$.

NB: Following these steps is easily doable in Mathematica and you get an answer. The answer is simply too huge, and I doubt it can be massaged into something sensible---and I hence doubt there is a sensible way to solve the system of equations by hand.

Mma, on the other hand, did not solve the system as originally given in the time I allotted it.

NB2: For comparison, the system $$\left\{\begin{aligned}a + b = m_0\\ax+by=m_1\\ax^2+by^2=m_2\\ax^3+by^3=m_3\end{aligned}\right.$$ has as solution, according to Mma (and this one it does compute it by itself): $$\begin{aligned} &a= \frac{-2 m_1^3+3 m_0 m_2 m_1+m_0 \left(\sqrt{4 m_3 m_1^3-3 m_2^2 m_1^2-6 m_0 m_2 m_3 m_1+4 m_0 m_2^3+m_0^2 m_3^2}-m_0 m_3\right)}{2 \sqrt{4 m_3 m_1^3-3 m_2^2 m_1^2-6 m_0 m_2 m_3 m_1+m_0 \left(4 m_2^3+m_0 m_3^2\right)}} \\ &b = \frac{2 m_1^3-3 m_0 m_2 m_1+m_0 \left(m_0 m_3+\sqrt{4 m_3 m_1^3-3 m_2^2 m_1^2-6 m_0 m_2 m_3 m_1+4 m_0 m_2^3+m_0^2 m_3^2}\right)}{2 \sqrt{4 m_3 m_1^3-3 m_2^2 m_1^2-6 m_0 m_2 m_3 m_1+m_0 \left(4 m_2^3+m_0 m_3^2\right)}} \\ &x = \frac{-m_1 m_2+m_0 m_3+\sqrt{\left(m_1 m_2-m_0 m_3\right){}^2+4 \left(m_1^2-m_0 m_2\right) \left(m_1 m_3-m_2^2\right)}}{2 \left(m_0 m_2-m_1^2\right)} \\ &y = \frac{m_1 m_2-m_0 m_3+\sqrt{4 m_3 m_1^3-3 m_2^2 m_1^2-6 m_0 m_2 m_3 m_1+4 m_0 m_2^3+m_0^2 m_3^2}}{2 m_1^2-2 m_0 m_2} \end{aligned} $$ and $$\begin{aligned} &a=\frac{2 m_1^3-3 m_0 m_2 m_1+m_0 \left(m_0 m_3+\sqrt{4 m_3 m_1^3-3 m_2^2 m_1^2-6 m_0 m_2 m_3 m_1+4 m_0 m_2^3+m_0^2 m_3^2}\right)}{2 \sqrt{4 m_3 m_1^3-3 m_2^2 m_1^2-6 m_0 m_2 m_3 m_1+m_0 \left(4 m_2^3+m_0 m_3^2\right)}} \\ &b=\frac{-2 m_1^3+3 m_0 m_2 m_1+m_0 \left(\sqrt{4 m_3 m_1^3-3 m_2^2 m_1^2-6 m_0 m_2 m_3 m_1+4 m_0 m_2^3+m_0^2 m_3^2}-m_0 m_3\right)}{2 \sqrt{4 m_3 m_1^3-3 m_2^2 m_1^2-6 m_0 m_2 m_3 m_1+m_0 \left(4 m_2^3+m_0 m_3^2\right)}} \\ &x=-\frac{m_1 m_2-m_0 m_3+\sqrt{\left(m_1 m_2-m_0 m_3\right){}^2+4 \left(m_1^2-m_0 m_2\right) \left(m_1 m_3-m_2^2\right)}}{2 \left(m_0 m_2-m_1^2\right)} \\ &y=-\frac{-m_1 m_2+m_0 m_3+\sqrt{4 m_3 m_1^3-3 m_2^2 m_1^2-6 m_0 m_2 m_3 m_1+4 m_0 m_2^3+m_0^2 m_3^2}}{2 \left(m_1^2-m_0 m_2\right)} \end{aligned} $$

Solution 4:

@Aryabhata's solution is insightful, but I'll present the (tedious, but thoughtlessly mechanical) methodology I tend to use to approach such problems: resultants.

The resultant of two polynomials is a polynomial with one variable eliminated. Therefore, you can proceed to solve a polynomial system in roughly the same way you would with a linear system (in theory).

First, convert your equations to polynomials, $p_i$, so that the roots of these polynomials are the solutions to your equations.

$$\begin{eqnarray} p_0 &=& a+b+c-m_0\\ p_1 &=& a x + b y + c z - m_1\\ p_2 &=& a x^2 + b y^2 + c z^2 - m_2\\ p_3 &=& a x^3 + b y^3 + c z^3 - m_3\\ p_4 &=& a x^4 + b y^4 + c z^4 - m_4\\ p_5 &=& a x^5 + b y^5 + c z^5 - m_5 \end{eqnarray}$$

Pick your favorite one (say, $p_0$), and use it to compute a bunch of resultants (using Mathematica's ${\tt Resultant[]}$ function) that eliminate one variable (say, $a$):

$$\begin{eqnarray} r_0 &=& {\tt Resultant[} p_0, p_1, a {\tt ]} = -m_1 - b x - c x + m_0 x + b y + c z\\ r_1 &=& {\tt Resultant[} p_0, p_2, a {\tt ]} = -m_2 - b x^2 - c x^2 + m_0 x^2 + b y^2 + c z^2 \\ r_2 &=& {\tt Resultant[} p_0, p_3, a {\tt ]} = -m_3 - b x^3 - c x^3 + m_0 x^3 + b y^3 + c z^3\\ r_3 &=& {\tt Resultant[} p_0, p_4, a {\tt ]} = -m_4 - b x^4 - c x^4 + m_0 x^4 + b y^4 + c z^4\\ r_4 &=& {\tt Resultant[} p_0, p_5, a {\tt ]} = -m_5 - b x^5 - c x^5 + m_0 x^5 + b y^5 + c z^5\\ \end{eqnarray}$$

(Of course, the above amounts to simply solving $p_0 = 0$ for $a$ and substituting into the other $p_i$, since the system is linear in $a$. The same will be true when eliminating $b$ and $c$, so using resultants here is a little bit of overkill, but the process is mechanical enough to be about as easy to (not-)think about as, say, Gaussian elimination.)
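The first elimination step can be reproduced outside Mathematica; here is a sketch using SymPy's `resultant()` (assuming SymPy is available), which computes the same Sylvester determinant as `Resultant[]`:

```python
from sympy import symbols, resultant, expand

a, b, c, x, y, z, m0, m1 = symbols('a b c x y z m0 m1')

p0 = a + b + c - m0
p1 = a*x + b*y + c*z - m1

# Eliminate a between p0 and p1, as in r_0 above.
r0 = resultant(p0, p1, a)

# r0 == -m1 - b*x - c*x + m0*x + b*y + c*z, matching the text.
assert not r0.has(a)
assert expand(r0 - (-m1 - b*x - c*x + m0*x + m0*0 + b*y + c*z)) == 0
```

Chaining such calls, eliminating one variable at a time, mirrors the `Resultant[]` computations in the rest of this answer.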

Now, we use, say, $r_0$ to compute resultants that eliminate $b$, and so forth. For your special system, the polynomials at each step happen to have common factors that give rise to possible solutions; you might need to consider those separately, but they tend to represent "special case" scenarios. In the following, I base the next resultant step on the uncommon factors (indicated with upper-case variables).

$$\begin{eqnarray} s_i &=& {\tt Resultant[} r_0, r_i, b {\tt ]} = S_i \cdot (x-y), \hspace{0.2in}i = 1, 2, 3, 4\\ t_i &=& {\tt Resultant[} S_0, S_i, c {\tt ]} = T_i \cdot (y-z)(z-x), \hspace{0.2in}i = 1, 2, 3 \\ u_i &=& {\tt Resultant[} T_0, T_i, x {\tt ]} = U_i \cdot (m_2-m_1 y - m_1 z + m_0 y z ), \hspace{0.2in}i = 1, 2 \\ v_1 &=& {\tt Resultant[} U_0, U_1, y {\tt ]} \end{eqnarray}$$

The final polynomial, $v_1$, has just one variable, $z$.

$$\begin{eqnarray}v_1 &=& \left(m_2^2 - m_1 m_3 - m_1 m_2 z + m_0 m_3 z + m_1^2 z^2 - m_0 m_2 z^2\right)^4 \\ && \left(-m_3^3 + 2 m_2 m_3 m_4 - m_1 m_4^2 - m_2^2 m_5 + m_1 m_3 m_5 \right.\\ &&\left. + z\left(m_2 m_3^2 - m_2^2 m_4 - m_1 m_3 m_4 + m_0 m_4^2 + m_1 m_2 m_5 - m_0 m_3 m_5 \right) \right. \\ &&\left. +z^2\left(- m_2^2 m_3 + m_1 m_3^2 + m_1 m_2 m_4 - m_0 m_3 m_4 - m_1^2 m_5 + m_0 m_2 m_5\right) \right. \\ &&\left. + z^3\left( m_2^3 - 2 m_1 m_2 m_3 + m_0 m_3^2 + m_1^2 m_4 - m_0 m_2 m_4\right)\right)^2\end{eqnarray}$$

The first factor is another "special case", so that $z$, in general, should be a root of the cubic polynomial in the second factor. To get $y$, and $x$, and the rest, you could back-substitute into polynomials from earlier in the resultant chain (just as you would with a linear system), or you could just compute resultant chains that eliminate every variable except $y$, then every variable except $x$, etc.

It turns out that (as one might expect) your particular system is such that the "final" polynomials for $x$, $y$, and $z$ are equivalent; that is, $x$, $y$, and $z$ are roots of the cubic in the second factor of $v_1$ above. (This is, in fact, @Aryabhata's cubic.)

I'll note that resultants in general are computationally expensive (especially in symbolic form). Very often, the process can bog down completely in Mathematica, even for fairly modest systems, and/or the final polynomial can be of enormous degree. Sometimes, a bit of finesse in the order of elimination helps. My first attempt at finding (a chain of resultants that eliminated everything but) $a$ got boggy, but eliminating in the order $b$, $c$, $z$, $y$, $x$ gave me a cubic equation (and some "special case" factors) ... which is huge, so I won't present it here. A more-practical approach here might be to find $x$, $y$, and $z$ from the cubic polynomial, substitute these values back into three of the $p_i$, and solve the linear system for $a$, $b$, and $c$.