Why would I want to multiply two polynomials?

I'm hoping that this isn't such a basic question that it gets completely laughed off the site, but why would I want to multiply two polynomials together?

I flipped through some algebra books and have googled around a bit, and whenever they introduce polynomial multiplication they just say 'Suppose you have two polynomials you wish to multiply', or sometimes it's just as simple as 'find the product'. I even looked for some example story problems, hoping that might let me in on the secret, but no dice.

I understand that a polynomial is basically a set of numbers (or, if you'd rather, a mapping of one set of numbers to another), or, in another way of thinking about it, two polynomials are functions, and the product of the two functions is a new function that lets you apply the function once, provided you were planning on applying the original functions to the number and then multiplying the result together.

Elementary multiplication can be described as 'add $X$ to itself $Y$ times', where $Y$ is a nice integer number of times. When $Y$ is not a whole number, it doesn't seem to make as much sense.

Any ideas?

Solution 1:

You have two questions, the explicit one about why you would want to multiply polynomials, and an implicit one in your final paragraph about what multiplication by a non-integer might mean or why we would care to multiply by a non-integer in the first place.

To address the last one first: once you have multiplication by integers, multiplication by fractions will very quickly rear its head. What does multiplying by "one and a half" mean, if multiplying by 2 means "add to itself", etc.? Well, imagine you have a chocolate bar, one of those that are made up of smaller squares. You can imagine breaking the bar in half, and then figuring out what three times that half will be; that will be multiplying by "three halves" (a.k.a. one and a half). You are really multiplying by an integer, after suitably modifying $X$.

In general, if you need to multiply $X$ by a fraction, $\frac{p}{q}$, imagine dividing $X$ into $q$ equal parts, and then multiplying such a $q$th part of $X$ by $p$ in the sense you have above. That is the same as "multiplying by $\frac{p}{q}$". So multiplying by a fraction is like "abbreviated addition": it means "break up into $q$ equal parts, and then add a $q$th part repeatedly $p$ times."

So at least, multiplying by fractions makes just as much "natural sense" as multiplying by integers does.

Why bother with numbers other than fractions then? Well, in one sense you don't have to: you can try to stick to fractions and nothing more complicated than that, and you can go very far. But as the Greeks discovered a long time ago, you also run into very big walls very quickly. For instance, if you draw a square which is $1$ foot long on each side, and you try to measure how long its diagonal is (say, for construction purposes), then it turns out that the diagonal is not a number that can be expressed as a fraction; it is an irrational number. So very soon you end up having to consider numbers that are not fractions, and if they are lying around sooner or later you are going to have to multiply them to compute stuff.

So you end up having to find some way of multiplying irrationals as well, even though they no longer seem to fit with that same "natural" meaning they had back when we started with integers. One solution is that every irrational can be approximated by a suitable sequence of fractions (think about computing the decimals one at a time; every time you stop, what you have so far is a rational; for example, $\sqrt{2} = 1.4142\ldots$, and you get that $1.4 = \frac{14}{10}$, $1.41=\frac{141}{100}$, $1.414=\frac{1414}{1000}$, etc.). We know what it means to multiply $X$ by each of those fractions in a sensible way, so we say that multiplying $X$ by $\sqrt{2}$ is the number you get by doing the successive multiplications, just like $\sqrt{2}$ is the number you get by doing the successive fractional approximations.
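To make the successive-approximation idea concrete, here is a minimal Python sketch. The value of `X` is a made-up example, and the list of fractions is just the decimal truncations of $\sqrt{2}$ mentioned above; the products settle toward a single number as the fractions get better.

```python
from fractions import Fraction

X = Fraction(3)  # hypothetical quantity we want to multiply by sqrt(2)

# Decimal truncations of sqrt(2), written as fractions as in the text
approximations = [
    Fraction(14, 10),
    Fraction(141, 100),
    Fraction(1414, 1000),
    Fraction(14142, 10000),
]

# Each product is ordinary "multiply by p/q": split into q parts, take p of them
products = [X * a for a in approximations]

for a, prod in zip(approximations, products):
    print(f"{X} * {a} = {prod} (about {float(prod):.4f})")
```

Each printed product is a perfectly ordinary fraction; "multiplying by $\sqrt{2}$" is the value these products close in on.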

This no longer makes sense as "abbreviated addition", but it turns out that it is very, very necessary and very, very useful, in order to make sense of things and be able to compute things that we need to be able to compute (areas, productivity, interest, etc).

As for multiplying polynomials...

One answer: multiplying functions lets you construct more complicated functions out of simpler ones. Or more to the point, it lets you express more complicated functions in terms of simpler ones. This is particularly important if you want to perform complex computations, as then you may be able to "get away" with performing much simpler computations and then multiplying the results, rather than do the really complicated expression instead.

For instance, say you have a single polynomial like $p(x) = x^2-7x+10$. If you realize that $p(x)$ is the result of multiplying the simpler polynomial $x-2$ by the (also simpler) polynomial $x-5$, then whenever you need to evaluate $p(x)$ at a number, say $17$, instead of having to square $17$, then multiply $17$ by $7$, subtract that from the square you computed, and then add $10$ (two multiplications and two additions/subtractions), you can instead take $17$, subtract $2$ to get $15$; then take $17$, subtract $5$ to get $12$; and then multiply $15$ by $12$ (one multiplication and two additions/subtractions), because $x^2-7x+10 = (x-2)(x-5)$, so $(17)^2 - 7(17) + 10 = (17-2)(17-5)$. Much simpler to do.
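As a quick check of the example with $17$, here is a small Python sketch; the function names `p_expanded` and `p_factored` are made up for illustration.

```python
def p_expanded(x):
    # x^2 - 7x + 10, evaluated term by term
    return x * x - 7 * x + 10

def p_factored(x):
    # (x - 2)(x - 5): two subtractions and a single multiplication
    return (x - 2) * (x - 5)

print(p_expanded(17), p_factored(17))
```

Both forms agree at every input, since they are the same polynomial written two ways.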

Another: it is usually very hard to find a value $x$ for which the result of doing some complex series of operations will be a desired quantity, $d$. For example, you want to know how much money to put in the bank so that, at the end of five months at a particular interest rate, you will have exactly the amount of money you need to buy that new wide-screen TV. This involves solving equations. Many natural equations can be written down in the form $p(x)=c$ where $p(x)$ is a polynomial expression in the unknown quantity $x$, and $c$ is the desired value. Solving such equations can be difficult in general. If you don't know the quadratic formula, then figuring out the values of $x$ for which the polynomial above $x^2-7x+10$ is equal to zero can be pretty difficult. Or think about something like $x^4 + x^3 - 120x^2 - 121x = 121$.

On the other hand, figuring out when a product is equal to $0$ is very easy, because the only way for a product to be zero is if one of the two factors is equal to zero. So if you take the equation above and you write it as $x^4+x^3-120x^2-121x-121 =0$, then you are trying to find when a certain polynomial is equal to $0$. If you can write $q(x)=x^4+x^3-120x^2-121x-121$ as a product, $q(x) = p(x)r(x)$, then you have that $q(x)=0$ if and only if either $p(x)=0$ or $r(x)=0$. With some luck, $p$ and $r$ will be "easier" than $q$, so you can solve them. (In the above case, $q(x) = (x^2-121)(x^2+x+1)=(x-11)(x+11)(x^2+x+1)$, and $x^2+x+1$ is never zero for real $x$, so the only way you can get $q(x)=0$ is if $x=11$ or $x=-11$.)
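The factorization of $q(x)$ can be checked by multiplying the factors back together. The sketch below stores a polynomial as a list of coefficients, lowest degree first; that representation and the helper names `poly_mul` and `evaluate` are choices made for this illustration.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    result = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            # x^i times x^j contributes to the x^(i+j) term
            result[i + j] += ai * bj
    return result

def evaluate(coeffs, x):
    """Evaluate a coefficient list at the number x."""
    return sum(c * x**k for k, c in enumerate(coeffs))

p = [-121, 0, 1]   # x^2 - 121
r = [1, 1, 1]      # x^2 + x + 1
q = poly_mul(p, r)

print(q)
print(evaluate(q, 11), evaluate(q, -11))
```

The product list matches the coefficients of $x^4+x^3-120x^2-121x-121$, and the polynomial vanishes at $11$ and $-11$.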

In fact, this is one way to figure out the quadratic formula (did you ever wonder where it came from?). Why are the solutions to $ax^2 + bx+c = 0$ given by $x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}$? You can factor out $a$ and get $a(x^2 + Bx + C) = 0$, with $B=\frac{b}{a}$ and $C=\frac{c}{a}$. For this to be zero, you need $x^2+Bx+C=0$. Now, imagine you could write it as a product, $$x^2 + Bx+C = (x-r_1)(x-r_2).$$ What would $r_1$ and $r_2$ be? If you know how to multiply polynomials, you get that $(x-r_1)(x-r_2)=x^2 - (r_1+r_2)x + r_1r_2$, so you need $r_1r_2=C$ and $r_1+r_2 = -B$. Squaring the latter you get $(r_1+r_2)^2 = B^2$; but $(r_1+r_2)^2 = r_1^2 +2r_1r_2 + r_2^2$. On the other hand, $$(r_1-r_2)^2 = r_1^2 - 2r_1r_2 + r_2^2 = (r_1^2+2r_1r_2+r_2^2) - 4r_1r_2 = B^2 - 4C.$$ So $(r_1-r_2)^2 =B^2-4C$. Taking square roots, you have that $r_1-r_2 = \pm \sqrt{B^2-4C}$. And you already know that $r_1+r_2 = -B$. Adding them you get $$2r_1 = -B\pm\sqrt{B^2-4C}\qquad\text{or}\qquad r_1 = \frac{-B\pm\sqrt{B^2-4C}}{2}$$ and taking the difference between $r_1+r_2 = -B$ and $r_1-r_2 = \pm\sqrt{B^2-4C}$ you get $$2r_2 = -B\mp\sqrt{B^2-4C}\qquad\text{or}\qquad r_2 = \frac{-B\mp\sqrt{B^2-4C}}{2}.$$ So you get that $r_1 = \frac{-B+\sqrt{B^2-4C}}{2}$ and $r_2 = \frac{-B-\sqrt{B^2-4C}}{2}$, and plugging in $B=\frac{b}{a}$ and $C=\frac{c}{a}$ gives the usual quadratic formula. No way to find it without knowing how to multiply polynomials!
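The derivation can be followed numerically. This is only a sketch for the case of real roots; the function name `roots_via_factoring` is made up, and the test polynomial is the $x^2-7x+10$ used earlier.

```python
import math

def roots_via_factoring(a, b, c):
    """Follow the derivation: normalize to x^2 + Bx + C, then use
    r1 + r2 = -B and (r1 - r2)^2 = B^2 - 4C to recover r1 and r2."""
    B, C = b / a, c / a
    diff_squared = B * B - 4 * C
    if diff_squared < 0:
        raise ValueError("no real roots; this sketch handles real roots only")
    diff = math.sqrt(diff_squared)   # r1 - r2, taking the + sign
    r1 = (-B + diff) / 2
    r2 = (-B - diff) / 2
    return r1, r2

r1, r2 = roots_via_factoring(1, -7, 10)
print(r1, r2)
```

For $x^2-7x+10$ the roots come out as $5$ and $2$, which matches the factorization $(x-2)(x-5)$, and their sum and product are $-B$ and $C$ as the derivation requires.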

When you get to Calculus (added: I'm assuming you will "get to Calculus" because you tagged the question as being (algebra-precalculus), so presumably you are taking a course labeled as 'precalculus'; but this may not be the case. If you are not going to "get to Calculus", then this paragraph will not tell you anything useful), you will find that there is a particular operation (differentiation, taking derivatives) which is very useful and very important. It tells you how fast a certain quantity is changing, and it can be used to find all sorts of useful things, like what production level will maximize profit in a factory, how big a dose of medicine and how often you should give to a patient based on how fast they metabolize it, and many other applications. Computing derivatives from first principles with an arbitrary function is pretty labor-intensive; but recognizing a function as being "made up" (through sums, products, quotients, and compositions) of other, simpler, functions makes it a very straightforward and easy job.

But in order to be able to recognize that a function is a product of two other functions, you first need to know how to multiply two functions together. Polynomials are one case.

Another situation occurs when the polynomials are measuring different things, and their product is somehow meaningful; maybe one polynomial gives you the length and the other polynomial gives you the width of a certain figure? Their product will be the area, which may be something you need to compute.

And more generally, you can think of polynomials as "abbreviations" for more complicated operations that you are doing with numbers, just like you are thinking of multiplication as "abbreviated addition". In that case, multiplying the two polynomials represents another complicated operation that you need to express in terms of the two simpler ones (addition and multiplication).

Solution 2:

Suppose you want to make error-correcting codes (to do things like communicating with spacecraft, manufacturing CDs, and designing computer memory systems). Encoding and decoding BCH codes (as well as many other error-correcting codes) involves multiplying polynomials.
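As a rough illustration of what "multiplying polynomials" means in this setting, here is a sketch of non-systematic encoding for the small $(7,4)$ cyclic Hamming code, which is a single-error-correcting binary BCH code with generator polynomial $x^3+x+1$. Coefficients are taken mod 2, and a polynomial is stored as an integer whose bit $k$ is the coefficient of $x^k$. The function name `gf2_poly_mul` and the sample message are made up for this example; real encoders are usually systematic and considerably more elaborate.

```python
def gf2_poly_mul(a, b):
    """Multiply two polynomials with coefficients mod 2.
    Each polynomial is an int: bit k holds the coefficient of x^k."""
    result = 0
    shift = 0
    while b:
        if b & 1:
            # adding mod 2 is XOR; shifting by k multiplies by x^k
            result ^= a << shift
        b >>= 1
        shift += 1
    return result

g = 0b1011         # generator polynomial x^3 + x + 1
message = 0b0101   # hypothetical 4-bit message, read as x^2 + 1
codeword = gf2_poly_mul(message, g)

print(format(codeword, "07b"))
```

Every codeword produced this way is a multiple of the generator polynomial, which is the property a decoder can test for.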

Of course, nowadays the best error-correcting codes are polar codes, turbo codes, or LDPC codes, which aren't based on polynomials. But for nearly 40 years many digital communications systems were based on multiplying polynomials.

Solution 3:

While some of the other answers look good, there are two particular related uses for polynomial multiplication which I didn't see while skimming.

One is related to probabilities. We can use a polynomial to represent the possible outcomes of a random process (so long as there are finitely many), where there is a variable for each possible outcome, and the coefficient of that variable is the probability of it occurring. (We'll just go with linear polynomials for now.)

So for example, a toss of a fair coin might be represented as $\frac 12 H + \frac 12 T$. An unfair coin might be $\frac 23 H + \frac 13 T$. Someone's plays in the game of rock/paper/scissors might be summarised as $\frac 35 \text{ rock} + \frac 15 \text{ paper} + \frac 15 \text{ scissors}$. (Dude likes rock.)

Now, what would it mean to multiply two such polynomials? Well, supposing that the random events are independent of each other (that is, the outcome of one doesn't affect the probabilities involved in the other), the product of two such polynomials tells us the likelihood of various pairs of outcomes. If we flip that unbalanced coin, and have that guy throw rock/paper/scissors, the combined possibilities are

$$\left(\frac 23 H + \frac 13 T\right) \left(\frac 35 \text{ rock} + \frac 15 \text{ paper} + \frac 15 \text{ scissors}\right) = \frac 25 H \text{ rock} + \frac 15 T \text{ rock} + \frac 2{15} H \text{ paper} + \frac 1{15} T \text{ paper} + \frac 2{15} H \text{ scissors} + \frac 1{15} T \text{ scissors}.$$

So we can read off that, for instance, the combined probability of getting a head on the coin and having the guy throw scissors is the coefficient of $H$ scissors, which is $\frac 2{15}$.
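The same product can be computed mechanically. In this sketch each "polynomial" is a dictionary from outcome to probability, and multiplying pairs up every term of one with every term of the other; the names `coin`, `rps` and `multiply` are made up for illustration.

```python
from fractions import Fraction

coin = {"H": Fraction(2, 3), "T": Fraction(1, 3)}
rps = {
    "rock": Fraction(3, 5),
    "paper": Fraction(1, 5),
    "scissors": Fraction(1, 5),
}

def multiply(p, q):
    """Distribute every term of p over every term of q.
    Valid as a joint distribution only if the two processes are independent."""
    return {(a, b): pa * qb for a, pa in p.items() for b, qb in q.items()}

joint = multiply(coin, rps)
for outcome, prob in joint.items():
    print(outcome, prob)
```

The entry for `("H", "scissors")` is the coefficient of $H$ scissors in the product, and the six probabilities add up to $1$.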

So that's a decent way to organise information about probabilities in a way that lets us combine multiple independent random processes. Raising a polynomial to the power of $n$ in this setting gives us the probabilities when the process is repeated $n$ times, assuming again that the outcome of the previous trials doesn't affect the next one.

For instance, let's flip the unbalanced coin three times in a row:

$$\left(\frac 23 H + \frac 13 T\right)^3 = \frac 8{27} H^3 + \frac 49 H^2 T + \frac 29 H T^2 + \frac 1{27} T^3$$

This tells us that the probability that we get two heads and one tail is $\frac 49$, and so on. (If we didn't allow $H$ and $T$ to commute with each other, we would get the probability of each possible sequence.)
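Here is a sketch of the three-flip computation in which $H$ and $T$ commute: a term $H^hT^t$ is stored under the key `(h, t)`, so multiplying terms adds the exponents and like terms are collected. The representation and the name `multiply` are assumptions made for this example.

```python
from collections import defaultdict
from fractions import Fraction

# One flip of the unbalanced coin: (2/3) H + (1/3) T
# key = (exponent of H, exponent of T), value = coefficient
coin = {(1, 0): Fraction(2, 3), (0, 1): Fraction(1, 3)}

def multiply(p, q):
    """Multiply two polynomials in commuting variables H and T."""
    out = defaultdict(Fraction)
    for (h1, t1), a in p.items():
        for (h2, t2), b in q.items():
            # exponents add; terms with equal exponents are merged
            out[(h1 + h2, t1 + t2)] += a * b
    return dict(out)

three_flips = multiply(multiply(coin, coin), coin)
for (h, t), prob in sorted(three_flips.items(), reverse=True):
    print(f"H^{h} T^{t}: {prob}")
```

Keying on the full sequence of flips instead of on the exponent pair would give the non-commuting version, with one term per possible sequence.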

It's also worth noting that it's quite effective to use polynomials in keeping track of proportions when forming mixtures of things, which are then mixed together (say, in chemistry), using substitution of one variable representing a mixture with the polynomial representing its contents. Multiplication doesn't immediately have such a useful interpretation there though.

Another thing we can do is to use the coefficients of polynomials to count the number of things of a given type. If we have a (finite) bunch of things with integer "sizes" or "weights", and we want to record the number of things with a given weight, we can use the coefficient of $x^n$ in a polynomial to record the number of objects with weight $n$. The product of two such polynomials will then count the number of ways to form a pair of objects with total weight $n$, for each $n$.

For example, suppose we want to know the number of ways to put change for $n$ cents into a vending machine using $5$ coins, for each possible $n$. (And we'll assume that the coins have values $\{1, 5, 10, 25, 100, 200\}$). We can use the polynomial $c(x) = x^1 + x^5 + x^{10} + x^{25} + x^{100} + x^{200}$ to represent our set of coins. Sequences of $5$ coins are then represented by $(c(x))^5$. This is a polynomial with a fairly large number of terms (representing the fact that there are lots of ways to choose a sequence of $5$ coins in order, specifically, $6^5 = 7776$, and more so the fact that there are many different possible sums that those coins could have).

This polynomial is:

$x^5 + 5 x^9 + 10 x^{13} + 5 x^{14} + 10 x^{17} + 20 x^{18} + 5 x^{21} + 30 x^{22} + 10 x^{23} + x^{25} + 20 x^{26} + 30 x^{27} + 5 x^{29} + 5 x^{30} + 30 x^{31} + 10 x^{32} + 20 x^{33} + 10 x^{35} + 20 x^{36} + 30 x^{37} + 20 x^{38} + 10 x^{40} + 25 x^{41} + 60 x^{42} + 10 x^{45} + 60 x^{46} + 30 x^{47} + 21 x^{50} + 60 x^{51} + 10 x^{53} + 30 x^{55} + 20 x^{56} + 30 x^{57} + 20 x^{60} + 30 x^{61} + 30 x^{62} + 15 x^{65} + 60 x^{66} + 30 x^{70} + 30 x^{71} + 30 x^{75} + 10 x^{77} + 10 x^{80} + 20 x^{81} + 10 x^{85} + 20 x^{86} + 20 x^{90} + 10 x^{95} + 5 x^{101} + 5 x^{104} + 5 x^{105} + 20 x^{108} + 5 x^{110} + 30 x^{112} + 20 x^{113} + 20 x^{116} + 60 x^{117} + 5 x^{120} + 60 x^{121} + 30 x^{122} + 21 x^{125} + 60 x^{126} + 20 x^{128} + 30 x^{130} + 20 x^{131} + 60 x^{132} + 20 x^{135} + 60 x^{136} + 60 x^{137} + 25 x^{140} + 120 x^{141} + 60 x^{145} + 60 x^{146} + 60 x^{150} + 30 x^{152} + 20 x^{155} + 60 x^{156} + 30 x^{160} + 60 x^{161} + 60 x^{165} + 30 x^{170} + 20 x^{176} + 20 x^{180} + 20 x^{185} + 5 x^{200} + 10 x^{203} + 5 x^{204} + 30 x^{207} + 20 x^{208} + 30 x^{211} + 60 x^{212} + 20 x^{213} + 10 x^{215} + 80 x^{216} + 60 x^{217} + 35 x^{220} + 90 x^{221} + 30 x^{222} + 50 x^{225} + 60 x^{226} + 30 x^{227} + 20 x^{228} + 40 x^{230} + 80 x^{231} + 60 x^{232} + 50 x^{235} + 120 x^{236} + 60 x^{237} + 85 x^{240} + 120 x^{241} + 90 x^{245} + 60 x^{246} + 60 x^{250} + 30 x^{251} + 30 x^{252} + 50 x^{255} + 60 x^{256} + 60 x^{260} + 60 x^{261} + 60 x^{265} + 30 x^{270} + 10 x^{275} + 20 x^{276} + 20 x^{280} + 20 x^{285} + 5 x^{300} + 10 x^{302} + 20 x^{303} + 20 x^{306} + 60 x^{307} + 10 x^{310} + 80 x^{311} + 60 x^{312} + 40 x^{315} + 120 x^{316} + 70 x^{320} + 60 x^{321} + 60 x^{325} + 20 x^{326} + 60 x^{327} + 40 x^{330} + 120 x^{331} + 80 x^{335} + 120 x^{336} + 120 x^{340} + 60 x^{345} + 10 x^{350} + 60 x^{351} + 60 x^{355} + 60 x^{360} + 20 x^{375} + 5 x^{401} + 30 x^{402} + 10 x^{403} + 5 x^{405} + 60 x^{406} + 30 x^{407} + 35 x^{410} + 90 x^{411} + 
30 x^{412} + 70 x^{415} + 60 x^{416} + 60 x^{420} + 30 x^{421} + 35 x^{425} + 60 x^{426} + 30 x^{427} + 70 x^{430} + 60 x^{431} + 90 x^{435} + 60 x^{436} + 60 x^{440} + 30 x^{445} + 30 x^{450} + 30 x^{451} + 30 x^{455} + 30 x^{460} + 10 x^{475} + x^{500} + 20 x^{501} + 30 x^{502} + 20 x^{505} + 60 x^{506} + 50 x^{510} + 60 x^{511} + 60 x^{515} + 30 x^{520} + 20 x^{525} + 60 x^{526} + 60 x^{530} + 60 x^{535} + 30 x^{550} + 5 x^{600} + 30 x^{601} + 10 x^{602} + 30 x^{605} + 20 x^{606} + 40 x^{610} + 20 x^{611} + 20 x^{615} + 10 x^{620} + 30 x^{625} + 20 x^{626} + 20 x^{630} + 20 x^{635} + 10 x^{650} + 10 x^{700} + 20 x^{701} + 20 x^{705} + 20 x^{710} + 20 x^{725} + 10 x^{800} + 5 x^{801} + 5 x^{805} + 5 x^{810} + 5 x^{825} + 5 x^{900} + x^{1000}$

I just used my computer to calculate it.

This tells us, for instance, that there are $120$ distinct ways to put coins in a vending machine to make $\$1.41$, since the coefficient of $x^{141}$ is $120$. There is, of course, only one way to make change for $10$ dollars (which is to put in all $2$ dollar coins), so the coefficient of $x^{1000}$ is $1$. Note also that if we were to evaluate this polynomial at $x = 1$, we would get the total number of possible sequences, which is $7776$. It's also maybe interesting to see which terms are missing from this polynomial (have a coefficient of $0$), telling us that there's no way to make change for that amount using $5$ coins.
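For anyone who wants to reproduce the calculation, here is one possible sketch (not necessarily how the polynomial above was computed). A polynomial is stored as a `Counter` mapping each exponent to its coefficient; the names `poly_mul` and `power` are made up for this example.

```python
from collections import Counter

coin_values = [1, 5, 10, 25, 100, 200]

# c(x) = x^1 + x^5 + x^10 + x^25 + x^100 + x^200, stored as {exponent: coefficient}
c = Counter({value: 1 for value in coin_values})

def poly_mul(a, b):
    """Multiply two polynomials stored as {exponent: coefficient}."""
    out = Counter()
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[ea + eb] += ca * cb
    return out

# Start from the constant polynomial 1 and multiply by c(x) five times
power = Counter({0: 1})
for _ in range(5):
    power = poly_mul(power, c)

print(power[141], power[1000], sum(power.values()))
```

Looking up an exponent that never occurs, such as `power[6]`, returns $0$, which corresponds to an amount that cannot be made with exactly $5$ coins.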

In fact, both of these approaches generalise to power series (which are like polynomials, but which have infinite sequences of terms), and the effectiveness of the approach really becomes more apparent in that more general setting. This technique is called generating series (or the somewhat unfortunate, but perhaps more common name "generating functions").

Anyway, I think that gives some impression of some of the variety of things that one can do with polynomials. (I'll just leave that pun there for the algebraic geometers. ;)

Solution 4:

This is really several questions. In increasing order of depth:

Why multiply polynomials? Because they are functions, and multiplication is a natural or desirable operation on functions.
Why multiply functions? Because the values of the functions are integers, or rational numbers, or real numbers, or objects of some other more complicated kind that can be multiplied (such as matrices or rotations). If multiplication is natural for these numbers or number-like objects, it is natural for functions whose values are such objects.
Why multiply integers, rational numbers, real numbers, etc? Here is where explanation becomes difficult. The natural operation is not multiplication but the more structured operation of "tensor multiplication" that retains information about the factors that are multiplied. One can represent an Area as a product of type Length x Length that remembers its two-dimensionality, instead of a one-dimensional numerical value of that product that has forgotten its origins. In the same way, for integer multiplication it is most direct to multiply 5 and 6 by drawing 5x6 as a rectangular 2-dimensional array of dots instead of the "numerical evaluation" of that array as a one-dimensional string of 30 dots. The latter is less natural in that it requires a method of enumerating the dots in the grid and there is no preferred order in which to count them. This is also reflected in the ability to naturally multiply 5 gadgets by 6 gizmos, without there being any given ordering of the gadgets or gizmos.

Polynomials, especially polynomials in several variables ($x +3x^2y + 2 z^{10} x$ and such), have more inner structure than numbers and thus can reflect -- in fact, they can be defined and derived from -- the tensor structure of multiplication. So it is not that polynomials exist and we might want to multiply them; it is that multiplication of numbers (or of finite sets) naturally involves more than just numerical information and polynomials are an enhancement of numbers that more directly embody that information so that the multiplication can be performed in a way that retains more of the inner structure.

A hint of this is that integers in base 10 are values of polynomials at $x = 10$ (and those polynomials can be thought of as a "liberation" or "upgrading" of the numbers). Multiplication of integers can sometimes be seen to replicate the patterns in the coefficients when the polynomials are multiplied for general $x$, e.g., compare powers of $102$ and $x^2 + 0x + 2$. Later there are things like generating functions and convolutions that directly exploit polynomials as carriers of information that is enriched compared to using numbers alone.
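The comparison between powers of $102$ and powers of $x^2+0x+2$ can be tried directly. This sketch uses coefficient lists with the lowest degree first; the helper names are made up for illustration.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    result = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[i + j] += ai * bj
    return result

def evaluate(coeffs, x):
    return sum(c * x**k for k, c in enumerate(coeffs))

p = [2, 0, 1]               # x^2 + 0x + 2, whose value at x = 10 is 102
square = poly_mul(p, p)     # coefficients of (x^2 + 2)^2
cube = poly_mul(square, p)  # coefficients of (x^2 + 2)^3

# Print highest degree first so the coefficients line up with decimal digits
print(square[::-1], 102**2)
print(cube[::-1], 102**3)
```

For the square, the coefficients read exactly like the digits of $102^2$. For the cube, one coefficient exceeds $9$, so the digits of $102^3$ only match after carrying, which is why the pattern shows up "sometimes" rather than always. Evaluating either polynomial at $x=10$ always gives the right integer.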

(This is glossing over some technical questions about commutativity. Also, areas should be sums of LxL products, and one should explain the role of sums as well as products. But these are details that do not affect the main point.)

In applications, multiplication represents interaction or correlation between different effects. Situations where there are several independent processes isolated from each other and contributing to some outcome lead to sums of functions of the different variables, such as $f(x) + g(y) + h(z)$, the terms in the sum accounting for the separate effects. Situations where different parts of the mechanism can interact with each other in producing the outcome, or where there is correlation between different effects (e.g., one intensifies or suppresses the other), almost always involve multiplication when expressed mathematically. For example, if you count the number of handshakes in a group of $n$ people the answer will be of order $n^2$, and if you count the number of possible 3-person interactions in this group it will be of order $n^3$. Nonlinearities reflect the ability to organize the people into pairs, triples etc, and this again reflects the fact that the natural multiplication of two finite sets $A$ and $B$ is the set of ordered pairs $(a,b)$ of elements, one from each set, and the same for triples and higher numbers of sets.
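The handshake count can be checked by brute force. This is a small sketch with a made-up group size; `count_groups` is a hypothetical helper that simply enumerates every unordered group of $k$ people.

```python
from itertools import combinations
from math import comb

def count_groups(n, k):
    """Count unordered groups of k people chosen from n by listing them all."""
    return sum(1 for _ in combinations(range(n), k))

n = 10  # hypothetical group size
pairs = count_groups(n, 2)     # handshakes
triples = count_groups(n, 3)   # possible 3-person interactions

print(pairs, n * (n - 1) // 2)
print(triples, comb(n, 3))
```

The exact counts are $\frac{n(n-1)}{2}$ and $\frac{n(n-1)(n-2)}{6}$, polynomials in $n$ of degree $2$ and $3$, which is what "of order $n^2$" and "of order $n^3$" refer to.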

A somewhat faddish term for related ideas is (de- or re-)categorification.

Solution 5:

When solving real-world problems with simple algebra, it is not uncommon to have polynomials scattered throughout the equation. Given that $\frac{1}{3x+4}=2x-1$, which isn't an extraordinarily complex equation, the first thing you might want to do is multiply both sides by $3x+4$, which means you will need to be able to deal with $(2x-1)(3x+4)$.
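To see where that multiplication leads, here is a short sketch that expands the product and then checks one solution numerically. The coefficient-list representation (lowest degree first) and the helper name `poly_mul` are choices made for this example.

```python
import math

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    result = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[i + j] += ai * bj
    return result

product = poly_mul([-1, 2], [4, 3])   # (2x - 1)(3x + 4)
print(product)                        # coefficients of 6x^2 + 5x - 4

# After clearing the denominator the equation reads 1 = 6x^2 + 5x - 4,
# that is, 6x^2 + 5x - 5 = 0; take one root from the quadratic formula.
x = (-5 + math.sqrt(5**2 - 4 * 6 * (-5))) / (2 * 6)
print(1 / (3 * x + 4), 2 * x - 1)     # both sides of the original equation
```

The two printed values on the last line agree up to rounding, confirming that the root of the expanded quadratic solves the original equation.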