Etymology of "chain rule" (calculus)
In calculus there is a formula known as the chain rule, used for differentiating composite functions. What is the origin of this expression?
Today chain rule normally refers to the operation in calculus. But it seems likely that the term got its meaning in calculus by an analogy with the chain rule in arithmetic. The OED says:
chain-rule n. a rule of arithmetic, by which is found the relation of equivalence between two numbers for which a chain of intervening equivalents is given, as in Arbitration of Exchanges.
Here's an example of its use from The Popular Educator of 1869:
If the equivalent of any amount of one quantity is given in terms of another, that in terms of a third, and so on, it is required to find the equivalent of a certain amount of the first quantity in terms of the last.
EXAMPLE 1.—40 lbs. Troy of standard gold are coined into 1869 sovereigns, and standard gold contains 11 parts in 12 fine gold. Calculate the value of the money which can be coined out of 1 oz. of fine gold.
A convenient way of arranging the operation in questions of this kind is called the Chain Rule. It is especially useful in all questions connected with Exchange, and is the method generally used by merchants.
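There follows a calculation which we would write today as:
$$1\ \text{oz fine gold} \times \frac{12\ \text{oz standard gold}}{11\ \text{oz fine gold}} \times \frac{1\ \text{lb Troy}}{12\ \text{oz standard gold}} \times \frac{1869\ \text{sovereigns}}{40\ \text{lb Troy}} \times \frac{240\ \text{pence}}{1\ \text{sovereign}}$$
where the intermediate units have been arranged so as to cancel, leaving us with:
$$\frac{12 \times 1869 \times 240}{11 \times 12 \times 40}\ \text{pence}$$
That is, with 11214/11 pence = £4 4s 11 5/11d.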
(Note: the Coinage Act 1816 had defined the value of one Troy pound of standard gold as £46 14s 6d, which is 11214 pence.)
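The chain of equivalents in this example can also be checked mechanically. The following is a minimal sketch, not a historical method: the function name `chain_rule` and the layout of `LINKS` are hypothetical, and it assumes 12 oz to the Troy pound and 240 pence to the sovereign (20 shillings of 12 pence each). Each link must start in the unit where the previous one ended, which is the "chain" in the name.

```python
from fractions import Fraction

def chain_rule(start_unit, end_unit, links):
    """Convert 1 start_unit into end_unit by following the links in order.

    Each link is (qty_from, unit_from, qty_to, unit_to), read as
    "qty_from of unit_from is equivalent to qty_to of unit_to".
    """
    value = Fraction(1)
    unit = start_unit
    for qty_from, unit_from, qty_to, unit_to in links:
        if unit_from != unit:
            raise ValueError(f"chain broken: have {unit!r}, link starts at {unit_from!r}")
        value *= Fraction(qty_to, qty_from)  # the unit_from cancels here
        unit = unit_to
    if unit != end_unit:
        raise ValueError(f"chain ends in {unit!r}, not {end_unit!r}")
    return value

# Hypothetical encoding of the 1869 example (assumed: 12 oz per lb Troy,
# 240 pence per sovereign).
LINKS = [
    (11, "oz fine gold", 12, "oz standard gold"),
    (12, "oz standard gold", 1, "lb Troy standard gold"),
    (40, "lb Troy standard gold", 1869, "sovereign"),
    (1, "sovereign", 240, "pence"),
]

pence = chain_rule("oz fine gold", "pence", LINKS)

# Split the exact number of pence into pounds, shillings and pence.
pounds, remainder = divmod(pence, 240)
shillings, odd_pence = divmod(remainder, 12)
print(pence, "pence =", pounds, "pounds", shillings, "shillings", odd_pence, "pence")
```

Using exact fractions rather than floats keeps the elevenths intact, so the result can be compared directly with the figure in the text.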
The chain rule in calculus is the same as the chain rule in arithmetic, except that instead of arranging a product of ratios so that intermediate units cancel, we arrange a product of derivatives so that intermediate variables cancel:
$$\frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx}$$
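To see the analogy numerically, here is a small sketch that compares the derivative of a composite function with the product of the derivatives of its links. The functions `y_of_x` and `z_of_y`, the point `x0`, and the finite-difference helper `derivative` are all arbitrary choices made for illustration, not anything taken from the sources quoted here.

```python
import math

def derivative(f, x, h=1e-6):
    """Approximate f'(x) with a central finite difference."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Hypothetical links of the chain: x -> y -> z.
def y_of_x(x):
    return x ** 2 + 1

def z_of_y(y):
    return math.sin(y)

def z_of_x(x):
    return z_of_y(y_of_x(x))

x0 = 0.7

# dz/dx computed directly on the composite function.
direct = derivative(z_of_x, x0)

# dz/dy (evaluated at y = y(x0)) times dy/dx (evaluated at x0).
chained = derivative(z_of_y, y_of_x(x0)) * derivative(y_of_x, x0)

print(direct, chained)
```

The two values agree up to the error of the finite-difference approximation, just as the product of ratios in the arithmetic version agrees with the direct conversion.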
The term chain rule was first applied to differentiation in the early-to-mid 20th century. Before that, there seems to have been no short term in English: for example, William Anthony Granville's Elements of the Differential and Integral Calculus (1904) introduces the rule like this (page 28):
This Rule is known as the Rule for differentiating a "function of a function."
Jeff Miller's "Earliest Known Uses of Some of the Words of Mathematics" presents evidence that the term came from German:
Peter Flor has found Kettenregel in Höhere Mathematik (1921) by Hermann Rothe, where it is used in a slightly different way from modern practice, viz. only for composites of three or more functions. Flor writes, "Here the word chain (Kette, in German) is suggestive. I tried, rather perfunctorily, to pursue the term further back in time, without success. It seems that around 1910, most authors of textbooks as yet saw no problem in computing dz/dx = (dz/dy)×(dy/dx). On the other hand, when I was a student in Vienna and Hamburg (1953 and later), the word Kettenregel was a well-established part of elementary mathematical terminology, in German, for the rule on differentiating a composite of two functions. I guess that its use must have become general around 1930."
One of the German works using the term Kettenregel for differentiating a composite of two functions was Richard Courant's Vorlesungen über Differential- und Integralrechnung (1927). The section on "Die Differentiation der zusammengesetzten Funktionen" (p. 122) contains a treatment of "die Kettenregel". In 1934 the book was translated into English by E. J. McShane as Differential and Integral Calculus, and Kettenregel became the chain rule for differentiating a compound function. The book was widely circulated, and James A. Landau suggests that it was this translation that established the German expression with English readers.
Here's the passage from Courant as it appears in a modern edition (page 71):
Further, we shall prove that a differentiable function of differentiable functions is itself differentiable. This statement is formulated more precisely in the following theorem, which at the same time gives the rule for the differentiation of compound functions, or so-called chain rule