In differential calculus, why is dy/dx written as d/dx (y)?

It is productive to regard $D = \frac{d}{dx}$ as a linear operator, say from the space of smooth functions on $\mathbb{R}$ to itself, for several reasons. The simplest reason I can think of is that it makes the theory of linear homogeneous differential equations with constant coefficients very simple: solving such an equation is nothing more than finding the nullspace of the operator $p(D)$, where $p$ is some polynomial.

To do this we need to find the spectrum of $D$. It's not hard to see that every $\lambda$ is an eigenvalue, with eigenvector $e^{\lambda x}$ that is unique up to a scalar multiple, and from here it follows that the nullspace of $p(D)$ at least contains (and, if $p$ has distinct roots, is spanned by) the functions $e^{\lambda x}$ where $p(\lambda) = 0$.
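
To spell out the eigenvector claim (a short check, with $g$ introduced here only as an auxiliary function): clearly $D e^{\lambda x} = \lambda e^{\lambda x}$. Conversely, if $Df = \lambda f$, set $g(x) = e^{-\lambda x} f(x)$; then

$$D g = e^{-\lambda x} \left( f' - \lambda f \right) = 0,$$

so $g$ is a constant $c$ and $f(x) = c e^{\lambda x}$. The eigenspace for each $\lambda$ is therefore one-dimensional.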

Said another way, if $p(x) = \prod_{i=1}^n (x - \lambda_i)$ then we can factor the operator $p(D)$ as $\prod_{i=1}^n (D - \lambda_i)$, and since these factors commute, it's not hard to see that $f$ is in the nullspace of this operator whenever $(D - \lambda_i) f = 0$ for some $i$, that is, $f(x) = e^{\lambda_i x}$ (up to a constant multiple). In fact, we get a solution $f$ whenever $(D - \lambda_i)^{e_i} f = 0$ where $e_i$ is the multiplicity of $\lambda_i$, and studying this condition readily leads to the complete set of solutions.
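
As a small worked example (the specific equations are chosen here for illustration), take $y'' - 3y' + 2y = 0$, so that $p(x) = x^2 - 3x + 2 = (x - 1)(x - 2)$:

$$(D - 1)(D - 2) y = 0, \qquad y(x) = c_1 e^{x} + c_2 e^{2x}.$$

For a repeated root of multiplicity $m$ (written $m$ here to avoid clashing with the exponential), writing $f(x) = e^{\lambda x} g(x)$ gives $(D - \lambda) f = e^{\lambda x} g'$, hence

$$(D - \lambda)^{m} \left( e^{\lambda x} g \right) = e^{\lambda x} D^{m} g,$$

which vanishes exactly when $g$ is a polynomial of degree less than $m$. For instance, $y'' - 2y' + y = 0$ corresponds to $(D - 1)^2 y = 0$ and has solutions $y(x) = (c_1 + c_2 x) e^{x}$.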

In other words, thinking of $D$ as an operator in its own right essentially reduces the study of linear homogeneous differential equations to linear algebra (modulo some existence and uniqueness arguments), specifically the study of the Jordan decomposition.

Of course one can go much, much further with this idea: for example, we can factor differential operators in more than one variable in the same way. The Laplacian $D_x^2 + D_y^2$, where $D_x$ is the derivative with respect to $x$ and $D_y$ the derivative with respect to $y$, factors as $\left( D_x + i D_y \right) \left( D_x - i D_y \right)$, and this immediately gives the connection between harmonic functions and holomorphic functions via the Cauchy-Riemann equations. The Dirac equation in quantum mechanics was discovered through a similar factorization process, but with matrix rather than merely complex coefficients.
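
The factorization can be checked by expanding, assuming the functions are smooth enough that $D_x D_y = D_y D_x$:

$$\left( D_x + i D_y \right) \left( D_x - i D_y \right) = D_x^2 - i D_x D_y + i D_y D_x + D_y^2 = D_x^2 + D_y^2.$$

Writing $f = u + iv$ with $u, v$ real (notation introduced here), the first-order equation $(D_x + i D_y) f = 0$ reads

$$\left( u_x - v_y \right) + i \left( v_x + u_y \right) = 0, \quad \text{i.e.} \quad u_x = v_y, \; u_y = -v_x,$$

which are the Cauchy-Riemann equations. Applying $D_x - i D_y$ to this equation then shows that $u$ and $v$ are both harmonic.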


The way you're phrasing it, $x$ and $y$ play similar roles, and the question naturally arises why they should be treated differently, as in $\mathrm{d}/\mathrm{d}x \, (y)$. However, in calculus, one usually considers functions of variables, such as $f(x)$ or $y(x)$. Here the symbols for the independent variable and the function play quite different roles. In order to think of differentiation more abstractly, as an operation applied to functions and yielding new functions, it is helpful to "factorize" the notation so that the function stands alone on the right while "what is being done to it", the operator, is separate and applied from the left. Hence this notation.
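
For example (an illustration with functions chosen here), the same operator can be applied to different functions, and applied repeatedly:

$$\frac{d}{dx} \left( x^2 \right) = 2x, \qquad \frac{d}{dx} \left( \sin x \right) = \cos x, \qquad \frac{d}{dx} \left( \frac{d}{dx} (y) \right) = \frac{d^2 y}{dx^2}.$$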