How to motivate the axioms for the inner product

Symmetry, bilinearity, and positive definiteness are exactly the properties used to prove the Cauchy-Schwarz inequality. Well, there are a zillion proofs of the Cauchy-Schwarz inequality; I mean the one that proceeds by observing $0\le \|x-ty\|^2$ for all $t\in\mathbb R$, expanding to obtain a quadratic in $t$, and concluding that the discriminant of that quadratic is nonpositive (and then you fiddle with definiteness to get the equality case).
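
For concreteness, here is a sketch of that proof over the reals, writing $\|x\|^2=\langle x,x\rangle$ (this notation is an assumption here, since nothing above fixes it). Bilinearity and symmetry give the expansion, and positivity gives the inequality:

$$0\le \|x-ty\|^2=\langle x-ty,\,x-ty\rangle=\|x\|^2-2t\,\langle x,y\rangle+t^2\,\|y\|^2 .$$

If $y=0$ there is nothing to prove. Otherwise $\|y\|^2>0$ by definiteness, so the right-hand side is a genuine quadratic in $t$ that is never negative, and its discriminant must satisfy

$$4\langle x,y\rangle^2-4\,\|x\|^2\,\|y\|^2\le 0,\qquad\text{i.e.}\qquad |\langle x,y\rangle|\le\|x\|\,\|y\| .$$

Each axiom is used exactly once, which is the point.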

In other words, an inner product is just a map for which that proof is correct.

We want to obtain the Cauchy-Schwarz inequality in other spaces because it's a cornerstone of the linear-algebraic treatment of Euclidean geometry: you use it to prove the triangle inequality, to show that orthogonal projections are metric projections (which gets you everything you want to know about tangent planes to spheres), etc. (Equation (1) is part of all that: in this treatment, it's essentially the definition of angle. You need Cauchy-Schwarz to show that it's well-defined.)
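
As an illustration of the first of these uses, here is a sketch of how the triangle inequality follows, again over the reals and with $\|x\|^2=\langle x,x\rangle$ as assumed notation:

$$\|x+y\|^2=\|x\|^2+2\langle x,y\rangle+\|y\|^2\le\|x\|^2+2\,\|x\|\,\|y\|+\|y\|^2=\bigl(\|x\|+\|y\|\bigr)^2 .$$

The only inequality in that chain is Cauchy-Schwarz. The same inequality says that, for nonzero $x$ and $y$, the quotient $\langle x,y\rangle/(\|x\|\,\|y\|)$ lies in $[-1,1]$, which is what a definition of angle via a cosine needs in order to make sense.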


(Based on Qiaochu Yuan's comment above) It is natural to study normed spaces: a norm gives a vector space a notion of distance, and this additional structure lets us explore its geometric properties. However, the class of normed spaces is still rather large. We may therefore want to add further structure so that orthogonality makes sense in a normed space (and then a lot of nice things happen: for example, one can find the coordinates of a vector more efficiently).

To do so, one takes a generalization of the Pythagorean theorem (the parallelogram law) and isolates those normed spaces that satisfy it. The polarization identity then recovers an inner product from the norm, which supplies the required notion of orthogonality. By the Fréchet–von Neumann–Jordan theorem, these are precisely the spaces singled out by the inner product axioms.
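
For reference, the two identities in question can be written as follows for a real normed space (the real case is an assumption made here to keep the formulas short; the complex polarization identity has extra terms). The parallelogram law is

$$\|x+y\|^2+\|x-y\|^2=2\,\|x\|^2+2\,\|y\|^2 ,$$

and when it holds, the polarization identity

$$\langle x,y\rangle=\tfrac14\bigl(\|x+y\|^2-\|x-y\|^2\bigr)$$

defines a map satisfying the inner product axioms with $\langle x,x\rangle=\|x\|^2$. Orthogonality then means $\langle x,y\rangle=0$, in which case $\|x+y\|^2=\|x\|^2+\|y\|^2$, the Pythagorean theorem.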