How do you formally prove that rotation is a linear transformation?
Maybe my first answer was not "formal" enough, since I didn't say explicitly what I was assuming a "rotation" is, nor an "angle". (Exercise: (1) find the places where I was using those unstated definitions of "rotation" and "angle", and (2) determine which definitions they are. :-) )
Well, it was a sort of high-school proof. The problem with the definition of "rotation", as some answers have already pointed out or implied, is that, if you really want to be formal, you end up saying that a rotation is some kind of linear transformation by definition. Full stop.
Nevertheless, I think I can produce an "undergraduate" proof of this fact, avoiding circular arguments. It has the merit of being valid in any dimension. For this I need the following
Convention. Whatever "rotation" means,
- It is a map from some vector space $V$ to itself,
- Which has a way of measuring "lengths" and "angles" of its vectors, and
- "Rotations" preserve those "lengths" and "angles".
Now, a fancy way to have "lengths" and "angles" in a vector space is to have a dot product on it. So let's assume that $V$ is a Euclidean vector space; for instance, a real vector space such as $\mathbb{R}^n$ with the standard dot product $v \cdot w = \sum_{i=1}^n v_i w_i$.
Then, in such a $V$, "length" means "norm", which is $\| v \| = +\sqrt{v\cdot v}$.
We have to be more careful with "angles" because the standard definition already involves rotations. To avoid a circular argument, we define the (non-oriented) angle determined by two vectors $v, w$ as the unique real number $\theta = \widehat{vw} \in [0, \pi ]$ such that
$$ \cos \theta = \frac{v\cdot w}{\|v\| \|w\|} \ . $$
Notice that this definition makes sense: by the Cauchy-Schwarz inequality we always have $-1 \leq \frac{v\cdot w}{\|v\| \|w\|} \leq 1$, and $\cos : [0, \pi] \longrightarrow [-1,1]$ is bijective.
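As a quick sanity check (my example, not needed for the argument): in $\mathbb{R}^2$ with the standard dot product, take $v = (1,0)$ and $w = (1,1)$. Then
$$ \cos \theta = \frac{v\cdot w}{\|v\| \|w\|} = \frac{1}{1 \cdot \sqrt{2}} = \frac{1}{\sqrt{2}} \ , \qquad \theta = \widehat{vw} = \frac{\pi}{4} \ , $$
which agrees with the usual picture of the angle between the $x$-axis and the diagonal.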
So, "rotation" is some kind of map $f: V \longrightarrow V$ which preserves norms and angles. Since the dot product can be expressed in terms of norms and (cosines of) angles,
$$ v\cdot w = \|v\| \|w\| \cos\widehat{vw} $$
rotations preserve dot products:
$$ f(v) \cdot f(w) = v\cdot w \ . $$
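In more detail, this is just chaining the two preservation properties through the formula above:
$$ f(v) \cdot f(w) = \|f(v)\| \|f(w)\| \cos\widehat{f(v)f(w)} = \|v\| \|w\| \cos\widehat{vw} = v \cdot w \ . $$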
Now, let's show that a map that preserves the dot product is necessarily linear:
$$ \begin{align} \| f(v+w) - f(v) -f(w) \|^2 &= \left( f(v+w) -f(v) -f(w) \right) \cdot \left( f(v+w) -f(v) -f(w) \right) \\\ &= \left( f(v+w)\cdot f(v+w) - f(v+w) \cdot f(v) - f(v+w)\cdot f(w) \right) \\\ &{} \qquad + \left( -f(v)\cdot f(v+w) + f(v) \cdot f(v) + f(v)\cdot f(w) \right) \\\ &{} \qquad + \left( -f(w)\cdot f(v+w) + f(w)\cdot f(v) + f(w)\cdot f(w) \right) \\\ &= \left( (v+w)\cdot (v+w) - (v+w) \cdot v - (v+w)\cdot w \right) \\\ &{} \qquad + \left( -v\cdot (v+w) + v \cdot v + v\cdot w \right) \\\ &{} \qquad + \left( -w\cdot (v+w) + w\cdot v + w\cdot w \right) \\\ &= \| v+w - v -w \|^2 = 0 \ . \end{align} $$
So
$$ f(v+w) = f(v) + f(w) \ . $$
A similar computation shows that
$$ \|f(\lambda v ) - \lambda f(v) \|^2 = \|\lambda v - \lambda v\|^2 = 0 \ . $$
Thus
$$ f(\lambda v) = \lambda f(v) \ . $$
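In case that elided computation is wanted, it runs just like the previous one: expand the square via the dot product and use that $f$ preserves dot products,
$$ \begin{align} \|f(\lambda v) - \lambda f(v)\|^2 &= f(\lambda v) \cdot f(\lambda v) - 2\lambda \, f(\lambda v) \cdot f(v) + \lambda^2 \, f(v) \cdot f(v) \\\ &= (\lambda v) \cdot (\lambda v) - 2\lambda \, (\lambda v) \cdot v + \lambda^2 \, v \cdot v \\\ &= \lambda^2 \, v \cdot v - 2 \lambda^2 \, v \cdot v + \lambda^2 \, v \cdot v = 0 \ . \end{align} $$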
Hence, our rotation $f$ is a linear transformation.
The problem comes in defining rotation so that this whole thing doesn't wind up circular; doing so fully would undoubtedly tie us all in knots, but one can prove the linearity of rotations by observing a few choice properties:
- Rotations preserve angles
- They fix the origin
- And they preserve lengths
Now just draw a parallelogram $OACB$ (spanned by vectors $\vec{OB}, \vec{OA}$, with $\vec{OC}=\vec{OB}+\vec{OA}$), rotate, and chase the angles and lengths to observe that $R(\vec{OB})+R(\vec{OA})= R(\vec{OC})= R(\vec{OB}+ \vec{OA})$.
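(One also needs homogeneity, which the chase above leaves implicit; here is a sketch using the same three properties: for $\lambda > 0$, the vector $R(\lambda \vec{OA})$ makes angle $0$ with $R(\vec{OA})$ and has length $\lambda \, |\vec{OA}| = \lambda \, |R(\vec{OA})|$, so it must equal $\lambda R(\vec{OA})$; for $\lambda < 0$ the angle is $\pi$ and the same conclusion follows, while $\lambda = 0$ is the fixed origin.)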
It's sort of Euclidean-ish, but it's the only decent way to play it (especially if we want higher dimensions) without the 'lucky guess' approach championed by Qiaochu.
Addendum:
If you feel pushed, you can turn the angle and length chase into norms and inner products, but that would be a tragic waste of an isomorphism: here we are dealing with a geometric object in the language of vectors, and to treat it non-geometrically would put us at an unnecessary disadvantage.
A good deal of higher mathematics depends on finding 'the best place' to solve a given problem, but the best place need not be the most abstract one: translating a problem about PDEs into Hilbert space theory may help find a solution, but it may also rob us of our geometric techniques and intuition.
A great example is the h-cobordism theorem, which can (and does: Milnor's take is a gem) take up a small book when proved with formal techniques (Morse theory), but can be done in half the time with some geometric (handlebody) jazz.
Vector spaces are another string to our bow, a new place to do math, and at times a more rigorous place, but they are not (at least from my perspective) the only place.
This depends on your definitions of "rotation" and "angle", but if you are willing to accept the sum and difference formulas of your link (How can I understand and prove the "sum and difference formulas" in trigonometry?), then you may follow the reasoning in Qiaochu Yuan's answer there the other way around: given a vector $(x,y) \in \mathbb{R}^2$, let $\alpha$ be the angle determined by $(x,y)$ and the $x$-axis, and let $r= \sqrt{x^2 + y^2}$ be its length. Then, of course,
\begin{eqnarray*} x &=& r \cos \alpha \\\ y &=& r \sin \alpha \end{eqnarray*}
If you rotate $(x,y)$ by an angle $\theta$, you obtain the vector $(x',y')$ with
\begin{eqnarray*} x' &=& r \cos (\alpha + \theta ) \\\ y' &=& r \sin (\alpha + \theta ) \ . \end{eqnarray*}
Now you apply those sum and difference formulas and get
\begin{eqnarray*} x' &=& r \cos\alpha\cos\theta - r \sin\alpha\sin\theta \ = \ \cos\theta \cdot x - \sin\theta \cdot y\\\ y' &=& r \sin\alpha\cos\theta + r \cos\alpha\sin\theta \ = \ \sin\theta \cdot x + \cos\theta\cdot y \ . \end{eqnarray*}
This is the same as saying that $(x',y')$ is obtained from $(x,y)$ by multiplying by the matrix
$$ \begin{pmatrix} \cos\theta & -\sin\theta \\\ \sin\theta & \cos\theta \end{pmatrix} $$
Since the rotation $(x,y) \mapsto (x',y')$ is the same as multiplication by a matrix, it is a linear transformation.
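As a quick cross-check (my addition, writing $R_\theta$ for the matrix above and tying this to the dot-product discussion earlier): $R_\theta$ is orthogonal with determinant $1$, so multiplication by it preserves dot products, as a rotation should:
$$ R_\theta^{\mathsf{T}} R_\theta = \begin{pmatrix} \cos^2\theta + \sin^2\theta & 0 \\\ 0 & \sin^2\theta + \cos^2\theta \end{pmatrix} = I \ , \qquad \det R_\theta = \cos^2\theta + \sin^2\theta = 1 \ . $$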
If you want to be formal and avoid Euclidean geometry, it seems to me that the right way to go is the other way around: first define certain linear transformations (the ones which preserve the dot product and have determinant $1$) and then prove that they have the properties you expect them to have. More precisely, the group $\text{SO}(2)$ is abelian with Lie algebra $\mathbb{R}$, and the parameterization $\theta \mapsto \left[ \begin{array}{cc} \cos \theta & \sin \theta \\ - \sin \theta & \cos \theta \end{array} \right]$ is the exponential map from the Lie algebra to the group. The methods for doing this should be covered in any book on Lie theory.
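For concreteness, here is a sketch of that exponential map computation (not spelled out above): writing $J = \left[ \begin{array}{cc} 0 & 1 \\ -1 & 0 \end{array} \right]$, so that $J^2 = -I$, splitting the exponential series into even and odd powers gives
$$ \exp(\theta J) = \sum_{k=0}^{\infty} \frac{(\theta J)^k}{k!} = \left( \sum_{m=0}^{\infty} \frac{(-1)^m \theta^{2m}}{(2m)!} \right) I + \left( \sum_{m=0}^{\infty} \frac{(-1)^m \theta^{2m+1}}{(2m+1)!} \right) J = \cos\theta \, I + \sin\theta \, J = \left[ \begin{array}{cc} \cos \theta & \sin \theta \\ - \sin \theta & \cos \theta \end{array} \right] \ , $$
recovering exactly the parameterization above.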