Consider the following construction of vectors and points.

Let's start with a vector space, or more specifically a coordinate space $F^N$ over a field $F$ and of $N$ dimensions. The elements of this vector space are vectors $\vec{v}$ and we can express their coordinates using a basis $(\vec{e_1}, \vec{e_2}, ... \vec{e_N})$ of the vector space (and any coordinate space comes with a standard basis).

Using this vector space, one can construct the related affine space $A$, whose elements are points. As there is no origin in an affine space, to locate a point one need a coordinate system which can be specified by an origin and a basis. (First question: does this construction make sense ?)

My problem, is that I am not sure to understand what is fundamentally a coordinate system... My problem is the following: in 3D cartesian coordinates, it is pretty simple: the vector space is $\mathbb{R}^{3}$, so we can construct the related affine space with points, and by specifying an origin $O$ plus the standard basis, we can identify the position of the point $P$ by the coordinates of the vector going from $O$ to $P$. But with spherical coordinates I am lost and I am not sure to really understand what mathematical objects is/are $(r, \theta, \phi)$ and $(\vec{e_r}, \vec{e_\theta}, \vec{e_\phi})$: in the contrary of cartesian coordinates, the basis $(\vec{e_r}, \vec{e_\theta}, \vec{e_\phi})$ will change from one point to another. Why is that ? Is the vector space underlying $(r, \theta, \phi)$ and linked with $(\vec{e_r}, \vec{e_\theta}, \vec{e_\phi})$ the same as the one we use in the case of cartesian coordinates. I think I am missing a basic thing...


Classically, if $X \subseteq \mathbf{R}^{3}$ is an open set, then a $3$-dimensional coordinate system on $X$ is nothing but an injective, continuously-differentiable mapping $\xi:X \to \mathbf{R}^{3}$ whose differential has rank three at each point. Conceptually, a coordinate system assigns an ordered triple of real numbers to each point of $X$ in such a way that distinct points are given distinct coordinates, satisfying some technical conditions of smoothness. The inverse mapping $\xi^{-1}:\xi(X) \to X$ is sometimes called a parametrization of $X$.

The inclusion map $i:X \hookrightarrow \mathbf{R}^{3}$ is a coordinate system on $X$. If $A$ is an invertible $3 \times 3$ real matrix and $b$ is a vector in $\mathbf{R}^{3}$, then the affine map $T(x) = Ax + b$ defines a coordinate system on $X$. Cylindrical and spherical coordinates are parametrizations of portions of $\mathbf{R}^{3}$; for example, the spherical coordinates mapping (with geographic angles) $$ S(r, \theta, \phi) = (r\cos\theta\cos\phi, r\sin\theta\cos\phi, r\sin\phi),\qquad 0 < r,\quad |\theta| < \pi,\quad |\phi| < \pi/2 $$ parametrizes $X = \mathbf{R}^{3}\setminus \{y = 0, x \leq 0\}$, the complement of a closed half-plane. The spherical coordinates of a point $(x, y, z)$ of $X$ are the numbers $(r, \theta, \phi)$ such that $(x, y, z) = S(r, \theta, \phi)$.

In linear algebra (over an arbitrary field $F$), a "coordinate system" for an $n$-dimensional vector space $V$ takes values in $F^n$, and is usually defined by choosing a basis $B = \{v_{i}\}_{i=1}^{n}$ and mapping a vector $v$ in $V$ to its coordinate vector $[v]_{B}$ in $F^n$. Similarly, in affine geometry one can construct a coordinate system as you describe (by picking an origin and a basis for the resulting vector space).


In the modern point of view, coordinate systems play a secondary role to "overlap maps". For concreteness, let $X$ be an open subset of $\mathbf{R}^{3}$. First, we weaken the criteria for a mapping to be a coordinate system, requiring only that $\xi:X \to \mathbf{R}^{3}$ be continuous and injective. Now assume $\xi_{1}$ and $\xi_{2}$ are "allowable" coordinate systems on $X$ (for some value of "allowable", yet to be determined). An overlap map is a composition $$ \xi_{1} \circ \xi_{2}^{-1}:\xi_{2}(X) \to \xi_{1}(X). $$ The "structure" of $X$ is encoded in the properties of the overlap maps. For example, if the overlap maps are diffeomorphisms, then "smoothness" makes sense for functions $f:X \to \mathbf{R}$: We pick an arbitrary coordinate system $\xi_{1}$ and declare $f$ to be smooth if the composition $f \circ \xi_{1}^{-1}$ is smooth as a function on $\mathbf{R}^{3}$. This definition does not depend on $\xi$, since $$ f \circ \xi_{2}^{-1} = (f \circ \xi_{1}^{-1}) \circ (\xi_{1} \circ \xi_{2}^{-1}), $$ and the overlap map is a diffeomorphism. Similarly, if the overlap maps are affine, then (for example) "line segments" in $X$ make sense: A subset $\ell$ of $X$ is a "segment" if in some (hence every) coordinate system $\xi$, the set $\xi(\ell)$ is a line segment.

Philosophically, a coordinate system is merely how one "transfers" objects and functions on a space $X$ to a objects and functions on a "standard" space, such as $\mathbf{R}^{3}$. The interesting "structure" of $X$ is encoded by the overlap maps, which determine properties of $X$ that are independent of the coordinate system.

When mathematicians speak of a manifold having a smooth structure, they mean that some collection of coordinate systems has been fixed so that the overlap maps are diffeomorphisms. A manifold having an affine structure has coordinate systems whose overlaps are affine (a much more stringent condition). Similarly, one hears of holomorphic, piecewise-linear, and conformal structures, among many others.

To address the question about coordinate vector fields in spherical coordinates: Though the spherical coordinates map parametrizes part of $\mathbf{R}^{3}$ (which is a vector space), the spherical coordinates mapping is not a linear transformation, and therefore does not belong to linear algebra, but instead to multivariable calculus.

The "proper framework" requires some explanation. If $X \subseteq \mathbf{R}^{3}$ is an open set, define the tangent bundle $TX$ to be $X \times \mathbf{R}^{3}$. An element $(\mathbf{x}, \mathbf{v})$ of $TX$ should be viewed as consisting of an element $\mathbf{x}$ of $X$ together with a vector $\mathbf{v}$ "based at" $\mathbf{x}$. It makes sense to take linear combinations of vectors only if they're based at the same point.

In this picture, a vector field on $X$ is a mapping $\Xi:X \to TX$ that assigns, to each point $\mathbf{x}$ of $X$, a vector $\Xi(\mathbf{x})$ based at $\mathbf{x}$. The Cartesian coordinate fields $\mathbf{e}_{i}$ are constant vector-valued functions because Cartesian coordinates change by additive constants under translation (!). By contrast, the spherical coordinate fields are non-constant, because spherical coordinates do not change in a simple way under translation.


A more geometric point of view is to consider the coordinates as scalar fields $x^1, x^2, \ldots$ on the vector space. These scalar fields then determine two bases of vector fields: the tangent basis vectors $e_i = \partial \vec x/\partial x^i$, and the cotangent basis vectors $e^i = \nabla x^i$. These are fields, in that they may vary with respect to position.

Cartesian coordinates are somewhat special in that their tangent (and cotangent) basis vector fields are constant. Hence, you can use them directly to describe positions.

For more general coordinates, that may not be the case. Because the basis tangent and cotangent vector fields are fields, evaluating them at a specific point gives you bases for two vector spaces at that point. For example, in spherical coordinates, you can't use the vectors $e_r, e_\theta, e_\phi$ in linear combinations to describe positions, but if you have a curve through an arbitrary point, you can use the values of $e_r, e_\theta, e_\phi$ at that particular point to talk about the direction of that curve. You can do this at all points along the curve, but only using each point's set of tangent basis vectors.

More specifically, you could put a circle in the $xy$-plane centered on the origin, and its tangent vector would be $e_\phi$ everywhere, but you know in terms of Cartesian coordinates and basis vectors, it would be $e_y \cos \phi - e_x \sin \phi$.

While the positions in a flat space do belong to a vector space, it's not generally true that those vectors look like linear combinations of tangent basis vectors because, as you said, those tangent vectors vary with position. You can write $\vec x = r\hat x(\theta, \phi)$ for instance in spherical coordinates, and then define the unit vector function $\hat x$ as having rotational symmetry and so on and so forth. It's certainly not as simple as what you would have in Cartesian where $\vec x = x e_x + y e_y + z e_z$. And that's unavoidable.