Why is the Riemann curvature tensor the technical expression of curvature?
In chapter 2, when talking about Riemann Normal Coordinates, Carroll shows that you can cunningly pick coordinates such that at a given point the metric has its standard form $$\begin{pmatrix} -1 & & &\\ & 1& & \\ & &1 & \\ & & & 1\\ \end{pmatrix}$$ and so that the first derivatives all vanish $$\partial_\rho g_{\mu\nu}=0.$$ He then tries to also make all the second derivatives vanish, but he finds that there are $100$ degrees of freedom of which we can only control $80$.
So in fact we cannot make the second derivatives vanish; the deviation from flatness must therefore be measured by the 20 coordinate-independent degrees of freedom representing the second derivatives of the metric tensor field. We will see later how this comes about, when we characterize curvature using the Riemann tensor, which will turn out to have 20 independent components.
In chapter 3, after defining the Riemann tensor, he calculates its number of degrees of freedom and finds that there are $20$ of them.
In four dimensions, therefore, the Riemann tensor has 20 independent components. (In one dimension it has none.) These twenty functions are precisely the 20 degrees of freedom in the second derivatives of the metric which we could not set to zero by a clever choice of coordinates. This should reinforce your confidence that the Riemann tensor is an appropriate measure of curvature.
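The counting in the two quoted passages can be reproduced for any dimension $n$. What follows is a minimal Python sketch, not taken from Carroll; the function names are made up for illustration. It assumes the usual bookkeeping: the second derivatives $\partial_\rho \partial_\sigma g_{\mu\nu}$ are symmetric in each index pair, and the freedom available at this order is the set of third derivatives of the new coordinates with respect to the old, symmetric in the three lower indices.

```python
from math import comb


def metric_second_derivative_count(n):
    """Number of second derivatives of the metric at a point.

    g_{mu nu} is symmetric (n(n+1)/2 components), and the two partial
    derivatives commute (another n(n+1)/2 combinations).
    """
    sym = n * (n + 1) // 2
    return sym * sym


def third_order_coordinate_freedom(n):
    """Number of third-order Taylor coefficients of a coordinate change.

    One upper index (n choices) times a symmetric triple of lower indices.
    """
    return n * comb(n + 2, 3)


def riemann_independent_components(n):
    """Standard count of independent Riemann components, n^2 (n^2 - 1) / 12."""
    return n * n * (n * n - 1) // 12


if __name__ == "__main__":
    for n in range(1, 6):
        total = metric_second_derivative_count(n)
        freedom = third_order_coordinate_freedom(n)
        # For n = 4 this prints 4 100 80 20 20
        print(n, total, freedom, total - freedom, riemann_independent_components(n))
```

For $n = 4$ the leftover is $100 - 80 = 20$, and for $n = 1$ it is $0$, matching both statements in the quote.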
So I think the Riemann tensor tells you everything there is to say about the metric up to second order. That is, I think the following is a theorem:
Let $\mathcal M$ and $\mathcal N$ be two dim-$n$ Lorentzian manifolds with points $x\in\mathcal M$ and $y\in\mathcal N$. Then you can pick local coordinates on $\mathcal M$ and $\mathcal N$ such that the expressions for $g_{\mu\nu}$ agree to second order at $x$ and $y$ iff you can find bases at $x$ and $y$ such that the Riemann tensors are equal.
I don't actually have a proof or a reference for the above theorem though, so I've asked for a proof here.
It seems to me this question presents an interesting and significant challenge, insofar as it concerns itself, as I read it, with the borderline between the intuitive and the formal views of a mathematical concept (in this case curvature), and asks us to investigate how the intuitive gives rise to the formal.
If we accept the standard definition that a geodesic is a curve $\gamma(t)$ in $M$ which is auto-parallel, that is, it transports its own tangent vector $\vec v(t) = \dot \gamma(t)$ in a parallel manner,
$\nabla_{\vec v} \vec v = 0, \tag{1}$
then we can ask how this auto-parallelism affects the relative separation of nearby geodesics. It is quite a challenge to concisely express the formal, technical meaning and effects of curvature in a way which appeals to basic geometrical intuition, but it is not so difficult to visualize examples which are based upon the convergence and/or divergence (or lack thereof!) of families of geodesics on surfaces, and in so doing we can discover some of the things a formal definition of curvature should express.
The formula (1) expresses the concept of a curve which is locally a "straight line" in the space it inhabits. That is, (1) formalizes the intuitive notion that, as $t$ increases, the point $\gamma(t)$ continues to move in exactly the same direction as it already has: $\vec v$ does not change, in the covariant sense, as we progress along $\gamma(t)$. It is as if a bug were crawling along guided only by the principle that he should ever continue in the same direction; the bug can't see very far ahead, he only knows to take each step in exactly the same direction as the last. (I didn't make this analogy up; I read it in volume II of R.P. Feynman's famous Lectures on Physics, where he goes into gravitation; some readers have probably seen it there.)
Of course, (1) says that $\vec v$ is covariantly constant along $\gamma(t)$; this means that the relative differences in the bases or frames of the tangent spaces $T_pM$ are taken into account as we compute derivatives of vectors. For even though the components $Y^i$ of a given vector field $\vec Y = \sum Y^i (\partial/\partial x_i)$ may be subject to "plain and ordinary" differentiation as we move along a curve, that is, on a curve such as
$\gamma(t) = (x_1(t), x_2(t), \ldots, x_n(t)) \tag{2}$
we have
$\dfrac{d Y^i(\gamma(t))}{dt} = \sum_j \dfrac{\partial Y^i}{\partial x_j} \dfrac{dx_j}{dt} \tag{3}$
by the chain rule, the derivatives of the vector field $\vec Y$ must take into account possible derivatives of the coordinate vector fields $(\partial / \partial x_i)$, and if the derivative of a vector field is to be a vector field, then for any local framing, that is, any set of $n = \dim M$ linearly independent vector fields $\vec e_i$ in a neighborhood of any point $p \in M$, we have
$\nabla_i \vec e_j \equiv \nabla_{\vec e_i} \vec e_j = \Gamma_{ji}^k \vec e_k, \tag{4}$
for a suitable set of functions $\Gamma_{ji}^k$ defined near $p$. These $\Gamma_{ji}^k$ are of course the Christoffel symbols or connection coefficients for the covariant derivative differential operator $\nabla$; together with certain other postulates such as the generalized Leibniz rule
$\nabla_{\vec X} (f \vec Y) = \vec X[f] \vec Y + f \nabla_{\vec X} \vec Y \tag{5}$
and so forth (see here) they completely specify a given covariant derivative. Note that everything we have defined is independent of coordinates, although the $\Gamma_{ji}^k$ do not themselves form a tensor, as is well-known. (Note: many authors use the term Christoffel symbols only in the case that the $\vec e_i$ form a coordinate basis, that is, they are tangent to a system of coordinate curves near $p \in M$. But here I am adopting a somewhat more general usage and allowing the term to apply to any local framing of $TM$.) Now if we write
$\vec Y = \sum_i Y^i \vec e_i \tag{6}$
then by the rules given here and in the linked reference we have
$\nabla_{\vec X} \vec Y = \nabla_{\vec X} \left(\sum_i Y^i \vec e_i\right) = \sum_i \left((\nabla_{\vec X} Y^i) \vec e_i + Y^i \nabla_{\vec X} \vec e_i\right), \tag{7}$
and with
$\vec X = \sum_j X^j \vec e_j \tag{8}$
(7) may be expanded to
$\nabla_{\vec X} \vec Y = \nabla_{\sum_j X^j \vec e_j} \left(\sum_i Y^i \vec e_i\right) = \sum_j X^j \nabla_{\vec e_j}\left(\sum_i Y^i \vec e_i\right)$ $= \sum_j X^j \sum_i\left(\nabla_j(Y^i) \vec e_i + Y^i \nabla_j \vec e_i\right) = \sum_j X^j \sum_i\left(\vec e_j[Y^i] \vec e_i + Y^i \sum_k \Gamma_{ij}^k \vec e_k\right); \tag{9}$
and picking off the $l$-th component,
$(\nabla_{\vec X} \vec Y)^l = \sum_j X^j \vec e_j[Y^l] + \sum_{i,j} X^j Y^i \Gamma_{ij}^l = \vec X[Y^l] + \sum_{i, j} X^j Y^i \Gamma_{ij}^l; \tag{10}$
we see the components of $\nabla_{\vec X} \vec Y$ contain in general an extra term $\sum_{i, j} X^j Y^i \Gamma_{ij}^l$ not present in the simple derivatives of the components of $\vec Y$, e.g. (3). A vector field $\vec Y$ is defined to be parallel along $\vec X$ precisely when
$\nabla_{\vec X} \vec Y = 0; \tag{11}$
of course, the notion of "parallel" takes on a new meaning when the $\Gamma_{ij}^k \ne 0$; now it accommodates the relative changes in local frames, one to another.
We can apply (7), (10), (11) to (1), obtaining
$(\nabla_{\vec v} \vec v)^l = \vec v[v^l] + \sum_{i, j} v^j v^i \Gamma_{ij}^l = 0; \tag{12}$
with
$\vec v = \dot {\gamma}(t) = (\dot x_1(t), \dot x_2(t), \ldots, \dot x_n(t))^T \tag{13}$
we find (12) yields
$(\nabla_{\vec v} \vec v)^l = \vec v [\dot x_l] + \sum_{i, j} \Gamma_{ij}^l \dot x_i(t) \dot x_j(t) = 0; \tag{14}$
finally, $\vec v[\dot x_l(t)]$ is the derivative of $\dot x_l(t)$ along $\gamma(t)$, whence
$\vec v[\dot x_l(t)] = \ddot x_l(t), \tag{15}$
so the component equations for an auto-parallel curve thus become
$(\nabla_{\vec v} \vec v)^l = \ddot x_l(t) + \sum_{i, j} \Gamma_{ij}^l \dot x_i(t) \dot x_j(t) = 0. \tag{16}$
(16) is the equation for the components of an auto-parallel curve; Feynman's bug, ever taking one step after another, always in the same direction, on a curved surface, would trace out a path satisfying (16).
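To see (16) in action, here is a minimal numerical sketch in plain Python, under stated assumptions: the surface is the unit sphere with coordinates $(\theta, \phi)$ (colatitude and longitude), the only nonzero connection coefficients are the standard $\Gamma_{\phi\phi}^\theta = -\sin\theta\cos\theta$ and $\Gamma_{\theta\phi}^\phi = \Gamma_{\phi\theta}^\phi = \cot\theta$, and the integrator is an ordinary fourth-order Runge-Kutta step. The function names are invented for this illustration.

```python
import math


def geodesic_rhs(state):
    """Equation (16) on the unit sphere, written as a first-order system.

    state = (theta, phi, dtheta/dt, dphi/dt).
    """
    theta, phi, dtheta, dphi = state
    gamma_theta_phiphi = -math.sin(theta) * math.cos(theta)
    gamma_phi_thetaphi = math.cos(theta) / math.sin(theta)
    # x''_l = - sum_{i,j} Gamma^l_{ij} x'_i x'_j
    ddtheta = -gamma_theta_phiphi * dphi * dphi
    ddphi = -2.0 * gamma_phi_thetaphi * dtheta * dphi
    return (dtheta, dphi, ddtheta, ddphi)


def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2.0))
    k3 = geodesic_rhs(shifted(state, k2, h / 2.0))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(
        s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate_geodesic(state, t_final, steps):
    """Integrate the geodesic equation from t = 0 to t = t_final."""
    h = t_final / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


def speed_squared(state):
    """<v, v> for the round metric d(theta)^2 + sin^2(theta) d(phi)^2."""
    theta, _, dtheta, dphi = state
    return dtheta ** 2 + math.sin(theta) ** 2 * dphi ** 2


if __name__ == "__main__":
    start = (math.pi / 3, 0.0, 0.3, 0.8)
    end = integrate_geodesic(start, 2.0, 2000)
    print(speed_squared(start), speed_squared(end))
```

A bug starting on the equator and heading due east stays on the equator, and along any solution $\langle \vec v, \vec v \rangle$ stays constant up to integration error, which is a quick sanity check that the curve really is auto-parallel.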
I have belabored this point, the derivation of the geodesic equations in terms of the $\Gamma_{ij}^k$, in order to emphasize that the connection coefficients are intimately related to the more intuitive geometrical concept of geodesics, and provide a way to formalize, make technical and subject to calculation, the essential notion of an auto-parallel curve. Geodesics possess a number of important and useful properties which make them central objects of differential geometric analysis. For example, it can be shown (see Milnor's book Morse Theory, The Large Scale Structure of Space-Time by Hawking and Ellis, or of course Misner, Thorne and Wheeler's Gravitation) that such curves are, locally at least, extremals of the functional $\int \langle \dot x, \dot x \rangle dt$ which measures the squared distance along a curve. This is true as long as the $\Gamma_{ij}^k$ are compatible with the metric tensor $\langle \cdot, \cdot \rangle$ in the sense that
$\vec v[\langle \vec X, \vec Y \rangle] = \nabla_{\vec v}\langle \vec X, \vec Y \rangle = \langle \nabla_{\vec v} \vec X, \vec Y \rangle + \langle \vec X, \nabla_{\vec v} \vec Y \rangle \tag{17}$
for any vector fields $\vec v$, $\vec X$, $\vec Y$. It may be shown that there is exactly one set (in any given framing) of $\Gamma_{ij}^k$ for which (17) holds and the connection is torsion-free (this is the property (24) used below), and that the extremal nature of auto-parallel curves holds for $\langle \cdot, \cdot \rangle$ of any signature, as long as the metric form is non-degenerate; these facts apply on both Lorentzian space-times and Riemannian manifolds, though the extremals in the Lorentzian case maximize $\int \langle \vec v, \vec v \rangle dt$ for timelike curves, whereas the geodesics (locally) minimize it when the metric is positive definite. This stuff is all explained in great detail in the references I have given.
Just as the Lie derivative or bracket $L_{\vec X} \vec Y = [\vec X, \vec Y]$ measures the commutativity of the vector fields $\vec X$, $\vec Y$ on a differential topological level, the Riemann tensor $R(\cdot, \cdot, \cdot)$ measures the geometrical commutativity of the operators $\nabla_{\vec X}$, $\nabla_{\vec Y}$ as is clear from the definition
$R(\vec X, \vec Y, \vec Z) = \nabla_{\vec X} \nabla_{\vec Y} \vec Z - \nabla_{\vec Y} \nabla_{\vec X} \vec Z - \nabla_{[\vec X, \vec Y]} \vec Z. \tag{18}$
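As a concrete check of this definition, here is a short worked computation on the unit 2-sphere. It is a sketch under stated assumptions: the coordinate frame $\vec e_\theta = \partial/\partial\theta$, $\vec e_\phi = \partial/\partial\phi$, the standard Levi-Civita coefficients $\Gamma_{\phi\phi}^\theta = -\sin\theta\cos\theta$ and $\Gamma_{\theta\phi}^\phi = \Gamma_{\phi\theta}^\phi = \cot\theta$ (all others zero), and the usual convention $R(\vec X, \vec Y, \vec Z) = \nabla_{\vec X} \nabla_{\vec Y} \vec Z - \nabla_{\vec Y} \nabla_{\vec X} \vec Z - \nabla_{[\vec X, \vec Y]} \vec Z$. Since $[\vec e_\theta, \vec e_\phi] = 0$, only the first two terms contribute:

$\nabla_{\theta} \nabla_{\phi} \vec e_\phi = \nabla_{\theta}(-\sin\theta\cos\theta \, \vec e_\theta) = -\cos 2\theta \, \vec e_\theta,$

$\nabla_{\phi} \nabla_{\theta} \vec e_\phi = \nabla_{\phi}(\cot\theta \, \vec e_\phi) = \cot\theta \, (-\sin\theta\cos\theta) \, \vec e_\theta = -\cos^2\theta \, \vec e_\theta,$

$R(\vec e_\theta, \vec e_\phi, \vec e_\phi) = (\cos^2\theta - \cos 2\theta) \, \vec e_\theta = \sin^2\theta \, \vec e_\theta.$

The result is nonzero away from the poles, so the two covariant derivatives genuinely fail to commute. Running the same steps in polar coordinates on the flat plane, with $\Gamma_{\theta\theta}^r = -r$ and $\Gamma_{r\theta}^\theta = \Gamma_{\theta r}^\theta = 1/r$, gives $R(\vec e_r, \vec e_\theta, \vec e_\theta) = -\vec e_r + \vec e_r = 0$, even though the connection coefficients themselves do not vanish.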
One "so special" aspect of the Riemann tensor which not only makes it useful for geometric analysis but also indicates how it may be intuitively interpreted occurs in the study of geodesic deviation, the way geodesics converge or diverge in a given space. If we let $\gamma(t, s)$ denote a family of geodesics each given by a fixed value of the parameter $s$, then since $\gamma(t, s)$ may be construed as mapping a coordinate neighborhood in $\Bbb R^2$ into $M$, if we set
$\vec v = \dfrac{\partial}{\partial t}_{(t, s)} \tag{19}$
and
$\vec w = \dfrac{\partial}{\partial s}_{(t, s)} \tag{20}$
along the curves $\gamma(t, s)$, we have, as in (1),
$\nabla_{\vec v} \vec v = 0 \tag{21}$
for any value of $s$. This means we may write
$\nabla_{\vec w} \nabla_{\vec v} \vec v = 0; \tag{22}$
the order of the covariant derivatives may be reversed using (18), recalling that $[\vec v, \vec w] = 0$ since these vector fields are tangent to coordinate lines, hence the term $\nabla_{[\vec v, \vec w] } \vec v = 0$ as well; thus
$\nabla_{\vec v} \nabla_{\vec w} \vec v + R(\vec w, \vec v, \vec v) = 0; \tag{23}$
furthermore, using the property
$\nabla_{\vec v} \vec w - \nabla_{\vec w} \vec v = [\vec v, \vec w], \tag{24}$
together with $[\vec v, \vec w] = 0$, so that $\nabla_{\vec w} \vec v = \nabla_{\vec v} \vec w$, we may write (23) as the second-order system
$\nabla_{\vec v} \nabla_{\vec v} \vec w + R(\vec w, \vec v, \vec v) = 0; \tag{25}$
this equation, expressing as it does the rate of change of $\vec w = (\partial/\partial s)$ along the geodesics $\gamma(t, s)$ with tangent vectors $(\partial/\partial t)$, shows how, formally, the curvature tensor $R(\cdot, \cdot, \cdot)$ measures the rate of separation of geodesic curves, and hence how it is intimately related to the intuitive notions of local "straight lines" which occur both in Riemannian and Lorentzian geometries.
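The geodesic deviation picture can be checked numerically on the unit sphere, where the sectional curvature is $K = 1$ and the deviation equation for the separation $w(t)$ between neighbouring unit-speed geodesics reduces to $w'' + K w = 0$. The sketch below is plain Python with made-up function names. It assumes the family of geodesics is the set of meridians leaving the equator at $t = 0$, so that $t$ is the latitude and $s$ the longitude, and it measures the separation directly from the embedding in $\Bbb R^3$ rather than by solving any differential equation.

```python
import math


def point_on_sphere(lat, lon):
    """Embed a point of the unit sphere in R^3 from latitude and longitude."""
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def separation(t, eps):
    """Great-circle distance at parameter t between the meridians lon=0, lon=eps.

    The distance is recovered from the straight-line chord, which is
    numerically stable for small eps.
    """
    chord = math.dist(point_on_sphere(t, 0.0), point_on_sphere(t, eps))
    return 2.0 * math.asin(chord / 2.0)


def jacobi_residual(t, eps=1e-3, h=1e-2):
    """Relative residual (w'' + K w) / w with K = 1, via a central difference."""
    w_minus, w_mid, w_plus = (separation(t + d, eps) for d in (-h, 0.0, h))
    w_second = (w_plus - 2.0 * w_mid + w_minus) / (h * h)
    return (w_second + w_mid) / w_mid


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0):
        # separation / eps should track cos(t); the residual should be small
        print(t, separation(t, 1e-3) / 1e-3, math.cos(t), jacobi_residual(t))
```

The normalized separation follows $\cos t$: the meridians start out parallel at the equator and are pulled together by the positive curvature until they meet at the pole, whereas in flat space the same initial data would keep $w$ constant.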
It seems to me that the question is not simply technical, but is about the concept of ''curvature'' in a Lorentzian manifold. This concept is not so simple to define and the only intuition that I have comes from my training in physics. So I start from a physical point of view about this concept, using the intuition proposed in the classical book ''Gravitation'' by Misner, Thorne and Wheeler.
Here I sum up the content of its section 8.7.
The starting point is that:
" curvature shows up in the deviation of one geodesic from a nearby geodesic"
Now let $P(\lambda, n)$ be a family of geodesics with affine parameter $\lambda$. For a geodesic of fixed $n$ the tangent vector is $\vec u=\dfrac{\partial P}{\partial \lambda}$ and the separation between two points with the same $\lambda$ on two geodesics is $\vec n= \dfrac{\partial P}{\partial n}$.
So, an observer freely falling along the geodesic $n=0$ observes a test particle falling on a geodesic $n$ as moving with a relative acceleration $\mathbf{R}=\nabla _{\vec u} \nabla _{\vec u} \,\vec n$.
Note that this acceleration is a tensor and is $0$ if the spacetime is flat, but if it is not flat then nearby geodesics can ''diverge'' and $\mathbf{R} \ne 0$.
So we can take $\mathbf{R}$ as a ''measure'' of how the spacetime is not flat.
This ''intuition'' singles out the tensor $\mathbf{R}$ as the best way to represent the curvature of the spacetime and can be used as a definition of the Riemann tensor; expressing this tensor in components, we find the expression in the OP (see chapter 11 of the same book).