Linear Dependence Lemma

I. Why does (a) need to be justified?

Because every claim made needs to be justified. The essence of (a) is that if there is a linear dependence, then there is a specific index $j$ such that the blame for the linear dependence can be given to $v_j$, for failing to be independent of its predecessors. A different way to prove this is to consider the list of sequences $()$, $(v_1)$, $(v_1,v_2)$, $\ldots$, $(v_1,v_2,\ldots,v_m)$; since the first one is linearly independent and the last one is linearly dependent (by assumption), there must be a first one in the list that is linearly dependent; if this is $(v_1,\ldots,v_j)$, then one can show $v_j\in span(v_1,\ldots,v_{j-1})$. The argument is similar to the proof given in the text (a linear dependence relation among $(v_1,\ldots,v_j)$ must involve $v_j$ with a nonzero coefficient). An advantage of this approach is that it does not depend essentially on any choice (the linear dependence relation taken is unique up to a scalar multiple), and it puts the blame on the very first $v_j$ that causes linear dependence.
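The "first dependent prefix" idea above can be illustrated numerically. This is my own sketch, not from the text: it uses matrix rank as a proxy for linear dependence, with `first_dependent_index` a hypothetical helper name.

```python
# Illustration (not from the text): find the first prefix (v_1, ..., v_j)
# of a list of vectors that is linearly dependent, detected via matrix rank.
import numpy as np

def first_dependent_index(vectors):
    """Return the smallest j (1-based) such that (v_1, ..., v_j) is
    linearly dependent, or None if the whole list is independent."""
    for j in range(1, len(vectors) + 1):
        prefix = np.column_stack(vectors[:j])
        # A list of j vectors is independent exactly when the matrix
        # having them as columns has rank j.
        if np.linalg.matrix_rank(prefix) < j:
            return j
    return None

# Example: v3 = v1 + v2, so (v1, v2, v3) is the first dependent prefix.
v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 0.0])
v3 = v1 + v2
print(first_dependent_index([v1, v2, v3]))  # 3
```

Since the empty list is independent and the full list is dependent by assumption, the loop is guaranteed to return some $j$, matching the argument above.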

II. My assumption is that (b) basically means that if we remove this extra vector, then we still have the same list of linear combinations.

More precisely, the removal of $v_j$ does not affect the set of vectors that can be written as linear combinations, even though the linear combinations themselves look different (since $v_j$ is no longer allowed to appear).
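Part (b) can also be checked numerically. The following is my own sanity check with made-up vectors, not from the text: since the reduced list's columns are a subset of the full list's, equal rank forces equal span.

```python
# Sanity check (my own illustration): removing a vector that lies in the
# span of its predecessors leaves the span unchanged; here we compare the
# ranks of the spanning matrices before and after removal.
import numpy as np

v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 0.0])
v3 = 2 * v1 - v2                    # v3 is in span(v1, v2)
v4 = np.array([0.0, 0.0, 1.0])

full = np.column_stack([v1, v2, v3, v4])
reduced = np.column_stack([v1, v2, v4])   # v3 removed

rank_full = np.linalg.matrix_rank(full)
rank_reduced = np.linalg.matrix_rank(reduced)
print(rank_full, rank_reduced)  # 3 3 -- same span
```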

III. I will fill in... $v_{j} \in span(v_{1},\ldots,v_{j-1})$ for some $j \in \{2,\ldots,m\}$, where $a_{1}v_{1} + \cdots + a_{j}v_{j} = 0$. Here I just solved for $v_j$, and got the result $v_j = -\frac{a_1}{a_j}v_1 - \cdots - \frac{a_{j-1}}{a_j}v_{j-1},$ which corresponds to the above. $a_{j} \neq 0$ because we have $a_j^{-1}$ for each term,

You've got that a bit backwards. You could only solve for $v_j$ under the assumption $a_j\neq0$; you cannot conclude that from the fact that you just solved for $v_j$.

and $v_1 \neq 0$ because if we have $a_{1}v_{1}+\cdots+a_{j}v_{j} = 0$, then all the scalars $a_{2},\ldots,a_{m} \in \mathbb{F}$ could be equal to $0$.

This is doubly beside the point. $v_1\neq0$ is given, you don't need to prove that. On the other hand there is nothing absurd in the scalars $a_{2},\ldots,a_{m} \in \mathbb{F}$ all being equal to $0$, except that the text had just proved it is not the case using the fact that $v_1\neq0$. But the author could have avoided the whole mess about $v_1$ by observing that $span()=\{0\}$.

I think I have an idea, but how exactly does this prove that $v_j$ is contained in the span of $(v_{1},\ldots,v_{j-1})$? Is it because $-\frac{a_1}{a_j}v_1 - \cdots - \frac{a_{j-1}}{a_j}v_{j-1}$ is just a linear combination of vectors that is equal to $v_j$?

Precisely.

In the equation above, we replace $v_j$ with the right side of 2.5, which shows that $u$ is in the span of the list obtained by removing the $j^{th}$ term from $(v_1,\ldots,v_m)$. Thus (b) holds. $\Box$

IV. So how exactly does this work?

If you write down a linear combination of $v_1,\ldots,v_m$ it contains a single occurrence of $v_j$. If you replace that occurrence (within parentheses, as it gets multiplied by a scalar) by the right hand side of 2.5, then there is no longer any occurrence of $v_j$. You don't directly get a linear combination of the remaining $v_i$, but once you work out the multiplication and collect like terms, you do get such a linear combination. For instance if $v_3=-5v_1+\frac32v_2$ then $$\begin{align} av_1+bv_2+cv_3+dv_4 &= av_1+bv_2+c(-5v_1+\frac32v_2)+dv_4 \\&= av_1+bv_2-5cv_1+\frac32cv_2+dv_4 \\&= (a-5c)v_1+(b+\frac32c)v_2+dv_4. \end{align} $$
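The substitute-and-collect computation above can be verified numerically. This is my own check with arbitrary concrete scalars and random vectors (not from the text):

```python
# Numeric check of the worked example: if v3 = -5*v1 + (3/2)*v2, then
# a*v1 + b*v2 + c*v3 + d*v4 equals (a - 5c)*v1 + (b + (3/2)c)*v2 + d*v4.
import numpy as np

rng = np.random.default_rng(0)
v1, v2, v4 = rng.standard_normal((3, 4))  # three random vectors in R^4
v3 = -5 * v1 + 1.5 * v2                   # the dependence relation 2.5

a, b, c, d = 2.0, -1.0, 3.0, 0.5
lhs = a * v1 + b * v2 + c * v3 + d * v4
rhs = (a - 5 * c) * v1 + (b + 1.5 * c) * v2 + d * v4
print(np.allclose(lhs, rhs))  # True
```

The right-hand side contains no occurrence of $v_3$, so the same vector is exhibited as a linear combination of the shortened list, which is exactly what (b) requires.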