Why is a dual space a vector space?
Let's go back further:
Let $\mathbf{V}$ and $\mathbf{W}$ be any two vector spaces over the same field $\mathbf{F}$. Let $\mathcal{L}(\mathbf{V},\mathbf{W})$ be the set of linear transformations $T\colon \mathbf{V}\to\mathbf{W}$.
We will make $\mathcal{L}(\mathbf{V},\mathbf{W})$ into a vector space over $\mathbf{F}$. In order to do this, we need to define an "addition of linear transformations" and a "scalar multiplication of linear transformations by elements of $\mathbf{F}$". That is, our "vectors" will be linear transformations from $\mathbf{V}$ to $\mathbf{W}$. Remember that a vector space is just a set with a "vector addition" and a "scalar multiplication" that satisfy certain properties, and we call the elements of the set "vectors"; they don't have to be "tuples" in the usual sense.
So, given two linear transformations $T,U\colon \mathbf{V}\to\mathbf{W}$, we need to define a new linear transformation that is called the "sum of $T$ and $U$". I'm going to write this as $T\oplus U$, to distinguish the "sum of linear transformations" from the sum of vectors. Since we want $T\oplus U$ to be a linear transformation (which is a special kind of function) from $\mathbf{V}$ to $\mathbf{W}$, in order to specify it we need to say what the value of $T\oplus U$ is at every $\mathbf{v}\in \mathbf{V}$. My definition is: $$(T\oplus U)(\mathbf{v}) = T(\mathbf{v}) + U(\mathbf{v}),$$ where the sum on the right is taking place in $\mathbf{W}$. This makes sense, because $T$ and $U$ are already functions from $\mathbf{V}$ to $\mathbf{W}$, so $T(\mathbf{v})$ and $U(\mathbf{v})$ are vectors in $\mathbf{W}$, which we can add.
Is $T\oplus U$ a linear transformation from $\mathbf{V}$ to $\mathbf{W}$? First, it is a function from $\mathbf{V}$ to $\mathbf{W}$. Now, to check that it is a linear transformation, we need to check that for all $\mathbf{v}_1,\mathbf{v}_2\in\mathbf{V}$ and all $\alpha\in \mathbf{F}$, we have $$(T\oplus U)(\mathbf{v}_1+\mathbf{v}_2) = (T\oplus U)(\mathbf{v}_1)+(T\oplus U)(\mathbf{v}_2)\quad\text{and}\quad (T\oplus U)(\alpha\mathbf{v}_1) = \alpha((T\oplus U)(\mathbf{v}_1)).$$ Indeed, since $T$ and $U$ are themselves linear transformations, we have: $$\begin{align*} (T\oplus U)(\mathbf{v}_1+\mathbf{v}_2) &= T(\mathbf{v}_1+\mathbf{v}_2) + U(\mathbf{v}_1+\mathbf{v}_2) &\text{(by definition of }T\oplus U\text{)}\\ &= T(\mathbf{v}_1)+T(\mathbf{v}_2) + U(\mathbf{v}_1)+U(\mathbf{v}_2) &\text{(by linearity of }T\text{ and }U\text{)}\\ &= T(\mathbf{v}_1)+U(\mathbf{v}_1) + T(\mathbf{v}_2)+U(\mathbf{v}_2)\\ &= (T\oplus U)(\mathbf{v}_1) + (T\oplus U)(\mathbf{v}_2) &\text{(by definition of }T\oplus U\text{)}\\ (T\oplus U)(\alpha\mathbf{v}_1) &= T(\alpha\mathbf{v}_1) + U(\alpha\mathbf{v}_1) &\text{(by definition of }T\oplus U\text{)}\\ &= \alpha T(\mathbf{v}_1) + \alpha U(\mathbf{v}_1) &\text{(by linearity of }T\text{ and }U\text{)}\\ &= \alpha(T(\mathbf{v}_1) + U(\mathbf{v}_1))\\ &= \alpha((T\oplus U)(\mathbf{v}_1)) &\text{(by definition of }T\oplus U\text{)} \end{align*}$$ so $T\oplus U$ is indeed an element of $\mathcal{L}(\mathbf{V},\mathbf{W})$.
I'll let you verify that $(S\oplus T)\oplus U = S\oplus (T\oplus U)$ for all $S,T,U\in\mathcal{L}(\mathbf{V},\mathbf{W})$ (since this is an equality of functions, you need to check that both sides have the same value at every $\mathbf{v}\in \mathbf{V}$); that $T\oplus U=U\oplus T$ for all $T,U\in\mathcal{L}(\mathbf{V},\mathbf{W})$; that if $\mathbf{0}$ is the linear transformation that sends every $\mathbf{v}\in\mathbf{V}$ to $\mathbf{0}\in\mathbf{W}$, then $T\oplus\mathbf{0}=T$ for all $T$; and that if, given $T\in\mathcal{L}(\mathbf{V},\mathbf{W})$, we define $-T$ to be the function $(-T)(\mathbf{v}) = -(T(\mathbf{v}))$, then $-T$ is again a linear transformation and $T\oplus (-T) = \mathbf{0}$.
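As a sample of how these verifications go, here is a sketch of the commutativity check; the others should follow the same pattern. The only ingredients are the definition of $\oplus$ and the fact that addition in $\mathbf{W}$ is commutative. For any $\mathbf{v}\in\mathbf{V}$: $$\begin{align*} (T\oplus U)(\mathbf{v}) &= T(\mathbf{v}) + U(\mathbf{v}) &\text{(by definition of }T\oplus U\text{)}\\ &= U(\mathbf{v}) + T(\mathbf{v}) &\text{(addition in }\mathbf{W}\text{ is commutative)}\\ &= (U\oplus T)(\mathbf{v}) &\text{(by definition of }U\oplus T\text{)} \end{align*}$$ Since the two functions agree at every $\mathbf{v}\in\mathbf{V}$, they are equal: $T\oplus U = U\oplus T$.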
Now we define a scalar multiplication, which I will denote by $\odot$ (again, to avoid confusion with the scalar multiplication from $\mathbf{V}$ and $\mathbf{W}$). Given $T\colon \mathbf{V}\to\mathbf{W}$ and $\alpha\in\mathbf{F}$, define $(\alpha\odot T)$ to be the function $$(\alpha\odot T)(\mathbf{v}) = \alpha T(\mathbf{v}).$$ I will let you verify that this definition works, in that $\alpha\odot T$ is a linear transformation when $T$ is a linear transformation; and that it satisfies the necessary properties:
- $\alpha\odot(\beta\odot T) = (\alpha\beta)\odot T$;
- $1\odot T = T$;
- $(\alpha + \beta)\odot T = (\alpha\odot T)\oplus (\beta\odot T)$;
- $\alpha\odot(T\oplus U) = (\alpha\odot T)\oplus (\alpha\odot U)$.
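To give an idea of what the verification looks like, here is one way the claim that $\alpha\odot T$ is linear might be written out (a sketch; $\beta\in\mathbf{F}$ is an arbitrary scalar introduced for the check). Note that the last part uses the fact that multiplication in the field $\mathbf{F}$ is commutative. $$\begin{align*} (\alpha\odot T)(\mathbf{v}_1+\mathbf{v}_2) &= \alpha T(\mathbf{v}_1+\mathbf{v}_2) &\text{(by definition of }\alpha\odot T\text{)}\\ &= \alpha(T(\mathbf{v}_1)+T(\mathbf{v}_2)) &\text{(by linearity of }T\text{)}\\ &= \alpha T(\mathbf{v}_1)+\alpha T(\mathbf{v}_2)\\ &= (\alpha\odot T)(\mathbf{v}_1) + (\alpha\odot T)(\mathbf{v}_2) &\text{(by definition of }\alpha\odot T\text{)}\\ (\alpha\odot T)(\beta\mathbf{v}_1) &= \alpha T(\beta\mathbf{v}_1) &\text{(by definition of }\alpha\odot T\text{)}\\ &= \alpha(\beta T(\mathbf{v}_1)) &\text{(by linearity of }T\text{)}\\ &= (\alpha\beta) T(\mathbf{v}_1) = (\beta\alpha)T(\mathbf{v}_1) &\text{(multiplication in }\mathbf{F}\text{ is commutative)}\\ &= \beta(\alpha T(\mathbf{v}_1))\\ &= \beta((\alpha\odot T)(\mathbf{v}_1)) &\text{(by definition of }\alpha\odot T\text{)} \end{align*}$$ The four listed properties are checked in the same pointwise fashion, by evaluating both sides at an arbitrary $\mathbf{v}\in\mathbf{V}$.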
So $(\mathcal{L}(\mathbf{V},\mathbf{W}),\oplus,\odot)$ is a vector space over $\mathbf{F}$ whenever $\mathbf{V}$ and $\mathbf{W}$ are vector spaces over $\mathbf{F}$.
So now, dual spaces: Note that $\mathbf{F}$ is always a vector space over itself, by defining vector addition to be the same as the addition of $\mathbf{F}$, and scalar multiplication to be the same as multiplication in $\mathbf{F}$.
So if $\mathbf{V}$ is any vector space over $\mathbf{F}$, then we can consider $\mathcal{L}(\mathbf{V},\mathbf{F})$: this makes sense, because both $\mathbf{V}$ and $\mathbf{F}$ are vector spaces over $\mathbf{F}$; and this is itself a vector space over $\mathbf{F}$ with vector addition $\oplus$ and scalar multiplication $\odot$ as defined above.
This vector space, $\mathcal{L}(\mathbf{V},\mathbf{F})$, is called the dual space of $\mathbf{V}$. We write $\mathbf{V}^*$ instead of $\mathcal{L}(\mathbf{V},\mathbf{F})$, and the elements of $\mathbf{V}^*$ are called "functionals".
By abuse of notation, we usually write $+$ instead of $\oplus$ (just like we use the same symbol for the addition of $\mathbf{V}$ and the addition of $\mathbf{W}$), and $\cdot$ or just juxtaposition instead of $\odot$.
The equation you have, $$y(\alpha_1 x_1 + \alpha_2x_2) = \alpha_1y(x_1) + \alpha_2y(x_2)$$ is just telling you that the function $y$ is a linear transformation from $\mathbf{V}$ to $\mathbf{F}$.
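For a concrete instance (the space $\mathbf{F}^2$ and the fixed scalars $c_1,c_2\in\mathbf{F}$ are chosen here purely for illustration), take $\mathbf{V}=\mathbf{F}^2$ and define $y(a,b) = c_1a + c_2b$. Writing $x_1=(a_1,b_1)$ and $x_2=(a_2,b_2)$, the equation can be checked directly: $$\begin{align*} y(\alpha_1x_1+\alpha_2x_2) &= y(\alpha_1a_1+\alpha_2a_2,\ \alpha_1b_1+\alpha_2b_2)\\ &= c_1(\alpha_1a_1+\alpha_2a_2) + c_2(\alpha_1b_1+\alpha_2b_2)\\ &= \alpha_1(c_1a_1+c_2b_1) + \alpha_2(c_1a_2+c_2b_2)\\ &= \alpha_1y(x_1) + \alpha_2y(x_2), \end{align*}$$ so this $y$ is an element of $(\mathbf{F}^2)^*$.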
It is traditional to use boldface lower case letters like $\mathbf{f}$, $\mathbf{g}$, $\mathbf{h}$ to represent functionals. This is to remind us that even though they are vectors in the vector space $\mathbf{V}^*$, they are "really" functions (when they are at home).
In fact, you could go back even further. If $\mathbf{W}$ is a vector space over $\mathbf{F}$, and $X$ is any set, then we can look at $$\mathcal{F}(X,\mathbf{W}) = \{f\colon X\to\mathbf{W}\mid f\text{ is a function}\}.$$ Then $\mathcal{F}(X,\mathbf{W})$ is a vector space, with addition $(f\oplus g)(x) = f(x)+g(x)$ and scalar multiplication $(\alpha\odot f)(x) = \alpha f(x)$. The case of $\mathcal{L}(\mathbf{V},\mathbf{W})$ corresponds to looking at a subspace of $\mathcal{F}(\mathbf{V},\mathbf{W})$ consisting of linear transformations.
This is a standard construction in abstract algebra. Whenever $A$ is an algebra (in the sense of General Algebra; a group, semigroup, ring, vector space, lattice, etc.), and $X$ is a set, the collection of all functions $f\colon X\to A$ becomes an algebra of the same type under "pointwise operations". In fact, this is nothing more than a "direct power" (a direct product in which every factor is the same) indexed by $X$.
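To see the direct power in a small case (the two-element set $X=\{1,2\}$ and the vector space $\mathbf{W}$ are assumptions made just for this illustration): a function $f\colon X\to\mathbf{W}$ is completely determined by the pair $(f(1),f(2))$, and under this correspondence the pointwise operations become the usual componentwise ones, $$f\oplus g \longleftrightarrow \bigl(f(1)+g(1),\ f(2)+g(2)\bigr),\qquad \alpha\odot f\longleftrightarrow \bigl(\alpha f(1),\ \alpha f(2)\bigr),$$ so $\mathcal{F}(X,\mathbf{W})$ is just $\mathbf{W}\times\mathbf{W}$ in disguise.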