Solution 1:

The group law on an elliptic curve was not discovered in a vacuum. It came up in the context of abelian integrals.

Let $E$ be an elliptic curve given by $y^2 = f(x)$, where $f(x)$ is a cubic in $x$.

Elliptic integrals are integrals of the form $$\int_{a}^x dx/y = \int_a^x \frac{d x}{\sqrt{ f(x)}}.$$ (Here $a$ is some fixed base-point.) They come up (in a slightly transformed manner) when computing the arclength of an ellipse (whence their name).

It was realized at some point (in the 1600s or 1700s), at least in special cases, that certain substitutions in $x$ double the value of the integral, and that for certain substitutions of the form $x = \phi(x_1,x_2)$, the integral you compute is the sum of the individual integrals for $x_1$ and $x_2$.
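
One classical special case can be checked numerically. The sketch below assumes the lemniscatic integral $\int_0^x dt/\sqrt{1-t^4}$ (here $f$ is a quartic rather than a cubic, though the curve $y^2 = 1 - x^4$ still has genus one) together with Euler's addition formula for it. The function names and the sample values $0.3$ and $0.4$ are illustrative choices, not something taken from the discussion above.

```python
import math

def lemniscate_integral(x, n=2000):
    """Approximate the integral of dt / sqrt(1 - t^4) from 0 to x by Simpson's rule (n even)."""
    h = x / n
    f = lambda t: 1.0 / math.sqrt(1.0 - t**4)
    total = f(0.0) + f(x)
    for k in range(1, n):
        # Simpson weights: 4 at odd nodes, 2 at even interior nodes
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3.0

def euler_phi(x1, x2):
    """Euler's addition formula for the lemniscatic integral (assumed; small positive x1, x2)."""
    num = x1 * math.sqrt(1 - x2**4) + x2 * math.sqrt(1 - x1**4)
    return num / (1 + x1**2 * x2**2)

x1, x2 = 0.3, 0.4
lhs = lemniscate_integral(x1) + lemniscate_integral(x2)  # sum of the two integrals
rhs = lemniscate_integral(euler_phi(x1, x2))             # single integral up to phi(x1, x2)
print(abs(lhs - rhs))  # the two sides agree up to quadrature error
```

Taking $x_1 = x_2$ in `euler_phi` gives a doubling substitution of the kind mentioned above.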

Real understanding came from the work of Abel and Jacobi. (Unfortunately I don't know the precise history or attributions here.)

What they realized (in modern terms) is that, if we fix the base-point $a$ and let the upper limit $x$ vary, then the elliptic integral gives a multi-valued map from the elliptic curve $E$ to $\mathbb C$, and that the formula $\phi(x_1,x_2)$ mentioned above gives a way to add points on the elliptic curve, so that this map is a (multi-valued) group homomorphism.

Taking the inverse of this multi-valued map gives a single-valued map (which is how we are more used to thinking about it) $$\mathbb C \to E,$$ which is a homomorphism when we give $\mathbb C$ its additive structure and $E$ the group law coming from $(x_1,x_2)\mapsto \phi(x_1,x_2)$. The kernel of this map turns out to be a lattice $\Lambda$, so that we get an isomorphism $\mathbb C/\Lambda \cong E.$

The formula $\phi(x_1,x_2)$ turns out to be precisely the formula describing addition on the elliptic curve via chords and tangents, and there are lots of theoretical explanations for it, as you can find on the linked MO page.
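
For concreteness, the chord-and-tangent formula can be sketched in exact rational arithmetic. This is a minimal illustration assuming a short Weierstrass form $y^2 = x^3 + Ax + B$ with the point at infinity as identity. The function name `add_points`, the sample curve $y^2 = x^3 + 17$, and the sample points are choices made here, not taken from the answer.

```python
from fractions import Fraction

def add_points(P, Q, A):
    """Chord-and-tangent sum on y^2 = x^3 + A*x + B; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and y1 == -y2:
        return None                            # vertical line: P + Q is the identity
    if P == Q:
        slope = (3 * x1 * x1 + A) / (2 * y1)   # tangent line at P
    else:
        slope = (y2 - y1) / (x2 - x1)          # chord through P and Q
    x3 = slope * slope - x1 - x2               # x-coordinate of the third intersection point
    y3 = slope * (x1 - x3) - y1                # reflect the third point in the x-axis
    return (x3, y3)

# Sample curve y^2 = x^3 + 17 (A = 0) with three rational points on it.
A = Fraction(0)
P = (Fraction(-2), Fraction(3))
Q = (Fraction(-1), Fraction(4))
R = (Fraction(2), Fraction(5))

left = add_points(add_points(P, Q, A), R, A)   # (P + Q) + R
right = add_points(P, add_points(Q, R, A), A)  # P + (Q + R)
print(left == right)
```

Checking one triple like this does not prove associativity, of course; it only illustrates the identity that the theoretical explanations account for.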

For a curve $C$ of higher genus $g$, it turns out that there is not just the one holomorphic differential $dx/y$, but $g$ linearly independent ones, say $\omega_1, \ldots,\omega_g$. Furthermore, if we fix a particular differential $\omega_i$, then there is no formula $\phi(x_1,x_2)$ such that the sum of the integrals of $\omega_i$ for $x_1$ and $x_2$ is equal to the integral of $\omega_i$ for $\phi(x_1,x_2)$.

However, what Abel and Jacobi found is that, if we consider the map $$(x_1,\ldots,x_g) \mapsto (\sum_{i = 1}^g \int_a^{x_i} \omega_1, \ldots,\sum_{i = 1}^g \int_a^{x_i} \omega_g),$$ which gives a multi-valued map $$Sym^g C \to \mathbb C^g$$ (here $Sym^g C$ denotes the $g$th symmetric power of $C$, so it is the product of $g$ copies of $C$, modulo the action of the symmetric group on the $g$ factors), then we can find a formula $\phi(x_{1,1},\ldots,x_{1,g},x_{2,1},\ldots,x_{2,g})$ such that $\phi$ defines a group law on $Sym^g C$ (at least generically) and such that this map is a homomorphism.
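
As a concrete illustration (an assumed example, not a case singled out above), take $C$ to be a genus-$2$ hyperelliptic curve $y^2 = f(x)$ with $f$ a squarefree polynomial of degree $5$. One standard basis of holomorphic differentials is then $\omega_1 = dx/y$, $\omega_2 = x\,dx/y$. Writing $P_1, P_2$ for the two points of $C$ (called $x_1, x_2$ above), the map reads $$\{P_1,P_2\} \mapsto \left(\int_a^{P_1}\frac{dx}{y} + \int_a^{P_2}\frac{dx}{y},\ \int_a^{P_1}\frac{x\,dx}{y} + \int_a^{P_2}\frac{x\,dx}{y}\right) \in \mathbb C^2,$$ so it is unordered pairs of points, rather than single points, that get added.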

Again, this map becomes well-defined and (generically) single-valued if we quotient out the target by an appropriate lattice $\Lambda$ (the period lattice), to get a birational map $$Sym^g C \to \mathbb C^g /\Lambda.$$

The target here is called the Jacobian of $C$, and can be identified with $Pic^0(C)$.

In summary, to generalize the addition law on an elliptic curve to higher genus curves, you have to consider unordered $g$-tuples of points (where $g$ is the genus), and add those (not just individual points).

(The relationship with $Pic^0(C)$ is that if $x_1,\ldots,x_g$ are $g$ points on $C$, then $x_1 + \ldots + x_g - g a$ is a degree zero divisor on $C$, and every degree zero divisor is linearly equivalent to a divisor of this form, and, generically, to a unique such divisor.)

Solution 2:

I agree that from the chord-and-tangent definition of the group law it is far from clear why associativity should hold. This definition is the one presented in Silverman and Tate because all the other ways you might define the group law require much more background; however, these are the definitions from which associativity makes more sense.

In one approach an elliptic curve is a group of the form $\mathbb{C}/\Lambda$ where $\Lambda$ is a lattice, i.e. a discrete subgroup of $\mathbb{C}$ isomorphic to $\mathbb{Z}^2$. (The motivation for looking at such quotients comes basically from the uniformization theorem.) Such a quotient forms a compact Riemann surface, topologically a torus, hence it admits no nonconstant holomorphic functions; therefore we would like to describe the meromorphic functions. A natural way to do this is to write down a function and then try to average over $\Lambda$, which (if you do it right) will lead you to the definition of the Weierstrass elliptic function $\wp(z)$. By examining its poles and the poles of its derivative $\wp'(z)$ we can deduce the differential equation

$$\wp'(z)^2 = 4 \wp(z)^3 - g_2 \wp(z) - g_3$$

where $g_2, g_3$ are certain constants associated to $\Lambda$. In fact the map from $\mathbb{C}/\Lambda$ to the elliptic curve $y^2 = 4x^3 - g_2 x - g_3$ given by $z \mapsto (\wp(z), \wp'(z))$ is an isomorphism. This forms the connection between the complex-analytic picture and the algebraic picture in terms of Weierstrass normal forms. The group law here is just regular addition, i.e. the sum of the points $(\wp(a), \wp'(a))$ and $(\wp(b), \wp'(b))$ is just $(\wp(a+b), \wp'(a+b))$, and so it is obvious that associativity holds. The fact that if $a + b + c = 0$ then the corresponding points are collinear is then a relatively simple computation.
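
The collinearity statement can be turned into an explicit formula. As a sketch, assume $a \not\equiv \pm b \pmod{\Lambda}$ and write $y = mx + c_0$ for the line through $(\wp(a),\wp'(a))$ and $(\wp(b),\wp'(b))$ (the names $m$ and $c_0$ are introduced only here), so that $m = \frac{\wp'(a)-\wp'(b)}{\wp(a)-\wp(b)}$. Substituting into $y^2 = 4x^3 - g_2 x - g_3$ gives a cubic $4x^3 - m^2x^2 - \cdots = 0$ whose three roots sum to $m^2/4$. Granting that the third intersection point is $(\wp(c),\wp'(c))$ with $c = -a-b$, and using that $\wp$ is even, this yields the classical addition formula $$\wp(a+b) = \frac{1}{4}\left(\frac{\wp'(a)-\wp'(b)}{\wp(a)-\wp(b)}\right)^2 - \wp(a) - \wp(b).$$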

(Historically, this is not how the first group laws on elliptic curves were discovered. For the elliptic-function perspective I recommend Stevenhagen's notes on the subject.)

In another approach an elliptic curve (over $\mathbb{C}$, for simplicity) is a smooth projective curve of genus $1$. The category of smooth projective curves over $\mathbb{C}$ is known to be equivalent to the category of compact Riemann surfaces, which is part of the connection between this definition and the above definition. To any smooth projective curve $C$ one can associate its Jacobian variety, a group of the form $\mathbb{C}^g/\Lambda$, where $g$ is the genus, which arises naturally when one thinks about integration over such a curve. The Abel-Jacobi theorem asserts that the Jacobian variety of a curve is isomorphic to the degree-zero part of its Picard group (the group of degree-zero divisors modulo principal divisors), and for a curve of genus $1$ the points of this group can be put into natural bijection with the curve itself (once one chooses an identity), so the group law on the Jacobian variety (which is perfectly natural) then gives a group law on the original curve (again once one chooses an identity). The connection to collinear points comes from the fact that lines determine principal divisors, which are zero in the Picard group.
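
To spell out that last step, here is a sketch under the assumption that $E$ is a plane cubic in Weierstrass form with the point at infinity $O$ chosen as identity (the names $\ell$, $P$, $Q$, $R$ are introduced only for this illustration). If a line $\ell = 0$ meets $E$ in the points $P, Q, R$ (counted with multiplicity), and $z = 0$ is the line at infinity, which meets $E$ only at $O$ with multiplicity $3$, then the rational function $\ell/z$ has divisor $$\operatorname{div}(\ell/z) = P + Q + R - 3O = (P - O) + (Q - O) + (R - O),$$ so the classes of $P - O$, $Q - O$ and $R - O$ sum to zero in the Picard group. Under the bijection $P \mapsto [P - O]$, this says that three collinear points sum to the identity.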

As for your last question, elliptic curves are, as it turns out, the only projective curves which can be given the structure of algebraic groups; see the Wikipedia article on abelian varieties.