Optimum solution to a linear programming problem
Solution 1:
In the two-dimensional case, a linear optimization (linear programming) problem is specified as follows: find the values $(x, y)$ such that the goal function $$g(x, y) = a x + b y \;\;\; (Eq. 1)$$ is maximized (or minimized) subject to the linear inequalities $$a_1 x + b_1 y + c_1 \ge 0 \;\; (\text{or} \le 0) $$ $$a_2 x + b_2 y + c_2 \ge 0 \;\; (\text{or} \le 0) $$ $$ \vdots $$
Each of these linear inequalities defines a half-plane bounded by the line obtained by replacing the inequality with equality. The solution $(x, y)$ that maximizes the goal function must lie in the intersection of all these half-planes, which is a convex polygon (possibly unbounded). This polygon is called the feasible region. Let the value of the goal function at a point $(x, y)$ of the feasible region be $m$: $$g(x, y) = a x + b y = m \;\;\; (Eq. 2)$$ The value $m$ of the goal function does not change when we move $(x, y)$ along the line defined by (Eq. 2), but it increases when we increase $m$, which shifts the line to a new line parallel to (Eq. 2). We can keep shifting as long as the line still contains at least one point of the feasible region. We conclude that the maximum of the goal function is achieved at an extreme point of the feasible region, which - for a convex polygon - is a vertex (or an entire edge, when the goal line is parallel to the constraint line through the extreme vertex).
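To make the picture concrete, here is a minimal Python sketch (the numbers are made up for illustration) that solves a small 2-D LP with `scipy.optimize.linprog`; the optimum it reports sits exactly on a vertex of the feasible polygon.

```python
# Minimal sketch with made-up data: maximize g(x, y) = 3x + 2y
# subject to  x + y <= 4,  x + 3y <= 6,  x >= 0,  y >= 0.
# linprog minimizes, so we pass the negated objective.
from scipy.optimize import linprog

c = [-3, -2]                      # coefficients of -g(x, y)
A_ub = [[1, 1], [1, 3]]           # left-hand sides of the <= constraints
b_ub = [4, 6]                     # right-hand sides
bounds = [(0, None), (0, None)]   # x >= 0, y >= 0

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(res.x, -res.fun)            # (4, 0) with g = 12: a vertex of the feasible region
```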
Solution 2:
Jiri's answer gives the intuitive explanation. Formally, the fact that an optimal solution lies at an extreme point is a consequence of the representation theorem for polyhedra and the fact that the feasible region of a linear program is a polyhedron.
The representation theorem says that every point of a polyhedron is a convex combination of its extreme points plus a nonnegative combination of its extreme directions. More formally, if $S$ is a polyhedron, $\{{\bf v}_1, {\bf v}_2, \ldots, {\bf v}_k\}$ is the set of extreme points of $S$, and $\{{\bf d}_1, {\bf d}_2, \ldots, {\bf d}_m\}$ is the set of extreme directions of $S$, then ${\bf x} \in S$ if and only if ${\bf x} = \sum_{i=1}^k \alpha_i {\bf v}_i + \sum_{i=1}^m \beta_i {\bf d}_i$ for some scalars satisfying $\sum_{i=1}^k \alpha_i = 1$ and $\alpha_i, \beta_i \geq 0$.
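As a small added illustration: take the (unbounded) polyhedron $S = \{(x, y) : x \ge 0,\ y \ge 0,\ x + y \ge 1\}$. Its extreme points are $(1, 0)$ and $(0, 1)$, and its extreme directions are $(1, 0)$ and $(0, 1)$. The point $(2, 3) \in S$ then decomposes as $$(2, 3) = 1 \cdot (1, 0) + 0 \cdot (0, 1) + 1 \cdot (1, 0) + 3 \cdot (0, 1),$$ a convex combination of the extreme points plus a nonnegative combination of the extreme directions.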
Then we have
Theorem: If $S$ is the feasible region of some linear program with objective function $z = {\bf c}^T {\bf x}$, then either 1) $z$ attains its optimal value at some vertex of $S$, 2) the linear program is infeasible, or 3) the linear program is unbounded.
Proof: First, assume, without loss of generality, that the LP wants to maximize $z$. If the LP is infeasible, then $S$ is empty. Otherwise, there is at least one point ${\bf x} \in S$. If $S$ has an extreme direction ${\bf d}_i$ such that ${\bf c}^T {\bf d}_i > 0$, then the LP is unbounded, as we can make the value of ${\bf c}^T \left(\sum_{i=1}^k \alpha_i {\bf v}_i + \beta_i {\bf d}_i\right)$ as large as we wish by increasing $\beta_i$. Assume, then, that ${\bf c}^T {\bf d}_i \leq 0$ for each $i$. Let ${\bf v}^*$ be the vertex with the largest objective function value. Then, for any ${\bf x} \in S$, $${\bf c}^T {\bf x} = {\bf c}^T \left(\sum_{i=1}^k \alpha_i {\bf v}_i + \sum_{i=1}^m \beta_i {\bf d}_i \right) = \sum_{i=1}^k \alpha_i ({\bf c}^T {\bf v}_i) + \sum_{i=1}^m \beta_i ({\bf c}^T {\bf d}_i) \leq \sum_{i=1}^k \alpha_i ({\bf c}^T {\bf v}^*) = {\bf c}^T {\bf v}^*,$$ where the inequality uses $\beta_i ({\bf c}^T {\bf d}_i) \leq 0$ and the choice of ${\bf v}^*$, and the last equality uses $\sum_{i=1}^k \alpha_i = 1$. Thus $z$ attains its optimal value at ${\bf v}^*$.
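To see the theorem numerically, here is a rough Python sketch (with made-up data) that enumerates the vertices of a bounded 2-D polyhedron $\{{\bf x} : A{\bf x} \le {\bf b}\}$ by intersecting pairs of constraints and checks that the best vertex agrees with the LP optimum from `scipy.optimize.linprog`.

```python
# Rough numerical check with illustrative data: maximize c^T x over {x : A x <= b}.
import itertools
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 1.0])                        # maximize c^T x
A = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([4.0, 4.0, 0.0, 0.0])

vertices = []
for i, j in itertools.combinations(range(len(b)), 2):
    A_ij = A[[i, j]]
    if abs(np.linalg.det(A_ij)) < 1e-12:        # skip parallel constraint pairs
        continue
    x = np.linalg.solve(A_ij, b[[i, j]])
    if np.all(A @ x <= b + 1e-9):               # keep only feasible intersection points
        vertices.append(x)

best = max(vertices, key=lambda v: c @ v)       # best vertex by objective value
res = linprog(-c, A_ub=A, b_ub=b)               # scipy minimizes, so negate c
print(best, c @ best)                           # [1.333... 1.333...] 2.666...
print(res.x, -res.fun)                          # matches the vertex optimum
```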
Solution 3:
It's a special instance of the following general theorem:
The maximum of a convex function $f$ on a compact convex set $S$ is attained in an extreme point of $S$.
An extreme point of a convex set $S$ is a point in $S$ which does not lie in any open line segment joining two points of $S$. In your case, the "corner points of the graphical solution" are exactly the extreme points of the feasible region. It's easy to see that the feasible region of an LPP is convex. It's not always compact, however, and some LPPs indeed have no optimal solution despite having a nonempty feasible region. The linear objective function is clearly convex. If it is to be minimized instead of maximized, this can be reformulated as maximizing the negated objective function.
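For instance, the convex function $f(x, y) = x^2 + y^2$ on the unit square $[0, 1]^2$ attains its maximum, $2$, at the corner $(1, 1)$. Note that the theorem says nothing about minima of convex functions: the same $f$ on $[-1, 1]^2$ is minimized at the interior point $(0, 0)$.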
I quite like the cited theorem, because it highlights that optimization can lead to problematic results in a large class of situations: a solution on the boundary of the feasible region becomes infeasible under perturbations, so it is not a robust solution. It's also similar to the bang-bang principle in optimal control.