Simple explanation of lagrange multipliers with multiple constraints

Solution 1:

Consider a point $p$ in the common domain $\Omega\subset{\mathbb R}^n$ of $f$ and the constraints $$g_k(x)=0\qquad(1\leq k\leq r)\ .\tag{1}$$ The gradients $\nabla g_k(p)$ define a subspace $U$ of allowed directions when walking away from $p$. In fact a direction $X$ is allowed only if it belongs to the tangent planes of all level surfaces $(1)$. This means that $X$ is perpendicular to all $\nabla g_k(p)$, or is a solution of the homogeneous system of equations $$\nabla g_k(p)\cdot X=0\qquad(1\leq k\leq r)\ .\tag{2}$$

Now comes an important technical condition for the application of Lagrange's method: We have to assume that the $r$ gradients $\nabla g_k(p)$ are linearly independent, i.e. that $p$ is a regular point of the manifold defined by $(1)$. In this case the $\nabla g_k(p)$ span an $r$-dimensional subspace $V$, and the system $(2)$ has full rank. It follows that ${\rm dim}(U)= n-r$. Therefore we not only have $U\subset V^\perp$, but in fact $$U=V^\perp\ .$$

When $\nabla f(p)\cdot X\ne0$ for some allowed direction $X$ then the function $f$ is not conditionally stationary at $p$. For a constrained local extremum of $f$ at $p$ we therefore need $$\nabla f(p)\cdot X=0$$ for all directions $X\in U$, in other words: It is necessary that $$\nabla f(p)\in U^\perp=V\ .\tag{3}$$ When $\nabla f(p)\in V=\langle\nabla g_1(p),\ldots,\nabla g_r(p)\rangle$ then there are numbers $\lambda_k$ $\>(1\leq k\leq r)$ such that $$\nabla f(p)=\sum_{k=1}^r \lambda_k\>\nabla g_k(p)\ .\tag{4}$$ Solving $(4)$ (with $x$ in place of $p$) together with $(1)$ will bring all regular constrained extrema of $f$ to the fore.