Intuition behind functional dependence

What is the intuition behind functional independence?

(This is defined in the following way: Let $k\leq n$. The $C^1$ functions $F_1,\ldots,F_k:\mathbb{R}^n\rightarrow \mathbb{R}$ are functionally independent if the matrix whose columns are the gradients $\nabla F_1,\ldots,\nabla F_k$ has full rank, i.e. rank $k$, on the whole domain of definition.

From what I gather from this answer, this is the same as saying that $F:=(F_1,\ldots,F_k):\mathbb{R}^n\rightarrow \mathbb{R}^k$ is a submersion, but that doesn't help me much either, because I also don't have any intuition concerning submersions.)

So what does it really mean if the functions are functionally independent, or conversely, dependent? Is there, in the latter case, then also a relationship like $g(\nabla F_1,\ldots,\nabla F_k)=0$ -- or maybe like $g(F_1(x),\ldots, F_k (x))=0$ for some $x$ -- for $g$ ranging in some specific set, similar to the case of linear independence (in which $g$ would be from the set $\{g:\mathbb{R}^k\rightarrow \mathbb{R}:g(x_1,\ldots,x_k)=\sum \lambda_i x_i \text{ for some } \lambda_i \in \mathbb{R} \text{, not all zero}\}$)?


Functional dependence of $k$ given functions $F_i:\>\Omega\to{\mathbb R}$ $\>(1\leq i\leq k)$ with common domain $\Omega\subset{\mathbb R}^n$ means, intuitively, that there is a nontrivial function $$g:\quad{\mathbb R}^k\to{\mathbb R},\qquad y\mapsto g(y)$$ such that $$g\bigl(F_1(x),F_2(x),\ldots, F_k(x)\bigr)\equiv 0\qquad \forall x\in\Omega\ .\tag{1}$$ "Nontrivial" for $g$ means that $\nabla g(y)\ne0$ for all $y\in{\mathbb R}^k$. Taking the derivative of $(1)$ and applying the chain rule, we see that $$dg\bigl(F(x)\bigr)\cdot dF(x)\equiv 0\in{\cal L}({\mathbb R}^n,{\mathbb R})\qquad\forall x\in\Omega\ .$$ In terms of matrices this says that the rows of the matrix $\bigl[dF(x)\bigr]$ are linearly dependent with coefficients given by $\nabla g\bigl(F(x)\bigr)\ne0$, for each $x\in\Omega$.

Now the rows of the matrix $\bigl[dF(x)\bigr]$ are nothing else but the gradients $\nabla F_i(x)$. Therefore functional dependence of the $F_i$ in the above sense implies that the gradients $\nabla F_i(x)$ are linearly dependent, at each point $x\in\Omega$.

In your definition of "functional independence" you don't want even a hint of such a thing. Therefore you insist that at all points $x\in\Omega$ the $k$ gradients $\nabla F_i(x)$ should be linearly independent.
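As a concrete illustration of $(1)$, here is a minimal Python sketch. The pair $F_1=x+y$, $F_2=(x+y)^2$ and the relation $g(y_1,y_2)=y_1^2-y_2$ are my own choice of example, not taken from the question. It checks at one sample point that $g(F(x))=0$ and that the rows of $[dF(x)]$ combine to zero with the coefficients $\nabla g(F(x))$.

```python
# Hypothetical example: F1 = x + y, F2 = (x + y)^2, tied by g(y1, y2) = y1^2 - y2.

def F(x, y):
    s = x + y
    return (s, s * s)

def grad_F(x, y):
    # Rows of the matrix [dF(x)]: the gradients of F1 and F2.
    s = x + y
    return ((1.0, 1.0), (2.0 * s, 2.0 * s))

def g(y1, y2):
    return y1 * y1 - y2

def grad_g(y1, y2):
    # The second component is always -1, so grad g never vanishes.
    return (2.0 * y1, -1.0)

def combination(x, y):
    # Linear combination of the rows of [dF(x)] with coefficients grad g(F(x)).
    c1, c2 = grad_g(*F(x, y))
    r1, r2 = grad_F(x, y)
    return (c1 * r1[0] + c2 * r2[0], c1 * r1[1] + c2 * r2[1])

print(g(*F(0.5, -2.0)))        # the relation (1) at a sample point
print(combination(0.5, -2.0))  # the vanishing combination of gradients
```

Any other sample point should behave the same way, since the relation holds identically on the plane.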


Functional independence between two functions means that the level sets of one function intersect the level sets of the other transversely. In the plane, this means that a function $f$ functionally independent of another function $g$ cannot be written as $f=F(g)$ where $F$ is another function, because in this case every level curve of $g$ would be contained in a level curve of $f$, so the two families of curves would be tangent everywhere. Similarly, in $k$ dimensions, functional independence means that the intersection of the $k$ $(k-1)$-dimensional level sets of the $k$ functions is locally a point, not a line or a plane, i.e., the functions $F_1$, $F_2$, ..., $F_k$ carry $k$ independent pieces of information about the ambient space, and can be taken locally as coordinates.
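To make the planar case concrete, here is a small Python sketch with two pairs of functions that I picked for illustration (they do not come from the question). It compares the Jacobian determinant of a dependent pair, $g=x^2+y^2$ and $f=\sin(g)$, with that of the pair $u=x+y$, $v=xy$, whose determinant is $x-y$ and which is therefore independent away from the diagonal $x=y$.

```python
import math

def det_dependent(x, y):
    # g = x^2 + y^2 and f = sin(g): grad f = cos(g) * grad g, so the gradients are parallel.
    gx, gy = 2.0 * x, 2.0 * y
    c = math.cos(x * x + y * y)
    fx, fy = c * gx, c * gy
    return gx * fy - gy * fx

def det_independent(x, y):
    # u = x + y and v = x * y: grad u = (1, 1), grad v = (y, x).
    return 1.0 * x - 1.0 * y

print(det_dependent(0.3, -1.2))    # zero up to rounding: shared level curves
print(det_independent(2.0, 0.5))   # nonzero: (u, v) are local coordinates near (2, 0.5)
```

Where the determinant is nonzero, the inverse function theorem is what justifies using the two functions as local coordinates.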


The derivative of $F$ is $DF=[\nabla F_1,\cdots,\nabla F_k]^T$; let $a\in\mathbb{R}^n$, $b=F(a)$ and $V=F^{-1}(b)$. $V$ is the intersection of the $k$ hypersurfaces $F_i(x)=b_i$. The normal vector at $a$ to such a hypersurface is $\nabla F_i(a)\in \mathbb{R}^n$. Here $\operatorname{rank}(DF_a)=k$, that is, $W=\operatorname{span}(\nabla F_1(a),\cdots,\nabla F_k(a))$ has dimension $k$. According to the implicit function theorem, in a neighborhood of $a$, $V$ is a manifold of dimension $n-k$, that is, $V$ is $C^1$-diffeomorphic to an open subset of $\mathbb{R}^{n-k}$. Moreover the tangent space of $V$ at $a$ is the orthogonal complement of $W$.

For instance, let $n=3,k=2$. $V$ is the intersection of $2$ surfaces in ordinary space. The normal vectors $u_1,u_2$ at $a$ are not parallel; then $V$ is locally a curve and the cross product $u_1\times u_2$ is tangent to this curve.
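A quick numerical check of this picture, as a sketch on an example of my own choosing: take $F_1=x^2+y^2+z^2$ and $F_2=z$ with $a=(1,0,0)$, so that $V$ is the unit circle in the plane $z=0$, whose velocity at $a$ is $(0,1,0)$.

```python
def grad_F1(p):
    # F1 = x^2 + y^2 + z^2 (spheres centred at the origin)
    x, y, z = p
    return (2.0 * x, 2.0 * y, 2.0 * z)

def grad_F2(p):
    # F2 = z (horizontal planes)
    return (0.0, 0.0, 1.0)

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

a = (1.0, 0.0, 0.0)
u1, u2 = grad_F1(a), grad_F2(a)
t = cross(u1, u2)

# V is parametrised by (cos s, sin s, 0); its velocity at s = 0 is (0, 1, 0).
velocity = (0.0, 1.0, 0.0)

print(t)                           # direction of u1 x u2
print(dot(t, u1), dot(t, u2))      # orthogonal to both normals
print(cross(t, velocity))          # zero vector: t is parallel to the velocity
```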

EDIT 1. In other words, to first order the equation of $V$ near $a$ is $DF_a(x-a)=0$, that is, for every $i\leq k$, $\langle\nabla F_i(a),x-a\rangle=0$.

@user36772, I just read your last four lines! You speak about the case when (in my instance) the $2$ previous normals are parallel. Then the surfaces are tangent at $a$ and we know nothing about the intersection. In other words, when the hypotheses of a theorem are not satisfied, then (is it surprising?) the theorem does not apply.

EDIT 2. (answer to user36772). A level set is a submanifold of codimension $1$, that is, a hypersurface.

The tangent space to a hypersurface is the hyperplane that is orthogonal to a normal vector; then the tangent space to $V$ is the intersection of these hyperplanes. From the geometrical point of view, to say that the $C^1$ functions are independent in a neighborhood of $a$ is equivalent to identifying each hypersurface with its tangent hyperplane at $a$ and saying that the linear equations associated with these hyperplanes are linearly independent. This property is stable in the following sense: if we move our hypersurfaces slightly, then the intersection remains similar to the original one.

If these equations are not linearly independent, then the instability comes at a gallop. For instance, consider, in $\mathbb{R}^3$, $a=0, F_1=y-x^2,F_2=y-x^4$. Both gradients at $a$ equal $(0,1,0)$, and locally the intersection of the surfaces $F_1=0$ and $F_2=0$ is the line $Oz$. Yet, if you move one surface, then the intersection may be locally empty.
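Here is a small Python sketch of that instability. The perturbation is an assumption of mine: the first surface is shifted to $y=x^2+\varepsilon$ (with `eps` a hypothetical parameter) while the second stays $y=x^4$. The intersection then requires $x^4-x^2=\varepsilon$; putting $u=x^2$, the root near $0$ is $u=\bigl(1-\sqrt{1+4\varepsilon}\bigr)/2$, and the other root corresponds to $x\approx\pm1$, which is not local.

```python
import math

def local_intersections(eps):
    """x-values near 0 where y = x**2 + eps meets y = x**4 (z stays free)."""
    disc = 1.0 + 4.0 * eps
    if disc < 0.0:
        return []
    # Small root of u**2 - u - eps = 0, where u = x**2.
    u = (1.0 - math.sqrt(disc)) / 2.0
    if u < 0.0:
        return []          # no real x: the surfaces no longer meet near 0
    if u == 0.0:
        return [0.0]       # the unperturbed case: the single line Oz
    r = math.sqrt(u)
    return [-r, r]         # the line Oz splits into two nearby vertical lines

for eps in (0.0, 0.01, -0.01):
    print(eps, local_intersections(eps))
```

So an arbitrarily small shift changes the local intersection from one line to none or to two, which is the kind of behaviour that cannot happen when the gradients are independent.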

Think also of GPS; we need $5$ satellites. Geometrically, the intersection of $3$ spheres suffices. Yet a fourth measurement allows synchronization of the clocks. Why the fifth? Because if two of the satellites are "close", then the associated spheres are nearly tangent and the intersection is unstable.