Rewritten on 2016-11-12. The OP raised very good questions in the comments. Note that this is not intended as an exhaustive answer (as one might expect from, say, a mathematician?), but more like observations from someone who routinely uses bilinear interpolation for numerical data.

How can a bilinear interpolation be defined for an arbitrary quadrilateral, i.e. without running into singularities?

Bilinear interpolation is usually defined as $$f(u,v) = (1-u) (1-v) F_{00} + u (1 - v) F_{01} + (1-u) v F_{10} + u v F_{11}$$ where $0 \le u, v \le 1$ and $$\begin{array}{} f(0,0) = F_{00} \\ f(0,1) = F_{01} \\ f(1,0) = F_{10} \\ f(1,1) = F_{11} \\ f(\frac{1}{2},\frac{1}{2}) = \frac{F_{00}+F_{01}+F_{10}+F_{11}}{4} \end{array}$$

If we use notation $$p(t; p_0, p_1) = (1-t) p_0 + t p_1 = p_0 + t (p_1 - p_0)$$ for the simplest form of linear interpolation, with $0 \le t \le 1$, $p(0;p_0,p_1) = p_0$, $p(1;p_0,p_1) = p_1$, then bilinear interpolation can be written as $$f(u,v) = p(u; p(v; F_{00}, F_{01}), p(v; F_{10}, F_{11}))$$ so this simply extends the single-variable linear interpolation to two variables and $2^2 = 4$ samples.

Bilinear interpolation is not often used for arbitrary quadrilaterals. After pondering the questions OP posed in the comments, I realized that the typical form used for interpolation, $$\begin{cases} x(u,v) = x_{00} + u ( x_{10} - x_{00}) + v ( x_{01} - x_{00} ) \\ y(u,v) = y_{00} + u ( y_{10} - y_{00}) + v ( y_{01} - y_{00} ) \\ f(u,v) = (1-v) \left ( (1-u) f_{00} + u f_{10} \right ) + (v) \left ( (1-u) f_{10} + u f_{11} \right ) \end{cases}$$ is not applicable to arbitrary quadrilaterals, as it assumes it to be a parallelogram, i.e. with $$\begin{cases} x_{11} = x_{10} + x_{01} - x_{00} \\ y_{11} = y_{10} + y_{01} - y_{00} \end{cases}$$ Solving $x = x(u,v)$, $y = y(u,v)$ for $u$ and $v$ yields $$\begin{cases} A = x_{00} (y_{01} - y_{10}) + x_{01} (y_{10} - y_{00}) + x_{10} (y_{00} - y_{01}) \\ u = \frac{ (x_{01} - x_{00}) y - (y_{01} - y_{00}) x + x_{00} y_{01} - y_{00} x_{01} }{A} \\ v = \frac{ (x_{00} - x_{10}) y - (y_{00} - y_{10}) x - x_{00} y_{10} + y_{00} x_{10} }{A} \end{cases}$$ where $$A = \left(\vec{p}_{10} - \vec{p}_{00}\right) \times \left(\vec{p}_{01} - \vec{p}_{00}\right)$$ where $\times$ signifies the 2D analog of vector cross product, so $\lvert A \rvert$ is the area of the parallelogram. Thus, exactly one solution exists for all non-degenerate parallelograms.

For the most common use case, a regular rectangular axis-aligned grid of samples $p_{ji}$, $0 \le j, i \in \mathbb{Z}$, we have $$\begin{cases} x = a_x + b_x i \\ y = a_y + b_y j \end{cases}$$ with $b_x \ne 0$, $b_y \ne 0$, corresponding to interpolation parameters $$\begin{cases} i = \left\lfloor \frac{x - a_x}{b_x} \right\rfloor \\ j = \left\lfloor \frac{y - a_y}{b_y} \right\rfloor \\ u = \frac{x - a_x}{b_x} - i \\ v = \frac{y - a_y}{b_y} - j \end{cases}$$ so that $$p(x,y) = (1-v) \left ( (1-u) p_{j,i} + (u) p_{j,i+1} \right ) + (v) \left ( (1-u) p_{j+1,i} + (u) p_{j+1,i+1} \right )$$


To apply bilinear interpolation to an arbitrary quadrilateral, we need to use $$\begin{cases} x(u,v) = (1-u)(1-v) x_{00} + (u)(1-v) x_{10} + (1-u)(v) x_{01} + (u)(v) x_{11} \\ y(u,v) = (1-u)(1-v) y_{00} + (u)(1-v) y_{10} + (1-u)(v) y_{01} + (u)(v) y_{11} \\ f(u,v) = (1-u)(1-v) f_{00} + (u)(1-v) f_{10} + (1-u)(v) f_{01} + (u)(v) f_{11} \end{cases}$$ In some cases it is sufficient to produce additional samples, for example so that each quadrilateral can be split into four sub-quadrilaterals, doubling the resolution. Then, we do not need to solve for $x$ and $y$, and only need to compute $$\begin{array}{cc} x\left(\frac{1}{2},0\right), & y\left(\frac{1}{2},0\right), & f\left(\frac{1}{2},0\right) \\ x\left(\frac{1}{2},1\right), & y\left(\frac{1}{2},1\right), & f\left(\frac{1}{2},1\right) \\ x\left(0,\frac{1}{2}\right), & y\left(0,\frac{1}{2}\right), & f\left(0,\frac{1}{2}\right) \\ x\left(1,\frac{1}{2}\right), & y\left(1,\frac{1}{2}\right), & f\left(1,\frac{1}{2}\right) \end{array}$$

However, solving $(u,v)$ for some specific $(x,y)$ is quite complicated. Indeed, I was surprised how complicated it turns out to be! (I apologize for misrepresenting this case as "easy" in a previous edit. Mea culpa.)

In practice, we first try to solve $u$ or $v$, and then the other by substituting into one of the equations above. If we decide we wish to solve $u$ first, we need to solve $$U_2 u^2 + U_1 u + U_0 = 0$$ where $$\begin{cases} U_2 = (y_{00}-y_{01}) (x_{10}-x_{11}) - (x_{00}-x_{01}) (y_{10}-y_{11}) \\ U_1 = (y_{00}-y_{01}-y_{10}+y_{11}) x - (x_{00}-x_{01}-x_{10}+x_{11}) y + (x_{11}-2 x_{10}) y_{00} + (2 x_{00}-x_{01}) y_{10} + y_{01} x_{10} - y_{11} x_{00} \\ U_0 = (y_{10}-y_{00}) x - (x_{10}-x_{00}) y + y_{00} x_{10} - x_{00} y_{10} \end{cases}$$ The possible solutions are $$\begin{cases} u = \frac{-U_1 \pm \sqrt{ U_1^2 - 4 U_2 U_0}}{2 U_2}, & U_2 \ne 0 \\ u = \frac{-U_0}{U_1}, & U_2 = 0, U_1 \ne 0 \\ u = 0, & U_2 = 0, U_0 = 0 \end{cases}$$ If we find $0 \le u \le 1$, we solve for $v$ by substituting into $X(u,v) = x$, $$v = \frac{ (y_{00} - y_{10}) u + y - y_{00} }{ (y_{00} - y_{01} - y_{10} + y_{11}) u - y_{00} + y_{01} }$$ or into $Y(u,v) = y$, $$v = \frac{ (x_{00} - x_{10}) u + x - x_{00} }{ (x_{00} - x_{01} - x_{10} + x_{11}) u - x_{00} + x_{01} }$$

If we find no solutions, we try to solve for $v$ in $$V_2 v^2 + V_1 v + V_0 = 0$$ where $$\begin{cases} V_2 = (x_{00}-x_{01}) (y_{10}-y_{11}) - (y_{00}-y_{01}) (x_{10}-x_{11}) \\ V_1 = (x_{00}-x_{01}-x_{10}+x_{11}) y - (y_{00}-y_{01}-y_{10}+y_{11}) x + (y_{11}-2 y_{10}) x_{00} + (2 y_{00}-y_{01}) x_{10} + x_{01} y_{10} - x_{11} y_{00} \\ V_0 = (x_{10}-x_{00}) y - (y_{10}-y_{00}) x + x_{00} y_{10} - y_{00} x_{10} \end{cases}$$ The possible solutions are similar to those for $u$: $$\begin{cases} v = \frac{-V_1 \pm \sqrt{ V_1^2 - 4 V_2 V_0}}{2 V_2}, & V_2 \ne 0 \\ v = \frac{-V_0}{V_1}, & V_2 = 0, V_1 \ne 0 \\ v = 0, & V_2 = 0, V_0 = 0 \end{cases}$$ If you find $0 \le v \le 1$, you solve for $u$ by substituting into $X(u,v) = x$, $$u = \frac{(x_{00} - x_{01}) v + x - x_{00} }{ (x_{00} - x_{01} - x_{10} + x_{11}) v - x_{00} + x_{10} }$$ or into $Y(u,v) = y$, $$u = \frac{(y_{00} - y_{01}) v + y - y_{00} }{ (y_{00} - y_{01} - y_{10} + y_{11}) v - y_{00} + y_{10} }$$

It is also possible to solve $(u,v)$ numerically, by calculating $X(u,v)$ and $Y(u,v)$ repeatedly with different $u$, $v$, until $\lvert X(u,v) - x \rvert \le \epsilon$ and $\lvert Y(u,v) - y \rvert \le \epsilon$, where $\epsilon$ is the maximum acceptable error in $x$ and $y$ (maximum distance to correct $(x,y)$ being $\sqrt{2}\epsilon$).

There are a number of different methods for the numerical search. Some of the following observations may come in handy, when implementing a numerical search: $$\begin{array}{rl} \frac{d \, X(u,v)}{d\,u} = & x_{10} - x_{00} + v ( x_{11} - x_{01} - x_{10} + x_{00} ) \\ \frac{d \, X(u,v)}{d\,v} = & x_{01} - x_{00} + u ( x_{11} - x_{01} - x_{10} + x_{00} ) \\ \frac{d \, Y(u,v)}{d\,u} = & y_{10} - y_{00} + v ( y_{11} - y_{01} - y_{10} + y_{00} ) \\ \frac{d \, Y(u,v)}{d\,v} = & y_{01} - y_{00} + u ( y_{11} - y_{01} - y_{10} + y_{00} ) \\ X(u + du, v) - X(u, v) = & du \left ( x_{10} - x_{00} + v ( x_{11} - x_{01} - x_{10} + x_{00} ) \right ) \\ X(u, v + dv) - X(u, v) = & dv \left ( x_{01} - x_{00} + u ( x_{11} - x_{01} - x_{10} + x_{00} ) \right ) \\ Y(u + du, v) - Y(u, v) = & du \left ( y_{10} - y_{00} + v ( y_{11} - y_{01} - y_{10} + y_{00} ) \right ) \\ Y(u, v + dv) - Y(u, v) = & dv \left ( y_{01} - y_{00} + u ( y_{11} - y_{01} - y_{10} + y_{00} ) \right ) \end{array}$$

In other words, it is true that the bilinear interpolation is quite difficult for arbitrary quadrilaterals, and very problematic for self-intersecting quadrilaterals. However, the most common quadrilateral types -- rectangles and parallelograms -- are easy, and even the general case is solvable at least numerically, even in the presence of singularities.


Why bilinear interpolation with quadrilaterals?

As I've shown above, for the rectangles and parallelograms -- the only quadrilaterals I've used bilinear interpolation with in real solutions --, bilinear interpolation is easy and simple.

Indeed, the emphasis on quadrilaterals (in the sense of arbitrary quadrilaterals) seems incorrect, as bilinear interpolation is mostly used with rectangles or parallelograms.

Perhaps the emphasis should be on that bilinear interpolation uses two variables to interpolate between four known values; or more generally, $k$-linear interpolation uses $k$ variables to interpolate between $2^k$ values. Trilinear interpolation is similarly common for cuboids with vertices $$\begin{cases} \vec{p}_{011} = \vec{p}_{010} + \vec{p}_{001} - \vec{p}_{000} \\ \vec{p}_{101} = \vec{p}_{100} + \vec{p}_{001} - \vec{p}_{000} \\ \vec{p}_{110} = \vec{p}_{100} + \vec{p}_{010} - \vec{p}_{000} \\ \vec{p}_{111} = \vec{p}_{100} + \vec{p}_{010} + \vec{p}_{001} - 2 \vec{p}_{000} \end{cases}$$ i.e. cuboids defined by one vertex and three edge vectors.

Regular grids are ubiquitous, and linear mapping is the simplest interpolation method, with easy properties. Cubic interpolation and other interpolation methods do produce better results, but are computationally more expensive, and the properties may produce unwanted behaviour: most typically, the interpolated value is no longer guaranteed to reside within the range spanned by the constants.


Why a bilinear interpolation with a quadrilateral?

As pointed out in the EDIT of the question, this issue is a bit more subtle.
At first the comment by Rahul is applied here as a heuristics. Rotate our $[-1,+1]×[-1,+1]$ square over $45^o$, with $xy \to \frac{1}{2}(y^2-x^2)$ as a consequence. Instead of the basic polynomial $\,xy\,$ one gets two basic polynomials $x^2$ and $y^2$, five in total : $1,x,y,x^2,y^2$ . For the basic shape this would imply five nodal points instead of four. Now take a look at the picture below: an extra nodal point $(0)$ in the middle has been provided.
enter image description here
The shape on the right side is known in Finite Difference circles as a five point star. It will be demonstrated here that it is possible to treat this Finite Difference pencil as if it were a Finite Element. Let the coordinates of the parent five point star be given by: $$ (0) = (0,0) \quad ; \quad \begin{cases} (1) = (-1,0) \quad ; \quad (2) = (+1,0) \\ (3) = (0,-1) \quad ; \quad (4) = (0,+1)\end{cases} $$ Let function behaviour "inside" the five point star be approximated by a quadratic interpolation between the function values at the vertices or nodal points, let $T$ be such a function and use its Taylor expansion around the origin $(0)$: $$ T(\xi,\eta) = T(0) + \frac{\partial T}{\partial \xi}(0).\xi + \frac{\partial T}{\partial \eta}(0).\eta + \frac{1}{2} \frac{\partial^2 T}{\partial \xi^2}(0).\xi^2 + \frac{1}{2} \frac{\partial^2 T}{\partial \eta^2}(0).\eta^2 $$ Specify $T$ for the vertices with the basic polynomials of the five point star: $$ T_0 = T(0)\\ T_1 = T(0) - \frac{\partial T}{\partial \xi}(0) + \frac{1}{2} \frac{\partial^2 T}{\partial \xi^2}(0)\\ T_2 = T(0) + \frac{\partial T}{\partial \xi}(0) + \frac{1}{2} \frac{\partial^2 T}{\partial \xi^2}(0)\\ T_3 = T(0) - \frac{\partial T}{\partial \eta}(0) + \frac{1}{2} \frac{\partial^2 T}{\partial \eta^2}(0)\\ T_4 = T(0) + \frac{\partial T}{\partial \eta}(0) + \frac{1}{2} \frac{\partial^2 T}{\partial \eta^2}(0)\\ \quad \mbox{ F.E. } \leftarrow \mbox{ F.D. } $$ Solving these equations is not much of a problem and well-known Finite Difference schemes are recognized: $$ T(0) = T_0 \\ \frac{\partial T}{\partial \xi}(0) = \frac{T_2-T_1}{2}\\ \frac{\partial T}{\partial \eta}(0) = \frac{T_4-T_3}{2} \\ \frac{\partial^2 T}{\partial \xi^2}(0) = T_1-2T_0+T_2 \\ \frac{\partial^2 T}{\partial \eta^2}(0) = T_3-2T_0+T_4 \\ \quad \mbox{ F.D. } \leftarrow \mbox{ F.E. } $$ Finite Element shape functions may be constructed as follows: $$ T = N_0.T_0 + N_1.T_1 + N_2.T_2 + N_3.T_3 + N_4.T_4 = \\ T_0 + \frac{T_2-T_1}{2}\xi + \frac{T_4-T_3}{2}\eta + \frac{T_1-2T_0+T_2}{2}\xi^2 + \frac{T_3-2T_0+T_4}{2}\eta^2 =\\ (1-\xi^2-\eta^2)T_0 + \frac{1}{2}(-\xi+\xi^2)T_1 + \frac{1}{2}(+\xi+\xi^2)T_2 +\frac{1}{2}(-\eta+\eta^2)T_3 + \frac{1}{2}(+\eta+\eta^2)T_4 \\ \Longrightarrow \quad \begin{cases} N_0 = 1-\xi^2-\eta^2 \\ N_1 = (-\xi+\xi^2)/2\\ N_2 = (+\xi+\xi^2)/2\\ N_3 = (-\eta+\eta^2)/2\\ N_4 = (+\eta+\eta^2)/2 \end{cases} $$ It is assumed that the same parameters $(\xi,\eta)$ are employed for the function $T$ as well as for the (global Cartesian) coordinates $x$ and $y$. Herewith it is expressed that we have, as with the linear triangle and the bilinear quadrilateral, an isoparametric transformation: $$ \begin{cases} x = N_0.x_0 + N_1.x_1 + N_2.x_2 + N_3.x_3 + N_4.x_4 \\ y = N_0.y_0 + N_1.y_1 + N_2.y_2 + N_3.y_3 + N_4.y_4 \end{cases} \quad \Longleftrightarrow \\ \begin{cases} x =& x_0 + (x_2-x_1)/2\cdot\xi + (x_4-x_3)/2\cdot\eta\\ &+ \left[(x_1+x_2)/2-x_0\right]\cdot\xi^2 + \left[(x_3+x_4)/2-x_0\right]\cdot\eta^2 \\ y =& y_0 + (y_2-y_1)/2\cdot\xi + (y_4-y_3)/2\cdot\eta\\ &+ \left[(y_1+y_2)/2-y_0\right]\cdot\xi^2 + \left[(y_3+y_4)/2-y_0\right]\cdot\eta^2 \end{cases} $$ Now take a look at the picture below and let attention be shifted from the original quadrilateral to the quadrilateral that joins the midpoints of the edges of the original. The latter is known as the Varignon parallelogram and it may be associated with our five point star.
enter image description here
When doing so, the diagonals of the parallelogram become the local coordinate axes of the star and by a well-known property of the diagonals of a parallelogram we have: $$ \begin{cases} x_0 = (x_1+x_2)/2 \\ x_0 = (x_3+x_4)/2 \end{cases} \quad \mbox{and} \quad \begin{cases} y_0 = (y_1+y_2)/2 \\ y_0 = (y_3+y_4)/2 \end{cases} \quad \Longrightarrow \\ \begin{cases} x = x_0 + (x_2-x_1)/2.\xi + (x_4-x_3)/2.\eta \\ y = y_0 + (y_2-y_1)/2.\xi + (y_4-y_3)/2.\eta \end{cases} $$ Swithing back to the (numbering of) the original quadrilateral (on the left in the picture) we have: $$\require{cancel} \begin{array}{l} x(\xi,\eta) = A_x + B_x.\xi + C_x.\eta\cancel{+ D_x.\xi.\eta} \\ y(\xi,\eta) = A_y + B_y.\xi + C_y.\eta\cancel{+ D_y.\xi.\eta} \end{array} \qquad \mbox{ where: } \\ \begin{array}{ll} A_x = \frac{1}{4} ( x_1 + x_2 + x_3 + x_4 ) & ; \quad A_y = \frac{1}{4} ( y_1 + y_2 + y_3 + y_4 ) \\ B_x = \frac{1}{4} \left[(x_2 + x_4) - (x_1 + x_3)\right] & ; \quad B_y = \frac{1}{4} \left[(y_2 + y_4) - (y_1 + y_3)\right] \\ C_x = \frac{1}{4} \left[(x_3 + x_4) - (x_1 + x_2)\right] & ; \quad C_y = \frac{1}{4} \left[(y_3 + y_4) - (y_1 + y_2)\right] \\ \cancel{D_x = \frac{1}{4} ( + x_1 - x_2 - x_3 + x_4 )} & ; \quad \cancel{D_y = \frac{1}{4} ( + y_1 - y_2 - y_3 + y_4 )} \end{array} $$ Which is precisely the original bilinear interpolation, where the non-linear $\,\xi.\eta\,$ terms simply have been erased. Due to the linearity achieved, the local parameters $(\xi,\eta)$ can easily be expressed now in the global coordinates $(x,y)$ : $$ \begin{bmatrix} \xi \\ \eta \end{bmatrix} = \begin{bmatrix} B_x & C_x \\ B_y & C_y \end{bmatrix}^{-1} \begin{bmatrix} x-A_x \\ y-A_y \end{bmatrix} $$ Because of the isoparametrics, exactly the same interpolation is applicable to any other function $T$. Substitute $\,\xi(x,y),\eta(x,y)\,$ in: $$ T(\xi,\eta) = A_T + B_T.\xi + C_T.\eta \quad \mbox{ where: } \quad \begin{cases} A_T = \frac{1}{4} ( T_1 + T_2 + T_3 + T_4 ) \\ B_T = \frac{1}{4} \left[(T_2 + T_4) - (T_1 + T_3)\right] \\ C_T = \frac{1}{4} \left[(T_3 + T_4) - (T_1 + T_2)\right] \end{cases} $$ The idea is to evaluate function values at the midpoints of the edges of the bilinear quadrilateral. Joining these midpoints together gives the Varignon parallelogram. Then employ the now linear interpolation of this parallelogram as an extrapolation for points inside the original quadrilateral (under the assumption that there is no problem with determining if a point is inside/outside an arbitrary convex quadrilateral within an unstructured grid). Here is a visualization of the substitute interpolation. The original quadrilateral is in black (with red vertices), the Varignon parallelogram is in blue, the $(\xi,\eta)$ coordinate axes are in yellow, the area covered by the substitute interpolation, with $-1 < \xi+\eta < +1$ and $-1 < \xi-\eta < +1$ , is in grey. There are four triangles remaining.
enter image description here

LATE EDIT. Continuing story at:

  • Any employment for the Varignon parallelogram?