How can I determine the best relationship for 3 variables, given several data points?
What is the best way to determine the relationship for three apparently related variables? The relationship does not appear to be linear, and may follow a combination of non-linear functions.
I have the following data points:
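| x | y | z |
|---|---|---|
| 1 | 0.5 | 0.01 |
| 1 | 1 | 0.01 |
| 1 | 2 | 0.01 |
| 1 | 10 | 0.01 |
| 1.3 | 0.5 | 0.015 |
| 1.3 | 1 | 0.0177 |
| 1.3 | 2 | 0.023 |
| 1.3 | 10 | 0.066 |
| 1.5 | 0.5 | 0.018 |
| 1.5 | 1 | 0.0223 |
| 1.5 | 2 | 0.031 |
| 1.5 | 10 | 0.1 |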
Assume z is the output and x and y are the inputs, and that no variable can be 0.
- Given these sample data points, how can I predict z given x and y?
- Is there a mathematical relationship between the variables?
- How can I find an equation that relates these variables?
A rough plot of the points $(y,z)$ shows that $z(y)$ is nearly linear for each of the three values of $x$, that is, of the form $z\simeq Ay+B$.
Likewise, rough plots of $A$ and $B$ against $x$ show that both are almost linear functions of $x$, that is, of the form $A\simeq a_1x+b_1$ and $B\simeq a_2x+b_2$.
This leads us to consider the function $z \simeq (a_1x+b_1)y+(a_2x+b_2)$, which can be expressed as follows (where $A,B,C,D$ now denote new constants, not the slope and intercept above) : $$z \simeq Axy+By+Cx+D$$
Then we can proceed in a more accurate manner: a linear regression evaluates the coefficients $A,B,C,D$ that give the best fit in the least-squares sense (minimum mean square deviation).
The mean absolute error ( https://en.wikipedia.org/wiki/Mean_absolute_error ) is : $$MAE=0.00032$$
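As a rough illustration of the regression step, here is a minimal sketch in Python with NumPy. It assumes the model $z \simeq Axy+By+Cx+D$ from this answer and the twelve data points from the question; the function name `predict` and the use of `numpy.linalg.lstsq` are choices made for this example, not necessarily how the original computation was done.

```python
import numpy as np

# Data points from the question: columns are x, y, z.
data = np.array([
    [1.0, 0.5, 0.01],
    [1.0, 1.0, 0.01],
    [1.0, 2.0, 0.01],
    [1.0, 10.0, 0.01],
    [1.3, 0.5, 0.015],
    [1.3, 1.0, 0.0177],
    [1.3, 2.0, 0.023],
    [1.3, 10.0, 0.066],
    [1.5, 0.5, 0.018],
    [1.5, 1.0, 0.0223],
    [1.5, 2.0, 0.031],
    [1.5, 10.0, 0.1],
])
x, y, z = data.T

# The model is linear in A, B, C, D, so ordinary least squares applies.
# Each row of the design matrix is [x*y, y, x, 1].
design = np.column_stack([x * y, y, x, np.ones_like(x)])
coeffs, *_ = np.linalg.lstsq(design, z, rcond=None)
A, B, C, D = coeffs


def predict(x_new, y_new):
    """Evaluate the fitted model z = A*x*y + B*y + C*x + D (hypothetical helper)."""
    return A * x_new * y_new + B * y_new + C * x_new + D


# Mean absolute error of the fit on the given points.
mae = np.mean(np.abs(z - predict(x, y)))

print("A, B, C, D =", A, B, C, D)
print("MAE =", mae)
```

Once the coefficients are known, `predict(1.4, 3.0)` gives an estimate of $z$ for a new pair $(x, y)$. Such estimates are only as trustworthy as the model within the range covered by the data.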
In addition :
Preliminary plots were made to observe the almost linear relationships in the given data.
This was a preliminary search, which led to the selected relationship $z \simeq Axy+By+Cx+D$. The numerical values read off those plots are of no use for computing the coefficients $A,B,C,D$, which come from the regression described above.
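The preliminary check can also be done numerically instead of by eye. The sketch below assumes the data from the question and uses `numpy.polyfit` with degree 1. It fits a line $z \simeq Ay+B$ for each value of $x$, then fits the resulting slopes and intercepts as linear functions of $x$. The variable names are chosen for this example only.

```python
import numpy as np

# y values shared by every group, and z values grouped by x (from the question).
y = np.array([0.5, 1.0, 2.0, 10.0])
z_by_x = {
    1.0: [0.01, 0.01, 0.01, 0.01],
    1.3: [0.015, 0.0177, 0.023, 0.066],
    1.5: [0.018, 0.0223, 0.031, 0.1],
}

xs = np.array(sorted(z_by_x))
slopes = []
intercepts = []
for xv in xs:
    # Degree-1 fit of z against y for this x: returns [slope, intercept].
    slope, intercept = np.polyfit(y, z_by_x[xv], 1)
    slopes.append(slope)
    intercepts.append(intercept)
    print(f"x = {xv}: z ~ {slope:.6f} * y + {intercept:.6f}")

# Second stage: slope and intercept as (approximately) linear functions of x.
a1, b1 = np.polyfit(xs, slopes, 1)
a2, b2 = np.polyfit(xs, intercepts, 1)
print("slope     ~", a1, "* x +", b1)
print("intercept ~", a2, "* x +", b2)
```

Because every value of $x$ comes with the same four values of $y$, the values of $a_1, b_1, a_2, b_2$ obtained this way should land close to the regression coefficients $A, B, C, D$. The direct regression remains the cleaner way to obtain them.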