How to calculate relative error when true value is zero?
Solution 1:
First of all, let me make clear that I am not a statistician but a physicist who is very much concerned with numerical issues, in particular in the area of fitting data to models.
So, first consider that you have $[X(i),Y(i)]$ data points and that you want to adjust a model such as $$Y = a + b X + c X^2$$ Among your data points, you have one for which $Y(i)=0$. If you know that, for a specific and defined value of $X=x$, your model must return $Y=0$, you must include this condition and rewrite your model as $$Y = b (X-x) + c (X-x)^2$$ When doing the a posteriori analysis, you should not consider the data point $[x,0]$, since, by construction, it has been excluded from the data set by the constraint (you can even eliminate this data point from the data set; this will not change your results at all).
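As a minimal sketch of such a constrained fit (in Python with NumPy; the data values and the zero-point $x_0$ are hypothetical):

```python
import numpy as np

# Hypothetical data; x0 is the value of X at which the model must return Y = 0.
x0 = 2.0
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([3.1, 0.9, 0.0, 1.2, 4.1, 8.9])

# Design matrix for Y = b*(X - x0) + c*(X - x0)^2. There is no intercept:
# the constraint Y(x0) = 0 is baked into the form of the model itself.
T = X - x0
A = np.column_stack([T, T**2])

# Ordinary least squares for the two remaining parameters b and c.
(b, c), *_ = np.linalg.lstsq(A, Y, rcond=None)
print(f"b = {b:.4f}, c = {c:.4f}")
```

Note that the row of `A` corresponding to $X = x_0$ is all zeros, which is another way of seeing that this data point no longer influences the fit at all.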
The other problem is more general. When your $Y(i)$ are all of roughly the same order of magnitude, the choice of the errors that define the objective function (say, the sum of squares) is not very important. But if the $Y(i)$ cover a very large range, minimizing the sum of squared residuals gives an enormous weight to the highest values, and the small values of $Y$ play very little role; so, typically, the low values end up quite poorly represented by the model.
If you want all data points to be represented with the "same" quality of fit, weighted regression is required. What I usually do is systematically minimize the sum of the squares of the relative errors, and here we come to your specific question: what to do if, for one data point, $Y=0$? I faced this situation in models for which no constraint was evident, so I decided, a long time ago, to define the relative error as $$\Delta = 2\,\frac{Y_{cal}-Y_{exp}}{Y_{cal}+Y_{exp}}$$ If the absolute error is small, this makes no practical difference compared with the usual definition; if the absolute error is large, this keeps the relative error bounded (its magnitude cannot exceed $2$ when $Y_{cal}$ and $Y_{exp}$ have the same sign).
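As a minimal sketch of a fit minimizing the sum of squares of this symmetric relative error (Python with SciPy; the model, data, and starting values are hypothetical):

```python
import numpy as np
from scipy.optimize import least_squares

def model(params, X):
    a, b, c = params
    return a + b * X + c * X**2

def symmetric_relative_residuals(params, X, Y_exp):
    # Delta = 2*(Y_cal - Y_exp)/(Y_cal + Y_exp): stays finite at a point
    # where Y_exp = 0, as long as Y_cal + Y_exp does not itself vanish
    # (in practice one may need to guard against that denominator).
    Y_cal = model(params, X)
    return 2.0 * (Y_cal - Y_exp) / (Y_cal + Y_exp)

# Hypothetical data spanning several orders of magnitude, with one zero value.
X = np.array([0.0, 1.0, 2.0, 5.0, 10.0, 20.0])
Y = np.array([0.0, 2.1, 8.3, 52.0, 210.0, 840.0])

# Least squares on the relative residuals instead of the absolute ones.
fit = least_squares(symmetric_relative_residuals, x0=[0.5, 2.0, 2.0], args=(X, Y))
print(fit.x)
```

With an ordinary sum of squared absolute residuals, the point at $Y = 840$ would dominate the fit; here every point contributes on the same relative scale, and the $Y = 0$ point contributes a bounded residual instead of an infinite one.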
I hope these few notes will be of some help to you. Do not hesitate to post again if you want to continue this discussion.
Solution 2:
If this is based on any kind of real-world situation, then there should be multiple $x_{test}$ measurements, i.e. a distribution. Then it will have a standard deviation, or at least quantiles, and you can define the distance from the mean of the $x_{test}$ to $x_{true}$ in terms of these. E.g., $(\mu_{test} - x_{true}) / \sigma_{test}$ will give you a sort of 'relativized error'. You can also apply standard statistical tests for significance, e.g. the t-test.
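A minimal sketch of both ideas (Python with SciPy; the measurements and the true value are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical repeated measurements and the known true value.
x_test = np.array([9.8, 10.1, 9.9, 10.3, 10.0, 9.7])
x_true = 10.0

mu = x_test.mean()
sigma = x_test.std(ddof=1)  # sample standard deviation

# 'Relativized error': distance from the true value in units of the spread.
z = (mu - x_true) / sigma
print(f"relativized error: {z:.3f}")

# One-sample t-test: is the mean significantly different from x_true?
t_stat, p_value = stats.ttest_1samp(x_test, popmean=x_true)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```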
Solution 3:
Let me share one approach that makes sense to use in some cases.
In my case, the signal follows roughly the inverse square law in magnitude, but it also goes above and below zero, crossing zero at various points. I am interested in the relative error: far away, where the signal is in the microvolt range, I need precision down to the nanovolt; near the source, where the signal is a few volts, I need millivolt precision and would like to ignore deviations in the nanovolt range. So using the absolute error doesn't make sense.
But, if I simply divide, either by the true signal, the approximation, or various combinations of the two, the relative error shoots to infinity near the zero-crossings.
The solution is to weight the absolute error by the inverse of a yardstick signal that has similar fall-off properties to the signals of interest and is positive everywhere. In the usual formula for relative error, the true signal itself plays this role, but it doesn't have to, in order to produce the behaviour you expect from a relative error.
In fact, the normalising signal could be wrong by a multiplicative factor (e.g. if your space is anisotropic, but you still use $1/r^2$ as the denominator), and the ratio would still work well as a relative error. Thinking in terms of a log scale helps somewhat, because the relative error becomes a subtraction rather than a division.
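A minimal sketch of this yardstick normalisation (Python with NumPy; the signal, the approximation, and the $1/r^2$ envelope are hypothetical stand-ins):

```python
import numpy as np

def yardstick_relative_error(approx, true, yardstick):
    # Absolute error divided by a positive normalising ('yardstick') signal
    # instead of by the true signal, so zero-crossings do not blow it up.
    return (approx - true) / yardstick

# Hypothetical signal: an oscillation with an inverse-square envelope,
# crossing zero at many points.
r = np.linspace(1.0, 50.0, 1000)
true = np.cos(3.0 * r) / r**2
approx = true + 1e-4 / r**2  # deviation that scales with the envelope

# Naive relative error explodes near the zero-crossings of `true` ...
naive = (approx - true) / true

# ... while dividing by a positive envelope with the same fall-off stays flat.
scaled = yardstick_relative_error(approx, true, yardstick=1.0 / r**2)

print(f"max |naive|  = {np.max(np.abs(naive)):.3g}")   # huge near crossings
print(f"max |scaled| = {np.max(np.abs(scaled)):.3g}")  # ~1e-4 everywhere
```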
EDIT
To quote an article (1) with 600+ citations reported by Google Scholar, from an authority on these numerical issues:
$\epsilon = (f_2 - f_1) / f_1\;\;\;$ (7)
[...]
$E_1$ may of course be expressed as a percent, and like any relative error indicator it will become meaningless when $f_1$ [...] is zero or small relative to $(f_2 - f_1)$, in which case the denominator of Eq. (7) should be replaced with some suitable normalizing value for the problem at hand [...]
(Note that $E_1$ is defined to be a multiple of $\epsilon$ in the article, but these details are irrelevant in the present context.)
I take this as a strong indication that, up until at least 1994, there was no better analogue of relative error for signals that cross zero than the idea proposed here, namely dividing by a normalising signal (what I call the "yardstick" signal above).