Easy-to-implement method to fit a power function (regression)

I want to fit a power function ($y=Ax^B$) to a dataset. What is the best and easiest method to do this? I need the $A$ and $B$ parameters too. In general I'm using financial data in my project, which looks like this:

8553600 458.2
17193600    373.6
25833600    694.16
34646400    738.33
44064000    817.89
54259200    1040.67
67910400    1032.69
76291200    1222.1
84844800    1245.65
94089600    1217.44
102211200   1579.38
110592000   1859.24
118886400   1711.67
127612800   2303.62
136684800   2658.26
219196800   3669.23
225676800   3525.02
225763200   3749.27

I need to implement the algorithm in ActionScript, a Java-like language.


There's the obvious approach of taking the logarithm of both sides of your regression formula:

$$\ln\;y=\ln\left(ax^b\right)$$

$$\ln\;y=\ln\;a+b\ln\;x$$

which can be seen to be an expression of the form

$$v=k+bu$$

where $v=\ln\;y$, $u=\ln\;x$, and $k=\ln\;a$.

Now a linear regression in the variables $v$ and $u$ applies.

In particular, formulas 16, 20, 27, and 28 of this page now apply.

Once you have the slope $b$ and the intercept $k$ of the fitted line, $b$ carries over directly, and $a$ is just the exponential of the intercept, $a=\exp\;k$.
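As a minimal sketch, the log-log regression above can be written in a few lines of Java (the asker's ActionScript should translate almost verbatim); the class and method names here are purely illustrative:

```java
// Fit y = a * x^b by ordinary least squares on (u, v) = (ln x, ln y).
// Illustrative sketch; class/method names are not from any library.
public class PowerFit {

    /** Returns {a, b} such that y ≈ a * x^b. */
    public static double[] fitLogLog(double[] x, double[] y) {
        int n = x.length;
        double su = 0, sv = 0, suu = 0, suv = 0;
        for (int i = 0; i < n; i++) {
            double u = Math.log(x[i]); // u = ln x
            double v = Math.log(y[i]); // v = ln y
            su += u;
            sv += v;
            suu += u * u;
            suv += u * v;
        }
        double b = (n * suv - su * sv) / (n * suu - su * su); // slope
        double k = (sv - b * su) / n;                         // intercept ln a
        return new double[] { Math.exp(k), b };
    }

    public static void main(String[] args) {
        // Exact power-law data y = 2 * x^0.5 should be recovered exactly.
        double[] x = { 1, 2, 4, 8 };
        double[] y = { 2.0, 2.0 * Math.sqrt(2.0), 4.0, 4.0 * Math.sqrt(2.0) };
        double[] ab = fitLogLog(x, y);
        System.out.println("a = " + ab[0] + ", b = " + ab[1]);
    }
}
```

Feeding in the financial data from the question in place of the synthetic arrays gives the provisional $a$ and $b$ directly.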

I note here the possible caveat of the fit parameters being biased towards the data with ordinates small in magnitude. The rigorous way of going about it would be to treat the parameters from the linear regression as provisional and then apply a nonlinear least-squares algorithm like Levenberg-Marquardt to the data, using the parameters from the linear regression as a starting point. This may or may not be needed though; it really depends on the data you have.


I'll expand on comments I gave earlier: again, the problem with using the logarithm to linearize your nonlinear function is that it tends to overemphasize the errors in the small values of $y$. Remember that the assumption of linear least squares is that the abscissas are accurate, while the ordinates are contaminated by error.

In other words, the $y_i$ are actually of the form $\hat{y}_i\pm\sigma_i$ where the $\hat{y}_i$ are the "true values" (presumably unknown), and the $\sigma_i$ are inherent uncertainties. If one takes the logarithms of the $y_i$, the uncertainties are also transformed, and we have to take this into account.

The key formula is that if the $y_i$ are transformed by a function $f(y)$, the $\sigma_i$ are transformed according to $f'(y_i)\sigma_i$.

For the case at hand, $f(y)=\ln\;y$ and $f'(y)=1/y$, so the transformed uncertainties are $\sigma_i/y_i$; taking the original $\sigma_i$ to be roughly equal, the weighted least-squares weights $1/(\sigma_i/y_i)^2$ are proportional to $y_i^2$, and the objective function we now have to minimize is of the form

$$f(a,b)=\sum_i{y_i^2\;(\ln\;y_i-\ln\;a-b\ln\;x_i)^2}$$

and we have to modify the formulae for linear regression accordingly:

$$m=\sum_i y_i^2$$

$$\bar{x}=\frac{\displaystyle \sum_i y_i^2 \ln\;x_i}{m}$$

$$t=\sum_i y_i^2 (\ln\;x_i-\bar{x})^2$$

whereupon

$$b=\frac{\displaystyle \sum_i y_i^2\ln\;y_i (\ln\;x_i-\bar{x})}{t}$$

and

$$a=\exp\left(\frac{\displaystyle \sum_i y_i^2\ln\;y_i}{m}-b\bar{x}\right)$$

These should be better provisional values for later polishing.
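A sketch of these weighted formulas in Java, mirroring $m$, $\bar{x}$, $t$, $b$, and $a$ term by term (class and method names are again illustrative):

```java
// Weighted log-log fit with weights y_i^2, mirroring the formulas above.
// Illustrative sketch; class/method names are not from any library.
public class WeightedPowerFit {

    /** Returns {a, b} minimizing sum_i y_i^2 (ln y_i - ln a - b ln x_i)^2. */
    public static double[] fit(double[] x, double[] y) {
        double m = 0, swu = 0;
        for (int i = 0; i < x.length; i++) {
            double w = y[i] * y[i];        // weight y_i^2
            m += w;                        // m = sum of weights
            swu += w * Math.log(x[i]);
        }
        double xbar = swu / m;             // weighted mean of ln x
        double t = 0, num = 0, swv = 0;
        for (int i = 0; i < x.length; i++) {
            double w = y[i] * y[i];
            double d = Math.log(x[i]) - xbar;
            t += w * d * d;
            num += w * Math.log(y[i]) * d;
            swv += w * Math.log(y[i]);
        }
        double b = num / t;
        double a = Math.exp(swv / m - b * xbar);
        return new double[] { a, b };
    }

    public static void main(String[] args) {
        // Exact power-law data y = 2 * x^0.5 is again recovered exactly.
        double[] x = { 1, 2, 4, 8 };
        double[] y = { 2.0, 2.0 * Math.sqrt(2.0), 4.0, 4.0 * Math.sqrt(2.0) };
        double[] ab = fit(x, y);
        System.out.println("a = " + ab[0] + ", b = " + ab[1]);
    }
}
```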


It turns out that for a separable nonlinear fit (linear in one of the parameters), the NLLS problem greatly simplifies.

Remember that the actual quantity that we have to minimize is

$$F(a,b)=\sum_i{(y_i-a x_i^b)^2}$$

If we take the gradient $\nabla F(a,b)$:

$$\nabla F(a,b)=\left(2a\sum_i x_i^{2b}-2\sum_i x_i^b y_i\quad\; 2a^2\sum_i \ln\left(x_i\right) x_i^{2b}-2a\sum_i \ln\left(x_i\right) x_i^b y_i\right)^T$$

equate both components to $0$ (dividing out the common factor $2a$ in the second), and then eliminate the linear parameter $a$, we get the univariate nonlinear equation in $b$:

$$\left(\sum_i x_i^b y_i\right)\left(\sum_i \ln\left(x_i\right) x_i^{2b}\right)-\left(\sum_i x_i^{2b}\right)\left(\sum_i \ln\left(x_i\right) x_i^b y_i\right)=0$$

which can be attacked with standard techniques, e.g. the secant method or Newton's method. But, how do we start the iteration? Why, with the provisional $b$ we got from the (weighted) linear regression earlier!

Having gotten $b$, it is now a simple matter to get $a$:

$$a=\left(\sum_i x_i^b y_i\right) / \left(\sum_i x_i^{2b}\right)$$

and you now have your parameters.
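The whole separable procedure might be sketched in Java as follows; the two starting guesses for the slope would naturally come from the (weighted) linear regression earlier, and the class and method names are illustrative:

```java
// Separable fit of y = a * x^b: secant iteration on the univariate
// equation in b above, then a from the closed-form expression.
// Illustrative sketch; class/method names are not from any library.
public class SeparablePowerFit {

    // g(b) = (sum x^b y)(sum ln(x) x^{2b}) - (sum x^{2b})(sum ln(x) x^b y)
    static double g(double b, double[] x, double[] y) {
        double s1 = 0, s2 = 0, s3 = 0, s4 = 0;
        for (int i = 0; i < x.length; i++) {
            double xb = Math.pow(x[i], b);
            double lx = Math.log(x[i]);
            s1 += xb * y[i];
            s2 += xb * xb;
            s3 += lx * xb * xb;
            s4 += lx * xb * y[i];
        }
        return s1 * s3 - s2 * s4;
    }

    /** Returns {a, b}; b0 and b1 are two starting guesses for the slope. */
    public static double[] fit(double[] x, double[] y, double b0, double b1) {
        double g0 = g(b0, x, y), g1 = g(b1, x, y);
        for (int k = 0; k < 100 && Math.abs(b1 - b0) > 1e-12 && g1 != g0; k++) {
            double b2 = b1 - g1 * (b1 - b0) / (g1 - g0); // secant step
            b0 = b1; g0 = g1;
            b1 = b2; g1 = g(b1, x, y);
        }
        double s1 = 0, s2 = 0;
        for (int i = 0; i < x.length; i++) {
            double xb = Math.pow(x[i], b1);
            s1 += xb * y[i];
            s2 += xb * xb;
        }
        return new double[] { s1 / s2, b1 };              // a = s1 / s2
    }

    public static void main(String[] args) {
        // Exact data y = 3 * x^0.7; the iteration should find b = 0.7, a = 3.
        double[] x = { 1, 2, 3, 4, 5 };
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) y[i] = 3.0 * Math.pow(x[i], 0.7);
        double[] ab = fit(x, y, 0.5, 0.9);
        System.out.println("a = " + ab[0] + ", b = " + ab[1]);
    }
}
```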