Polynomial best fit line for very large values

Solution 1:

If I understand correctly, you want to fit a polynomial model such as

$$Y = a_0 + a_1 X + a_2 X^2 + a_3 X^3 + \cdots$$

but you want the first point $(X_0,Y_0)$ to be matched exactly. This constraint uses up one parameter, so one has to be removed. Rewrite your equation as

$$Y - Y_0 = a_1 (X - X_0) + a_2 (X - X_0)^2 + a_3 (X - X_0)^3 +\cdots$$

Here the first point is matched exactly, since both sides vanish at $X = X_0$. So define the new variables

$$Z_i = Y_i - Y_0, \qquad T_i = X_i - X_0$$

and perform your regression as

$$Z = a_1 T + a_2 T^2 + a_3 T^3 + \cdots$$

Do not forget to exclude the intercept: almost any regression tool offers a no-intercept option (a fit through the origin). If yours does not have this capability, let me know.
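For readers without a no-intercept option, the shifted regression can be sketched directly with a least-squares solve. This is only a minimal illustration in Python with NumPy, which the question does not mention; the function names `fit_through_first_point` and `predict` are hypothetical. Leaving the column of ones out of the design matrix is what excludes the intercept.

```python
import numpy as np

def fit_through_first_point(x, y, degree):
    """Least-squares polynomial in T = X - X0 with no constant term.

    Hypothetical helper: returns (x0, y0, coeffs) where coeffs = [a_1, ..., a_degree].
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x0, y0 = x[0], y[0]
    t = x - x0          # T_i = X_i - X_0
    z = y - y0          # Z_i = Y_i - Y_0
    # Columns are T, T^2, ..., T^degree. No column of ones, so no intercept.
    A = np.column_stack([t**k for k in range(1, degree + 1)])
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    return x0, y0, coeffs

def predict(x, x0, y0, coeffs):
    """Evaluate Y = Y0 + a_1 (X - X0) + a_2 (X - X0)^2 + ..."""
    t = np.asarray(x, dtype=float) - x0
    powers = np.column_stack([t**k for k in range(1, len(coeffs) + 1)])
    return y0 + powers @ coeffs
```

Because every term carries a factor of $X - X_0$, `predict` returns $Y_0$ at $X_0$ whatever the fitted coefficients are. If the range of $T$ is very wide, dividing $T$ by a constant before fitting may improve the conditioning of the solve.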

Solution 2:

If the first point must lie on the curve exactly, that removes one degree of freedom from the standard fit. The large numbers are not a problem in themselves, although they sometimes are when they span a small range. I got Excel to do a third-order polynomial fit, but it doesn't fit very well. A fifth-order polynomial fits decently by eye. If you want to extract the coefficients, it would help to scale your first column by dividing by $10^5$ or so. I have done so in the image below. For a "rough and ready" approach, I would take this fit and add the constant needed to make the first point fit. One way to force the fit closer to the first point is simply to duplicate the first point a number of times. The image has $18$ copies of the first point.
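The scaling, duplication, and constant-shift ideas above can be sketched outside Excel as well. The following is only an illustration in Python with NumPy, not the spreadsheet procedure used for the image; the function names are hypothetical, and `copies` is assumed to mean the total number of times the first point appears in the fitted data.

```python
import numpy as np

def fit_with_repeated_first_point(x, y, degree, copies=18, x_scale=1e5):
    """Ordinary polynomial fit after scaling x and repeating the first point.

    Returns np.polyfit coefficients (highest power first) in the SCALED variable.
    """
    x = np.asarray(x, dtype=float) / x_scale   # scale the first column
    y = np.asarray(y, dtype=float)
    # Prepend copies - 1 extra occurrences of the first point.
    xs = np.concatenate([np.full(copies - 1, x[0]), x])
    ys = np.concatenate([np.full(copies - 1, y[0]), y])
    return np.polyfit(xs, ys, degree)

def shift_to_first_point(coeffs, x, y, x_scale=1e5):
    """Add a constant to the fit so it passes through the first point exactly."""
    coeffs = np.array(coeffs, dtype=float)
    coeffs[-1] += y[0] - np.polyval(coeffs, x[0] / x_scale)
    return coeffs
```

Repeating the first point only pulls the curve toward it; the residual there shrinks but generally does not reach zero. The constant shift does make the first point exact, at the cost of moving every other fitted value by the same amount.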