How do I calculate a trendline for a graph?

Google is not being my friend - it's been a long time since my stats class in college...I need to calculate the start and end points for a trendline on a graph - is there an easy way to do this? (working in C# but whatever language works for you)


Thanks to all for your help - I was off this issue for a couple of days and just came back to it - was able to cobble this together - not the most elegant code, but it works for my purposes - thought I'd share if anyone else encounters this issue:

public class Statistics
{
    public Trendline CalculateLinearRegression(int[] values)
    {
        var yAxisValues = new List<int>();
        var xAxisValues = new List<int>();

        for (int i = 0; i < values.Length; i++)
        {
            yAxisValues.Add(values[i]);
            xAxisValues.Add(i + 1);
        }

        return new Trendline(yAxisValues, xAxisValues);
    }
}

public class Trendline
{
    private readonly IList<int> xAxisValues;
    private readonly IList<int> yAxisValues;
    private int count;
    private int xAxisValuesSum;
    private int xxSum;
    private int xySum;
    private int yAxisValuesSum;

    public Trendline(IList<int> yAxisValues, IList<int> xAxisValues)
    {
        this.yAxisValues = yAxisValues;
        this.xAxisValues = xAxisValues;

        this.Initialize();
    }

    public int Slope { get; private set; }
    public int Intercept { get; private set; }
    public int Start { get; private set; }
    public int End { get; private set; }

    private void Initialize()
    {
        this.count = this.yAxisValues.Count;
        this.yAxisValuesSum = this.yAxisValues.Sum();
        this.xAxisValuesSum = this.xAxisValues.Sum();
        this.xxSum = 0;
        this.xySum = 0;

        for (int i = 0; i < this.count; i++)
        {
            this.xySum += (this.xAxisValues[i]*this.yAxisValues[i]);
            this.xxSum += (this.xAxisValues[i]*this.xAxisValues[i]);
        }

        this.Slope = this.CalculateSlope();
        this.Intercept = this.CalculateIntercept();
        this.Start = this.CalculateStart();
        this.End = this.CalculateEnd();
    }

    private int CalculateSlope()
    {
        try
        {
            return ((this.count*this.xySum) - (this.xAxisValuesSum*this.yAxisValuesSum))/((this.count*this.xxSum) - (this.xAxisValuesSum*this.xAxisValuesSum));
        }
        catch (DivideByZeroException)
        {
            return 0;
        }
    }

    private int CalculateIntercept()
    {
        return (this.yAxisValuesSum - (this.Slope*this.xAxisValuesSum))/this.count;
    }

    private int CalculateStart()
    {
        return (this.Slope*this.xAxisValues.First()) + this.Intercept;
    }

    private int CalculateEnd()
    {
        return (this.Slope*this.xAxisValues.Last()) + this.Intercept;
    }
}

OK, here's my best pseudo math:

The equation for your line is:

Y = a + bX

Where:

b = (sum(x*y) - sum(x)sum(y)/n) / (sum(x^2) - sum(x)^2/n)

a = sum(y)/n - b(sum(x)/n)

Where sum(xy) is the sum of all x*y etc. Not particularly clear I concede, but it's the best I can do without a sigma symbol :)

... and now with added Sigma

b = (Σ(xy) - (ΣxΣy)/n) / (Σ(x^2) - (Σx)^2/n)

a = (Σy)/n - b((Σx)/n)

Where Σ(xy) is the sum of all x*y etc. and n is the number of points


Given that the trendline is straight, find the slope by choosing any two points and calculating:

(A) slope = (y1-y2)/(x1-x2)

Then you need to find the offset for the line. The line is specified by the equation:

(B) y = offset + slope*x

So you need to solve for offset. Pick any point on the line, and solve for offset:

(C) offset = y - (slope*x)

Now you can plug slope and offset into the line equation (B) and have the equation that defines your line. If your line has noise you'll have to decide on an averaging algorithm, or use curve fitting of some sort.

If your line isn't straight then you'll need to look into Curve fitting, or Least Squares Fitting - non trivial, but do-able. You'll see the various types of curve fitting at the bottom of the least squares fitting webpage (exponential, polynomial, etc) if you know what kind of fit you'd like.

Also, if this is a one-off, use Excel.


Here is a very quick (and semi-dirty) implementation of Bedwyr Humphreys's answer. The interface should be compatible with @matt's answer as well, but uses decimal instead of int and uses more IEnumerable concepts to hopefully make it easier to use and read.

Slope is b, Intercept is a

public class Trendline
{
    public Trendline(IList<decimal> yAxisValues, IList<decimal> xAxisValues)
        : this(yAxisValues.Select((t, i) => new Tuple<decimal, decimal>(xAxisValues[i], t)))
    { }
    public Trendline(IEnumerable<Tuple<Decimal, Decimal>> data)
    {
        var cachedData = data.ToList();

        var n = cachedData.Count;
        var sumX = cachedData.Sum(x => x.Item1);
        var sumX2 = cachedData.Sum(x => x.Item1 * x.Item1);
        var sumY = cachedData.Sum(x => x.Item2);
        var sumXY = cachedData.Sum(x => x.Item1 * x.Item2);

        //b = (sum(x*y) - sum(x)sum(y)/n)
        //      / (sum(x^2) - sum(x)^2/n)
        Slope = (sumXY - ((sumX * sumY) / n))
                    / (sumX2 - (sumX * sumX / n));

        //a = sum(y)/n - b(sum(x)/n)
        Intercept = (sumY / n) - (Slope * (sumX / n));

        Start = GetYValue(cachedData.Min(a => a.Item1));
        End = GetYValue(cachedData.Max(a => a.Item1));
    }

    public decimal Slope { get; private set; }
    public decimal Intercept { get; private set; }
    public decimal Start { get; private set; }
    public decimal End { get; private set; }

    public decimal GetYValue(decimal xValue)
    {
        return Intercept + Slope * xValue;
    }
}

Regarding a previous answer

if (B) y = offset + slope*x

then (C) offset = y/(slope*x) is wrong

(C) should be:

offset = y-(slope*x)

See: http://zedgraph.org/wiki/index.php?title=Trend