Smoothing values over time: moving average or something better?

I'm coding something at the moment where I'm taking a bunch of values over time from a hardware compass. This compass is very accurate and updates very often, with the result that if it jiggles slightly, I end up with the odd value that's wildly inconsistent with its neighbours. I want to smooth those values out.

From some reading around, it seems that what I want is a high-pass filter, a low-pass filter, or a moving average. A moving average I can get down with: just keep a history of the last 5 values or whatever, and use the average of those values downstream in my code where I was once using just the most recent value.
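
For concreteness, here's roughly what I have in mind, sketched in C; the function name, the WINDOW size of 5 and so on are just placeholders:

#include <stddef.h>

#define WINDOW 5                  /* how many recent readings to average */

static double history[WINDOW];    /* the last WINDOW readings, oldest overwritten first */
static size_t count = 0;          /* how many readings we've seen so far */

/* Store a new reading and return the average of the last few readings. */
double smoothedHeading(double newReading)
{
    history[count % WINDOW] = newReading;
    count++;

    size_t n = (count < WINDOW) ? count : WINDOW;
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += history[i];
    return sum / n;
}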

That should, I think, smooth out those jiggles nicely, but it strikes me that it's probably quite inefficient, and this is probably one of those Known Problems to Proper Programmers to which there's a really neat Clever Math solution.

I am, however, one of those awful self-taught programmers without a shred of formal education in anything even vaguely related to CompSci or Math. Reading around a bit suggests that this may be a high or low pass filter, but I can't find anything that explains in terms comprehensible to a hack like me what the effect of these algorithms would be on an array of values, let alone how the math works. The answer given here, for instance, technically does answer my question, but only in terms comprehensible to those who would probably already know how to solve the problem.

It would be a very lovely and clever person indeed who could explain the sort of problem this is, and how the solutions work, in terms understandable to an Arts graduate.


If you are trying to remove the occasional odd value, a low-pass filter is the best of the three options you have identified. Low-pass filters let low-speed changes through, such as those caused by rotating a compass by hand, while rejecting high-speed changes, such as those caused by bumps in the road.

A moving average will probably not be sufficient, since a single "blip" in your data will affect as many subsequent averages as your window is wide: with a 10-sample window, for example, a spike of size S raises each of the next 10 averages by S/10.

If the odd values are easily detected, you may even be better off with a glitch-removal algorithm that completely ignores them:

// If the new reading is wildly inconsistent with the recent average,
// discard it and use the average instead.
if (fabs(thisValue - averageOfLast10Values) > someThreshold)
{
    thisValue = averageOfLast10Values;
}

Here is a quick graph to illustrate:

[graph comparison: the raw input signal, a 10-sample moving average, and the 10-sample average combined with glitch detection]

The first graph is the input signal, with one unpleasant glitch. The second graph shows the effect of a 10-sample moving average. The final graph is a combination of the 10-sample average and the simple glitch detection algorithm shown above. When the glitch is detected, the 10-sample average is used instead of the actual value.
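
A rough sketch of that combination in C, purely to illustrate; the window size, the threshold value and the names are made up:

#include <math.h>

#define WINDOW 10                 /* size of the moving-average window */
#define THRESHOLD 20.0            /* how far a reading may stray before we call it a glitch */

static double history[WINDOW];    /* the last WINDOW accepted values */
static int count = 0;             /* how many values we've stored so far */

/* Returns the 10-sample moving average, with glitches replaced by the average. */
double glitchFilteredAverage(double thisValue)
{
    int n = (count < WINDOW) ? count : WINDOW;

    if (n > 0)
    {
        double sum = 0.0;
        for (int i = 0; i < n; i++)
            sum += history[i];
        double averageOfLastValues = sum / n;

        /* Glitch detection: if the new reading is wildly inconsistent
           with the recent average, use the average instead. */
        if (fabs(thisValue - averageOfLastValues) > THRESHOLD)
            thisValue = averageOfLastValues;
    }

    /* Store the (possibly corrected) value and return the new average. */
    history[count % WINDOW] = thisValue;
    count++;

    n = (count < WINDOW) ? count : WINDOW;
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += history[i];
    return sum / n;
}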


If your moving average has to be long in order to achieve the required smoothing, and you don't really need any particular shape of kernel, then you're better off using an exponentially decaying moving average:

a(i+1) = tiny*data(i+1) + (1.0-tiny)*a(i)

where you choose tiny to be an appropriate constant between 0 and 1 (e.g. if you choose tiny = 1/N, it will have roughly the same amount of averaging as a window of size N, but distributed differently over older points).

Anyway, since the next value of the moving average depends only on the previous one and your data, you don't have to keep a queue or anything. And you can think of this as doing something like, "Well, I've got a new point, but I don't really trust it, so I'm going to keep 80% of my old estimate of the measurement, and only trust this new data point 20%". That's pretty much the same as saying, "Well, I only trust this new point 20%, and I'll use 4 other points that I trust the same amount", except that instead of explicitly taking the 4 other points, you're assuming that the averaging you did last time was sensible so you can use your previous work.
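
A minimal sketch of the whole thing in C, with made-up names and a first-reading seed added so the average starts somewhere sensible:

#include <stdbool.h>

static double average = 0.0;      /* the running exponentially decaying average */
static bool haveAverage = false;  /* becomes true after the first reading */

/* Fold a new reading into the running average and return the result. */
double exponentialAverage(double newReading, double tiny)
{
    if (!haveAverage)
    {
        average = newReading;     /* seed the average with the first reading */
        haveAverage = true;
    }
    else
    {
        /* Trust the new reading by a factor of tiny (e.g. 0.2),
           and keep (1 - tiny) of the previous estimate. */
        average = tiny * newReading + (1.0 - tiny) * average;
    }
    return average;
}

Calling exponentialAverage(newReading, 0.2) on every reading then gives exactly the 80/20 behaviour described above: keep 80% of the old estimate, and trust the new point 20%.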