What does the Fourier Transform mean in the context of images?
This is clearly a very important equation with tonnes of properties that I see come up a lot in image processing literature, but I don't understand why this equation is important, and what it is saying.
What does it really mean and why is the Fourier Transform so prevalent in image processing?
Solution 1:
In sound processing, the Fourier Transform has a physically intuitive meaning. A sound, i.e. a function $f : [0,b] \rightarrow [-1,1]$, can be represented as a trigonometric series, and each term of the series corresponds to a frequency which you perceive. Many effects (like filtering, reverb, etc.) have an interpretation in the frequency domain which is useful for analysis. Unfortunately, the physical interpretation isn't as simple when we start talking about images. But the analytic usefulness carries over.
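For reference, here is a sketch of the series being described, written in its complex-exponential form. The coefficient name $c_n$ is my own notation, and I'm ignoring convergence conditions:

$$f(t) = \sum_{n=-\infty}^{\infty} c_n\, e^{2\pi i n t / b}, \qquad c_n = \frac{1}{b}\int_0^b f(t)\, e^{-2\pi i n t / b}\, dt.$$

The terms with index $\pm n$ oscillate at frequency $|n|/b$, and $|c_n|$ tells you how strongly that frequency is present in $f$.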
If we're talking about grey-scale, then an image is simply a function $f : [0,1]^2 \rightarrow [0,1]$. It takes a point in the square $[0,1]\times[0,1]$ and produces a value between 0 and 1, the intensity. The Fourier Transform just says we can represent this function in the frequency domain using a countable basis of trigonometric functions. Say you want to blur an image; this corresponds to a low-pass filter in the frequency domain. The following link shows many examples of Fourier transforms of images, gives an explanation of the physical interpretation (which I don't claim to understand entirely) and shows examples of basic image processing. This website shows more illustrative examples.
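To make the "blur is a low-pass filter" remark concrete, here is a minimal sketch using NumPy's FFT. The function name `low_pass_blur` and the `cutoff` parameter are made up for illustration, and a hard circular cutoff is only one of many possible low-pass filters:

```python
import numpy as np

def low_pass_blur(image, cutoff):
    """Blur a 2-D grey-scale image by zeroing every spatial frequency
    above `cutoff` (in cycles per pixel, so between 0 and 0.5)."""
    spectrum = np.fft.fft2(image)
    # Frequency of each FFT bin along the vertical and horizontal axes
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    keep = np.sqrt(fx ** 2 + fy ** 2) <= cutoff
    # Back to the pixel domain; the imaginary part is only rounding error
    return np.real(np.fft.ifft2(spectrum * keep))

# A sharp vertical edge: black on the left, white on the right
image = np.zeros((64, 64))
image[:, 32:] = 1.0
blurred = low_pass_blur(image, cutoff=0.1)
```

The edge in `blurred` is spread over several pixels, while the average brightness is unchanged because the zero-frequency term is kept.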
Another very important use of (variants of) the 2-D Fourier transform is image compression. JPEG, for instance, uses the closely related discrete cosine transform, and the Wikipedia page for the JPEG codec shows the basis functions used to represent images.
Finally, note that if you're talking about an RGB image you can represent the image using the Fourier transform on each color component.
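As a small sketch of the per-channel idea, assuming the image is stored as a NumPy array of shape (height, width, 3): `np.fft.fft2` takes an `axes` argument, so the transform can be applied to each colour plane independently in one call. The random array below just stands in for a real picture.

```python
import numpy as np

rng = np.random.default_rng(0)
rgb = rng.random((32, 32, 3))            # stand-in for an RGB image

# Transform over the two spatial axes only; the colour axis is left alone
spectra = np.fft.fft2(rgb, axes=(0, 1))

# Each colour plane can be inverted independently as well
restored = np.real(np.fft.ifft2(spectra, axes=(0, 1)))
```

Here `spectra[..., 0]` is the spectrum of the red channel, and so on.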
Solution 2:
The Fourier transform - any Fourier transform - splits a signal into "frequencies", and measures the amplitude and alignment (phase) of each frequency.
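A tiny 1-D sketch of what "amplitude and alignment" means, using a made-up test signal: a single cosine with a known amplitude and phase shift, from which the FFT recovers both.

```python
import numpy as np

n = 64
t = np.arange(n)
# 4 cycles across the signal, amplitude 2, shifted by a phase of pi/3
signal = 2.0 * np.cos(2 * np.pi * 4 * t / n + np.pi / 3)

spectrum = np.fft.rfft(signal)
k = np.argmax(np.abs(spectrum))           # index of the strongest frequency
amplitude = 2 * np.abs(spectrum[k]) / n   # rescale the FFT bin to the cosine's amplitude
phase = np.angle(spectrum[k])             # the "alignment" of that frequency
```

The magnitude of each complex coefficient is the amplitude, and its angle is the phase.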
In the case of sound, these are audible frequencies that you can hear. But in the case of an image, things are less obvious. The mathematics is still the same, but it's harder to wrap your brain around.
The Fourier transform measures "spatial frequencies" in the image. If you imagine horizontal or vertical bars of colour repeating at different spacings, these are the "frequencies" that the Fourier transform is measuring. Much like a sound signal, an image with long, rolling, smooth colour transitions is dominated by low frequencies, whereas one with abrupt changes in colour also contains a lot of high-frequency content.
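Here is a rough sketch of that claim. It compares two synthetic images that repeat with the same period, one varying smoothly and one with a hard edge, and measures how much of the spectral energy lies above a cutoff. The helper name `high_frequency_fraction` and the cutoff of 0.1 cycles per pixel are arbitrary choices of mine.

```python
import numpy as np

def high_frequency_fraction(image, cutoff=0.1):
    """Fraction of the (non-constant) spectral energy above `cutoff` cycles/pixel."""
    spectrum = np.fft.fft2(image - image.mean())   # drop the constant term
    power = np.abs(spectrum) ** 2
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    high = np.sqrt(fx ** 2 + fy ** 2) > cutoff
    return power[high].sum() / power.sum()

x = np.arange(64)
# One slow, smooth brightness cycle across the image
smooth = np.tile(0.5 + 0.5 * np.sin(2 * np.pi * x / 64), (64, 1))
# Same period, but an abrupt jump from dark to bright
sharp = np.tile((x // 32) % 2, (64, 1)).astype(float)

smooth_fraction = high_frequency_fraction(smooth)
sharp_fraction = high_frequency_fraction(sharp)
```

The smooth image has essentially no energy above the cutoff, while the hard-edged one has a noticeable share there, even though both repeat at the same rate.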
The Fourier transform thus has a couple of uses in image processing. I can think of two:
First, when you change any signal, its spectrum obviously changes as well. When you take a photograph and the camera moves, you get a blurry image. It's not at all obvious how you could try to "unblur" this image. But, when you talk about the spectrum of the image, a blur is simply a low-pass filtering operation. In principle, if you undo that filtering, you could unblur the image.
(Obviously, that's the theory. In practice, it's not that simple...)
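A sketch of both halves of that statement, under some strong simplifying assumptions: the blur is a known 5-pixel horizontal average, the image wraps around at its borders, and the noise level and the regularisation constant 0.01 are arbitrary picks. With no noise, dividing the spectrum by the blur's frequency response undoes the blur. With a little noise, the same division amplifies it badly, and a regularised inverse (in the spirit of a Wiener filter) does better.

```python
import numpy as np

rng = np.random.default_rng(0)

# Test image: a bright square on a dark background
image = np.zeros((64, 64))
image[16:48, 16:48] = 1.0

# Horizontal "camera motion": average over 5 adjacent pixels (circular boundary)
kernel = np.zeros_like(image)
kernel[0, :5] = 1 / 5
H = np.fft.fft2(kernel)                   # frequency response of the blur

def filter_image(img, response):
    """Multiply the image's spectrum by `response` and transform back."""
    return np.real(np.fft.ifft2(np.fft.fft2(img) * response))

def rms_error(estimate):
    return np.sqrt(np.mean((estimate - image) ** 2))

blurred = filter_image(image, H)
noisy = blurred + rng.normal(0.0, 0.05, image.shape)

naive_inverse = 1 / H                                     # undo the filter exactly
regularised_inverse = np.conj(H) / (np.abs(H) ** 2 + 0.01)  # damp frequencies where H is tiny

error_clean = rms_error(filter_image(blurred, naive_inverse))
error_naive = rms_error(filter_image(noisy, naive_inverse))
error_regularised = rms_error(filter_image(noisy, regularised_inverse))
```

The trouble comes from frequencies the blur has almost wiped out: dividing by a tiny number there blows up whatever noise is present, which is one reason real deblurring is hard.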
Lots of other interesting things you could do to an image are quite complicated in terms of what happens to the individual pixels, but very simple in terms of how the spectrum changes. So using the Fourier transform to get you a spectrum is an obvious step.
Alternatively, the Fourier transform is useful for image compression. If you save the individual pixel colours less accurately, the image just looks like some God-awful computer graphics from the 1980s. But if you save the spectrum less accurately, the picture just gets slightly blurry, which is far less annoying.
By doing a sophisticated analysis of the way the human brain processes image data, you can estimate which frequencies in a given image are "the most important", and store those with high precision, while throwing away any "less important" frequencies. This is how JPEG and friends work.
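As a toy illustration of "throw away the less important frequencies", here is a sketch that keeps only the largest Fourier coefficients and discards the rest. This is not how JPEG actually does it (JPEG works on small blocks with a cosine transform and a perceptual quantisation step); "important" here just means "large in magnitude". The function name `compress` and the parameter `keep_fraction` are invented for the example.

```python
import numpy as np

def compress(image, keep_fraction):
    """Keep roughly the `keep_fraction` largest-magnitude Fourier
    coefficients, zero the others, and reconstruct the image."""
    spectrum = np.fft.fft2(image)
    magnitudes = np.abs(spectrum)
    threshold = np.quantile(magnitudes, 1 - keep_fraction)
    kept = np.where(magnitudes >= threshold, spectrum, 0)
    # Take the real part: any imaginary residue is discarded
    return np.real(np.fft.ifft2(kept)), np.count_nonzero(kept)

# A smooth blob as a stand-in for a photograph
y, x = np.mgrid[0:64, 0:64]
image = np.exp(-((x - 32) ** 2 + (y - 24) ** 2) / 200.0)

approx, n_kept = compress(image, keep_fraction=0.05)
relative_error = np.linalg.norm(approx - image) / np.linalg.norm(image)
```

For a smooth image like this one, a small fraction of the coefficients is enough to reconstruct it with a small relative error, which is the property that frequency-based compression exploits.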