It appears to be defined as NUMBER_OF_FALSE_POSITIVES / NUMBER_OF_WINDOWS, where each window is a 64×128 detection window slid across the image. Notice that the last paragraph of section 4 states:

> ... In a multiscale detector it corresponds to a raw error rate of about 0.8 false positives per 640×480 image tested.


I had the same confusion. The authors state that they are using DET curves, and if you look at standard examples of DET curves you will see that the x-axis is the false positive rate. Since every test window drawn from the person-free images is a negative sample, FPPW is exactly that false positive rate.

Hence FPPW = NUMBER_OF_FALSE_POSITIVES / NUMBER_OF_NEGATIVE_SAMPLES
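A minimal sketch of that definition, using made-up classifier scores (the score distribution and the threshold here are illustrative assumptions, not values from the paper): every scanned window on person-free images is a negative sample, so FPPW is just the fraction of those windows the classifier accepts.

```python
import numpy as np

# Hypothetical classifier margins for 100,000 negative (non-person) windows.
# In the Dalal-Triggs evaluation, every window scanned on a person-free
# image counts as one negative sample.
rng = np.random.default_rng(0)
scores = rng.normal(loc=-1.0, scale=1.0, size=100_000)

threshold = 0.0  # detector's decision threshold (assumed for illustration)
false_positives = int(np.count_nonzero(scores > threshold))

# FPPW = false positives / number of windows tested
#      = false positives / number of negative samples
fppw = false_positives / scores.size
print(fppw)
```

Sweeping `threshold` and recording (miss rate, FPPW) pairs is what traces out the DET curve.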

DET curve


They have a window which they slide across the image and evaluate whether it shows a human or not. FPPW measures how often the detector window fires on something other than a human. It describes the quality of the classification in a way that is independent of image size or the number of people in a particular image.

So basically they count how often their dumb computer says "yes, that's a human" when they show it some rock or ice cream.
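This per-window rate also explains the "0.8 false positives per 640×480 image" figure quoted above. A back-of-the-envelope sketch, assuming the paper's 64×128 window and an 8-pixel stride (single scale only; a multiscale detector scans several times more windows, and the FPPW operating point below is an illustrative assumption):

```python
# Count the 64x128 windows in a 640x480 image at one scale, stride 8,
# then convert a per-window false positive rate into per-image false positives.
img_w, img_h = 640, 480
win_w, win_h = 64, 128
stride = 8

windows_x = (img_w - win_w) // stride + 1  # horizontal window positions
windows_y = (img_h - win_h) // stride + 1  # vertical window positions
n_windows = windows_x * windows_y          # single-scale window count

fppw = 1e-4  # example per-window false positive rate (assumed)
fp_per_image = fppw * n_windows
print(n_windows, fp_per_image)  # 3285 windows, ~0.33 false positives
```

With the extra windows that the multiscale pyramid adds, a per-window rate of around 10⁻⁴ lands in the same ballpark as the paper's ~0.8 false positives per image.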