Why use functors over functions?

At least four good reasons:

Separation of concerns

In your particular example, the functor-based approach has the advantage of separating the iteration logic from the average-calculation logic. So you can use your functor in other situations (think about all the other algorithms in the STL), and you can use other functors with for_each.

Parameterisation

You can parameterise a functor more easily. So for instance, you could have a CalculateAverageOfPowers functor that takes the average of the squares, or cubes, etc. of your data, which would be written thus:

class CalculateAverageOfPowers
{
public:
    CalculateAverageOfPowers(float p) : acc(0), n(0), p(p) {}
    void operator() (float x) { acc += pow(x, p); n++; }
    float getAverage() const { return acc / n; }
private:
    float acc;
    int   n;
    float p;
};

You could of course do the same thing with a traditional function, but then makes it difficult to use with function pointers, because it has a different prototype to CalculateAverage.

Statefulness

And as functors can be stateful, you could do something like this:

CalculateAverage avg;
avg = std::for_each(dataA.begin(), dataA.end(), avg);
avg = std::for_each(dataB.begin(), dataB.end(), avg);
avg = std::for_each(dataC.begin(), dataC.end(), avg);

to average across a number of different data-sets.

Note that almost all STL algorithms/containers that accept functors require them to be "pure" predicates, i.e. have no observable change in state over time. for_each is a special case in this regard (see e.g. Effective Standard C++ Library - for_each vs. transform).

Performance

Functors can often be inlined by the compiler (the STL is a bunch of templates, after all). Whilst the same is theoretically true of functions, compilers typically won't inline through a function pointer. The canonical example is to compare std::sort vs qsort; the STL version is often 5-10x faster, assuming the comparison predicate itself is simple.

Summary

Of course, it's possible to emulate the first three with traditional functions and pointers, but it becomes a great deal simpler with functors.


Advantages of Functors:

  • Unlike Functions Functor can have state.
  • Functor fits into OOP paradigm as compared to functions.
  • Functor often may be inlined unlike Function pointers
  • Functor doesn't require vtable and runtime dispatching, and hence more efficient in most cases.

std::for_each is easily the most capricious and least useful of the standard algorithms. It's just a nice wrapper for a loop. However, even it has advantages.

Consider what your first version of CalculateAverage must look like. It will have a loop over the iterators, and then do stuff with each element. What happens if you write that loop incorrectly? Oops; there's a compiler or runtime error. The second version can never have such errors. Yes, it's not a lot of code, but why do we have to write loops so often? Why not just once?

Now, consider real algorithms; the ones that actually do work. Do you want to write std::sort? Or std::find? Or std::nth_element? Do you even know how to implement it in the most efficient way possible? How many times do you want to implement these complex algorithms?

As for ease of reading, that's in the eyes of the beholder. As I said, std::for_each is hardly the first choice for algorithms (especially with C++0x's range-based for syntax). But if you're talking about real algorithms, they're very readable; std::sort sorts a list. Some of the more obscure ones like std::nth_element won't be as familiar, but you can always look it up in your handy C++ reference.

And even std::for_each is perfectly readable once you use Lambda's in C++0x.


•Unlike Functions Functor can have state.

This is very interesting because std::binary_function, std::less and std::equal_to has a template for an operator() that is const. But what if you wanted to print a debug message with the current call count for that object, how would you do it?

Here is template for std::equal_to:

struct equal_to : public binary_function<_Tp, _Tp, bool>
{
  bool
  operator()(const _Tp& __x, const _Tp& __y) const
  { return __x == __y; }
};

I can think of 3 ways to allow the operator() to be const, and yet change a member variable. But what is the best way? Take this example:

#include <iostream>
#include <string>
#include <algorithm>
#include <functional>
#include <cassert>  // assert() MACRO

// functor for comparing two integer's, the quotient when integer division by 10.
// So 50..59 are same, and 60..69 are same.
// Used by std::sort()

struct lessThanByTen: public std::less<int>
{
private:
    // data members
    int count;  // nr of times operator() was called

public:
    // default CTOR sets count to 0
    lessThanByTen() :
        count(0)
    {
    }


    // @override the bool operator() in std::less<int> which simply compares two integers
    bool operator() ( const int& arg1, const int& arg2) const
    {
        // this won't compile, because a const method cannot change a member variable (count)
//      ++count;


        // Solution 1. this trick allows the const method to change a member variable
        ++(*(int*)&count);

        // Solution 2. this trick also fools the compilers, but is a lot uglier to decipher
        ++(*(const_cast<int*>(&count)));

        // Solution 3. a third way to do same thing:
        {
        // first, stack copy gets bumped count member variable
        int incCount = count+1;

        const int *iptr = &count;

        // this is now the same as ++count
        *(const_cast<int*>(iptr)) = incCount;
        }

        std::cout << "DEBUG: operator() called " << count << " times.\n";

        return (arg1/10) < (arg2/10);
    }
};

void test1();
void printArray( const std::string msg, const int nums[], const size_t ASIZE);

int main()
{
    test1();
    return 0;
}

void test1()
{
    // unsorted numbers
    int inums[] = {33, 20, 10, 21, 30, 31, 32, 22, };

    printArray( "BEFORE SORT", inums, 8 );

    // sort by quotient of integer division by 10
    std::sort( inums, inums+8, lessThanByTen() );

    printArray( "AFTER  SORT", inums, 8 );

}

//! @param msg can be "this is a const string" or a std::string because of implicit string(const char *) conversion.
//! print "msg: 1,2,3,...N", where 1..8 are numbers in nums[] array

void printArray( const std::string msg, const int nums[], const size_t ASIZE)
{
    std::cout << msg << ": ";
    for (size_t inx = 0; inx < ASIZE; ++inx)
    {
        if (inx > 0)
            std::cout << ",";
        std::cout << nums[inx];
    }
    std::cout << "\n";
}

Because all 3 solutions are compiled in, it increments count by 3. Here's the output:

gcc -g -c Main9.cpp
gcc -g Main9.o -o Main9 -lstdc++
./Main9
BEFORE SORT: 33,20,10,21,30,31,32,22
DEBUG: operator() called 3 times.
DEBUG: operator() called 6 times.
DEBUG: operator() called 9 times.
DEBUG: operator() called 12 times.
DEBUG: operator() called 15 times.
DEBUG: operator() called 12 times.
DEBUG: operator() called 15 times.
DEBUG: operator() called 15 times.
DEBUG: operator() called 18 times.
DEBUG: operator() called 18 times.
DEBUG: operator() called 21 times.
DEBUG: operator() called 21 times.
DEBUG: operator() called 24 times.
DEBUG: operator() called 27 times.
DEBUG: operator() called 30 times.
DEBUG: operator() called 33 times.
DEBUG: operator() called 36 times.
AFTER  SORT: 10,20,21,22,33,30,31,32