How do the stream manipulators work?

It is well known that the user can define stream manipulators like this:

ostream& tab(ostream & output)
{
    return output<< '\t';
} 

And this can be used in main() like this:

cout<<'a'<<tab<<'b'<<'c'<<endl;

Please explain how this all works. If operator<< takes as its second parameter a pointer to a function that takes and returns ostream &, then please explain why that is necessary. What would be wrong if the function did not take and return ostream & but used void instead?

Also, why do the “dec” and “hex” manipulators stay in effect until I switch between them, while a user-defined manipulator has to be used again every time for it to take effect?


The standard defines the following operator<< overload in the basic_ostream class template:

basic_ostream<charT,traits>& operator<<(
    basic_ostream<charT,traits>& (*pf) (basic_ostream<charT,traits>&) );

Effects: None. Does not behave as a formatted output function (as described in 27.6.2.5.1).

Returns: pf(*this).

The parameter is a pointer to a function taking and returning a reference to a std::ostream.

This means that you can "stream" a function with this signature to an ostream object and it has the effect of calling that function on the stream. If you use the name of a function in an expression then it is (usually) converted to a pointer to that function.
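As a minimal sketch of that equivalence, the snippet below reuses the tab function from the question and writes to a std::ostringstream so the result can be inspected. The helper names streamed and called_directly are made up for this illustration.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// The manipulator from the question.
std::ostream& tab(std::ostream& output) { return output << '\t'; }

// Streams tab the usual way; the function-pointer overload calls tab(os).
std::string streamed() {
    std::ostringstream os;
    os << 'a' << tab << 'b';
    return os.str();
}

// Spells out what that overload does: call the function on the stream.
std::string called_directly() {
    std::ostringstream os;
    os << 'a';
    tab(os);   // same effect as os << tab
    os << 'b';
    return os.str();
}
```

Both helpers produce the same three characters: 'a', a tab, and 'b'.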

std::hex is an std::ios_base manipulator defined as follows.

   ios_base& hex(ios_base& str);

Effects: Calls str.setf(ios_base::hex, ios_base::basefield).

Returns: str.

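This means that streaming hex to an ostream sets the output base formatting flags so that numbers are output in hexadecimal. The manipulator doesn't output anything itself. Because the flag is stored in the stream, it stays in effect until something such as dec changes it. Your tab, by contrast, only writes a character and stores nothing in the stream, so it has to be streamed every time you want a tab.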
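A small sketch of the difference between a flag-setting manipulator and one that only writes a character. It uses a std::ostringstream so the output can be checked; the helper names sticky_hex and tab_once are hypothetical.

```cpp
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// The manipulator from the question: writes one character, stores nothing.
std::ostream& tab(std::ostream& output) { return output << '\t'; }

// hex sets a flag inside the stream, so it still applies to the second 255.
std::string sticky_hex() {
    std::ostringstream os;
    os << std::hex << 255 << ' ' << 255;
    os << std::dec << ' ' << 255;   // switch the base back to decimal
    return os.str();                // "ff ff 255"
}

// tab only affects the spot where it is streamed.
std::string tab_once() {
    std::ostringstream os;
    os << 1 << tab << 2 << 3;       // no tab between 2 and 3
    return os.str();                // "1\t23"
}
```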


There is nothing wrong with it except there is no overloaded << operator defined for it. The existing overloads for << are expecting a manipulator with the signature ostream& (*fp)(ostream&).

If you gave it a manipulator with the type ostream& (*fp)() you would get a compiler error since it does not have a definition for operator<<(ostream&, ostream& (*fp)()). If you wanted this functionality you would have to overload the << operator to accept manipulators of this type.

You would have to write a definition for this:
ostream& ostream::operator<<(ostream& (*m)())

Keep in mind that nothing magical is happening here. The stream libraries rely heavily on standard C++ features: operator overloading, classes, and references.

Now that you know how you can create the functionality you described, here's why we don't:

Without passing a reference to the stream we are trying to manipulate, we can't make modifications to the stream connected to the final device (cin, cout, cerr, fstream, etc). The function (modifiers are all just functions with fancy names) would either have to return a new ostream that had nothing to do with the one to the left of the << operator, or figure out through some very ugly mechanism which ostream it should connect with. Otherwise everything to the right of the modifier won't make it to the final device, but will instead be sent to whatever ostream the function/modifier returned.

Think of streams like this

cout << "something here" << tab << "something else"<< endl;

really means

(((cout << "something here") << tab) << "something else") << endl;

where each set of parentheses does something to cout (write, modify etc) and then returns cout so the next set of parentheses can work on it.
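To see that every step hands back the very same stream, one can compare addresses. This is only an illustrative sketch: it uses a std::ostringstream in place of cout, the tab function from the question, and a hypothetical helper name chain_returns_same_stream.

```cpp
#include <ostream>
#include <sstream>

// The manipulator from the question.
std::ostream& tab(std::ostream& output) { return output << '\t'; }

// Checks that each << in the chain returns a reference to the original stream.
bool chain_returns_same_stream() {
    std::ostringstream target;
    std::ostream& original = target;

    std::ostream& after_text = (original << "something here");
    std::ostream& after_tab  = (after_text << tab);
    std::ostream& after_more = (after_tab << "something else");

    return &after_text == &original
        && &after_tab  == &original
        && &after_more == &original;
}
```

The function returns true, which is why the next << in a chain always lands on the stream the chain started with.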

If your tab modifier/function did not take a reference to an ostream, it would have to somehow guess which ostream was to the left of the << operator in order to perform its task. Were you working with cout, cerr, some file stream...? The internals of the function will never know unless they are handed that information somehow, and the simplest way to hand it over is a reference to the stream.
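Here is a rough sketch of what goes wrong, assuming you did add an overload for parameterless manipulators. Everything in it is hypothetical: the free operator<< overload is not part of the standard library, and guessed, blind_tab and write_with_blind_tab are names invented for the illustration. A second std::ostringstream stands in for whatever stream the function would have to pick on its own.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical: the only stream a parameterless function can reach is one it
// picks itself. `guessed` stands in for that choice.
std::ostringstream guessed;

// Hypothetical manipulator that is not told which stream to use.
std::ostream& blind_tab() { return guessed << '\t'; }

// Hypothetical extra overload; the standard library does not provide it.
std::ostream& operator<<(std::ostream& /*left*/, std::ostream& (*m)()) {
    return m();   // the stream on the left is never passed on
}

// Returns what actually reached the stream the caller intended to write to.
std::string write_with_blind_tab() {
    std::ostringstream intended;
    intended << 'a' << blind_tab << 'b';
    return intended.str();
}
```

After one call, the intended stream holds only "a", while the tab and the 'b' have ended up in guessed.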

Now to really drive the point home, let's look at what endl really is and which overloaded version of the << operator we are using:

This operator looks like this:

  ostream& ostream::operator<<(ostream& (*m)(ostream&)) 
  {  
      return (*m)(*this);
  }

endl looks roughly like this:

  ostream& endl(ostream& os)      
  {  
      os << '\n'; 
      os.flush();     
      return os;
  }

The purpose of endl is to add a newline and flush the stream, making sure all the contents of the stream’s internal buffer have been written to the device. In order to do this, it first needs to write a '\n' to this stream. It then needs to tell the stream to flush. The only way for endl to know which stream to write to and flush is for the operator to pass that information to the endl function when it calls it. It'd be like me telling you to wash my car but never telling you which car is mine in the full parking lot. You'd never be able to get your job done. You need me to either hand you my car, or I can wash it myself.
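The flush can be made visible with a stream buffer that counts how often it is asked to synchronize. This is a sketch under assumptions: CountingBuf, Result and write_line are hypothetical names, and the buffer is a std::stringbuf rather than a real device.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical buffer that counts flush requests (flush() ends up in sync()).
class CountingBuf : public std::stringbuf {
public:
    int syncs = 0;
protected:
    int sync() override {
        ++syncs;
        return std::stringbuf::sync();
    }
};

struct Result {
    std::string text;   // characters that reached the buffer
    int flushes;        // number of sync() calls seen
};

// Writes "hi" followed by either std::endl or a plain '\n'.
Result write_line(bool use_endl) {
    CountingBuf buf;
    std::ostream os(&buf);
    if (use_endl) {
        os << "hi" << std::endl;   // newline plus flush
    } else {
        os << "hi" << '\n';        // newline only
    }
    return {buf.str(), buf.syncs};
}
```

Both variants write the same text, but only the std::endl variant triggers a flush on the stream it was handed.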

I hope that clears things up.

PS - If you do happen to accidentally find my car, please wash it.


Normally the stream manipulator sets some flags (or other settings) on the stream object, so that next time it is used, it will act according to the flags. The manipulator therefore returns the same object it's passed. The operator<< overload that called the manipulator already has this object, of course, so as you noticed, the return value isn't strictly needed for that case. I think this covers all the standard manipulators - they all return their input.

However, with the return value, the framework is flexible enough that a custom stream manipulator could return a different object, presumably a wrapper for the object it's given. This other object would then be returned from cout << 'a' << tab, and could do something that the built-in ostream formatting settings don't support.

Not sure how you'd arrange for this other object to be freed, though, so I don't know how practical this is. It might have to be something peculiar, like a proxy object that's managed by the ostream itself. Then the manipulator would only work for custom stream classes that actively support it, which isn't usually the point of manipulators.