Pure virtual functions may not have an inline definition. Why?
Pure virtual functions are those member functions that are virtual and have the pure-specifier ( = 0;
)
Clause 10.4 paragraph 2 of C++03 tells us what an abstract class is and, as a side note, the following:
[Note: a function declaration cannot provide both a pure-specifier and a definition —end note] [Example:
struct C {
virtual void f() = 0 { }; // ill-formed
};
—end example]
For those who are not very familiar with the issue, please note that pure virtual functions can have definitions but the above-mentioned clause forbids such definitions to appear inline (lexically in-class). (For uses of defining pure virtual functions you may see, for example, this GotW)
Now for all other kinds and types of functions it is allowed to provide an in-class definition, and this restriction seems at first glance absolutely artificial and inexplicable. Come to think of it, it seems such on second and subsequent glances :) But I believe the restriction wouldn't be there if there weren't a specific reason for that.
My question is: does anybody know those specific reasons? Good guesses are also welcome.
Notes:
- MSVC does allow PVF's to have inline definitions. So don't get surprised :)
- the word
inline
in this question does not refer to the inline keyword. It is supposed to mean lexically in-class
Solution 1:
In the SO thread "Why is a pure virtual function initialized by 0?" Jerry Coffin provided this quote from Bjarne Stroustrup’s The Design & Evolution of C++, section §13.2.3, where I've added some emphasis of the part I think is relevant:
The curious
=0
syntax was chosen over the obvious alternative of introducing a new keyword pure or abstract because at the time I saw no chance of getting a new keyword accepted. Had I suggested pure, Release 2.0 would have shipped without abstract classes. Given a choice between a nicer syntax and abstract classes, I chose abstract classes. Rather than risking delay and incurring the certain fights over pure, I used the tradition C and C++ convention of using 0 to represent "not there." The=0
syntax fits with my view that a function body is the initializer for a function and also with the (simplistic, but usually adequate) view of the set of virtual functions being implemented as a vector of function pointers. [ … ]
So, when choosing the syntax Bjarne was thinking of a function body as a kind of initializer part of the declarator, and =0
as an alternate form of initializer, one that indicated “no body” (or in his words, “not there”).
It stands to reason that one cannot both indicate “not there” and have a body – in that conceptual picture.
Or, still in that conceptual picture, having two initializers.
Now, that's as far as my telepathic powers, google-foo and soft-reasoning goes. I surmise that nobody's been Interested Enough™ to formulate a proposal to the committee about having this purely syntactical restriction lifted, and following up with all the work that that entails. Thus it's still that way.
Solution 2:
You shouldn't have so much faith in the standardization committee. Not everything has a deep reason to explain it. Something are so just because at first nobody thought otherwise and after nobody thought that changing it is important enough (I think it is the case here); for things old enough it could even be an artifact of the first implementation. Some are the result of evolution -- there was a deep reason at a time, but the reason was removed and the initial decision wasn't reconsidered again (it could be also the case here, where the initial decision was because any definition of the pure function was forbidden). Some are the result of negotiation between different POV and the result lacks coherence but this lack was deemed necessary to reach to consensus.
Solution 3:
Good guesses... well, considering the situation:
- it is legal to declare the function inline and provide an explicitly inline body (outside the class), so there's clearly no objection to the only practical implication of being declared inside the class.
- I see no potential ambiguities or conflicts introduced in the grammar, so no logical reason for the exclusion of function definitions in situ.
My guess: the use for bodies for pure virtual functions was realised after the = 0 | { ... }
grammar was formulated, and the grammar simply wasn't revised. It's worth considering that there are a lot of proposals for language changes / enhancements - including those to make things like this more logical and consistent - but the number that are picked up by someone and written up as formal proposals is much smaller, and the number of those the Committee has time to consider, and believes the compiler-vendors will be prepared to implement, is much smaller again. Things like this need a champion, and perhaps you're the first person to see an issue in it. To get a feel for this process, check out http://www2.research.att.com/~bs/evol-issues.html.
Solution 4:
Good guesses are welcome you say?
I think the = 0
at the declaration comes from having the implementation in mind. Most likely this definition means, that you get a NULL
entry in the RTTI's vtbl
of the class information -- the location where at runtime addresses of the member functions of a class are stored.
But actually, when put a definition of the function in your *.cpp
file, you introduce a name into the object file for the linker: An address in the *.o
file where to find a specific function.
The basic linker then does need to know about C++ anymore. It can just link together, even though you declared it as = 0
.
I think I read that it is possible what you described, although I forgot the behaviour :-)...