What is the branch in the destructor reported by gcov?
When I use gcov to measure test coverage of C++ code it reports branches in destructors.
struct Foo
{
virtual ~Foo()
{
}
};
int main (int argc, char* argv[])
{
Foo f;
}
When I run gcov with branch probabilities enabled (-b) I get the following output.
$ gcov /home/epronk/src/lcov-1.9/example/example.gcda -o /home/epronk/src/lcov-1.9/example -b
File 'example.cpp'
Lines executed:100.00% of 6
Branches executed:100.00% of 2
Taken at least once:50.00% of 2
Calls executed:40.00% of 5
example.cpp:creating 'example.cpp.gcov'
The part that bothers me is the "Taken at least once:50.00% of 2".
The generated .gcov file gives more detail.
$ cat example.cpp.gcov | c++filt
-: 0:Source:example.cpp
-: 0:Graph:/home/epronk/src/lcov-1.9/example/example.gcno
-: 0:Data:/home/epronk/src/lcov-1.9/example/example.gcda
-: 0:Runs:1
-: 0:Programs:1
-: 1:struct Foo
function Foo::Foo() called 1 returned 100% blocks executed 100%
1: 2:{
function Foo::~Foo() called 1 returned 100% blocks executed 75%
function Foo::~Foo() called 0 returned 0% blocks executed 0%
1: 3: virtual ~Foo()
1: 4: {
1: 5: }
branch 0 taken 0% (fallthrough)
branch 1 taken 100%
call 2 never executed
call 3 never executed
call 4 never executed
-: 6:};
-: 7:
function main called 1 returned 100% blocks executed 100%
1: 8:int main (int argc, char* argv[])
-: 9:{
1: 10: Foo f;
call 0 returned 100%
call 1 returned 100%
-: 11:}
Notice the line "branch 0 taken 0% (fallthrough)".
What causes this branch and what do I need to do in the code to get a 100% here?
- g++ (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2
- gcov (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2
Solution 1:
In a typical implementation the destructor usually has two branches: one for non-dynamic object destruction, another for dynamic object destruction. The selection of a specific branch is performed through a hidden boolean parameter passed to the destructor by the caller. It is usually passed through a register as either 0 or 1.
I would guess that, since in your case the destruction is for a non-dynamic object, the dynamic branch is not taken. Try adding a new
-ed and then delete
-ed object of class Foo
and the second branch should become taken as well.
The reason this branching is necessary is rooted in the specification of C++ language. When some class defines its own operator delete
, the selection of a specific operator delete
to call is done as if it was looked up from inside the class destructor. The end result of that is that for classes with virtual destructor operator delete
behaves as if it were a virtual function (despite formally being a static member of the class).
Many compilers implement this behavior literally: the proper operator delete
is called directly from inside the destructor implementation. Of course, operator delete
should only be called when destroying dynamically allocated objects (not for local or static objects). To achieve this, the call to operator delete
is placed into a branch controlled by the hidden parameter mentioned above.
In your example things look pretty trivial. I'd expect the optimizer to remove all unnecessary branching. However, it appears that somehow it managed to survive optimization.
Here's a little bit of additional research. Consider this code
#include <stdio.h>
struct A {
void operator delete(void *) { scanf("11"); }
virtual ~A() { printf("22"); }
};
struct B : A {
void operator delete(void *) { scanf("33"); }
virtual ~B() { printf("44"); }
};
int main() {
A *a = new B;
delete a;
}
This is how the code for the destructor of A
will look like when compiler with GCC 4.3.4 under default optimization settings
__ZN1AD2Ev: ; destructor A::~A
LFB8:
pushl %ebp
LCFI8:
movl %esp, %ebp
LCFI9:
subl $8, %esp
LCFI10:
movl 8(%ebp), %eax
movl $__ZTV1A+8, (%eax)
movl $LC1, (%esp) ; LC1 is "22"
call _printf
movl $0, %eax ; <------ Note this
testb %al, %al ; <------
je L10 ; <------
movl 8(%ebp), %eax ; <------
movl %eax, (%esp) ; <------
call __ZN1AdlEPv ; <------ calling `A::operator delete`
L10:
leave
ret
(The destructor of B
is a bit more complicated, which is why I use A
here as an example. But as far as the branching in question is concerned, destructor of B
does it in the same way).
However, right after this destructor the generated code contains another version of the destructor for the very same class A
, which looks exactly the same, except the movl $0, %eax
instruction is replaced with movl $1, %eax
instruction.
__ZN1AD0Ev: ; another destructor A::~A
LFB10:
pushl %ebp
LCFI13:
movl %esp, %ebp
LCFI14:
subl $8, %esp
LCFI15:
movl 8(%ebp), %eax
movl $__ZTV1A+8, (%eax)
movl $LC1, (%esp) ; LC1 is "22"
call _printf
movl $1, %eax ; <------ See the difference?
testb %al, %al ; <------
je L14 ; <------
movl 8(%ebp), %eax ; <------
movl %eax, (%esp) ; <------
call __ZN1AdlEPv ; <------ calling `A::operator delete`
L14:
leave
ret
Note the code blocks I labeled with arrows. This is exactly what I was talking about. Register al
serves as that hidden parameter. This "pseudo-branch" is supposed to either invoke or skip the call to operator delete
in accordance with the value of al
. However, in the first version of the destructor this parameter is hardcoded into the body as always 0
, while in the second one it is hardcoded as always 1
.
Class B
also has two versions of the destructor generated for it. So we end up with 4 distinctive destructors in the compiled program: two destructors for each class.
I can guess that at the beginning the compiler internally thought in terms of a single "parameterized" destructor (which works exactly as I described above the break). And then it decided to split the parameterized destructor into two independent non-parameterized versions: one for the hardcoded parameter value of 0
(non-dynamic destructor) and another for the hardcoded parameter value of 1
(dynamic destructor). In non-optimized mode it does that literally, by assigning the actual parameter value inside the body of the function and leaving all the branching totally intact. This is acceptable in non-optimized code, I guess. And that's exactly what you are dealing with.
In other words, the answer to your question is: It is impossible to make the compiler to take all the branches in this case. There's no way to achieve 100% coverage. Some of these branches are "dead". It is just that the approach to generating non-optimized code is rather "lazy" and "loose" in this version of GCC.
There might be a way to prevent the split in non-optimized mode, I think. I just haven't found it yet. Or, quite possibly, it can't be done. Older versions of GCC used true parameterized destructors. Maybe in this version of GCC they decided to switch to two-destructor approach and while doing it they "reused" the existing code-generator in such a quick-and-dirty way, expecting the optimizer to clean out the useless branches.
When you are compiling with optimization enabled GCC will not allow itself such luxuries as useless branching in the final code. You should probably try to analyze optimized code. Non-optimized GCC-generated code has lots of meaningless inaccessible branches like this one.
Solution 2:
In the destructor, GCC generated a conditial jump for a condition which can never be true (%al is not zero, since it was just assigned a 1):
[...]
29: b8 01 00 00 00 mov $0x1,%eax
2e: 84 c0 test %al,%al
30: 74 30 je 62 <_ZN3FooD0Ev+0x62>
[...]