Big difference (x9) in the execution time between almost identical code in C and C++
Both programs do exactly the same thing. They use the same exact algorithm, and given its low complexity, their performance is mostly bound to efficiency of the input and output handling.
scanning the input with scanf("%d", &fact_num);
on one side and cin >> fact_num;
on the other does not seem very costly either way. In fact it should be less costly in C++ since the type of conversion is known at compile time and the correct parser can be invoked directly by the C++ compiler. The same holds for the output. You even make a point of writing a separate call for printf("%s","\n");
, but the C compiler is good enough to compile this as a call to putchar('\n');
.
So looking at the complexity of both the I/O and computation, the C++ version should be faster than the C version.
Completely disabling the buffering of stdout
slows the C implementation to something even slower than the C++ version. Another test by AlexLop with an fflush(stdout);
after the last printf
yields similar performance as the C++ version. It is not as slow as completely disabling buffering because output is written to the system in small chunks instead of one byte at a time.
This seems to point to a specific behavior in your C++ library: I suspect your system's implementation of cin
and cout
flushes the output to cout
when input is requested from cin
. Some C libraries do this as well, but usually only when reading/writing to and from the terminal. The benchmarking done by the www.spoj.com site probably redirects input and output to and from files.
AlexLop did another test: reading all the inputs at once in a vector and subsequently computing and writing all output helps understanding why the C++ version is so much slower. It increases performance to that of the C version, this proves my point and removes suspicion on the C++ formatting code.
Another test by Blastfurnace, storing all outputs in an std::ostringstream
and flushing that in one blast at the end, does improve the C++ performance to that of the basic C version. QED.
Interlacing input from
cin
and output tocout
seems to cause very inefficient I/O handling, defeating the stream buffering scheme. reducing performance by a factor of 10.
PS: your algorithm is incorrect for fact_num >= UINT_MAX / 5
because fives *= 5
will overflow and wrap around before it becomes > fact_num
. You can correct this by making fives
an unsigned long
or an unsigned long long
if one of these types is larger than unsigned int
. Also use %u
as the scanf
format. You are lucky the guys at www.spoj.com are not too strict in their benchmarks.
EDIT: As later explained by vitaux, this behavior is indeed mandated by the C++ standard. cin
is tied to cout
by default. An input operation from cin
for which the input buffer needs refilling will cause cout
to flush pending output. In the OP's implementation, cin
seems to flush cout
systematically, which is a bit overkill and visibly inefficient.
Ilya Popov provided a simple solution for this: cin
can be untied from cout
by casting another magical spell in addition to std::ios_base::sync_with_stdio(false);
:
cin.tie(nullptr);
Also note that such forced flush also occurs when using std::endl
instead of '\n'
to produce an end of line on cout
. Changing the output line to the more C++ idiomatic and innocent looking cout << num_of_trailing_zeros << endl;
would degrade performance the same way.
Another trick to make iostream
s faster when you use both cin
and cout
is to call
cin.tie(nullptr);
By default, when you input anything from cin
, it flushes cout
. It can significantly harm performance if you do interleaved input and output. This is done for the command line interface uses, where you show some prompt and then wait for data:
std::string name;
cout << "Enter your name:";
cin >> name;
In this case you want to make sure the prompt is actually shown before you start waiting for input. With the line above you break that tie, cin
and cout
become independent.
Since C++11, one more way to achieve better performance with iostreams is to use std::getline
together with std::stoi
, like this:
std::string line;
for (int i = 0; i < n && std::getline(std::cin, line); ++i)
{
int x = std::stoi(line);
}
This way can come close to C-style in performance, or even surpass scanf
. Using getchar
and especially getchar_unlocked
together with handwritten parsing still provides better performance.
PS. I have written a post comparing several ways to input numbers in C++, useful for online judges, but it is only in Russian, sorry. The code samples and the final table, however, should be understandable.
The problem is that, quoting cppreference:
any input from std::cin, output to std::cerr, or program termination forces a call to std::cout.flush()
This is easy to test: if you replace
cin >> fact_num;
with
scanf("%d", &fact_num);
and same for cin >> num_of_inputs
but keep cout
you'll get pretty much the same performance in your C++ version (or, rather IOStream version) as in C one:
The same happens if you keep cin
but replace
cout << num_of_trailing_zeros << "\n";
with
printf("%d", num_of_trailing_zeros);
printf("%s","\n");
A simple solution is to untie cout
and cin
as mentioned by Ilya Popov:
cin.tie(nullptr);
Standard library implementations are allowed to omit the call to flush in certain cases, but not always. Here's a quote from C++14 27.7.2.1.3 (thanks to chqrlie):
Class basic_istream::sentry : First, if is.tie() is not a null pointer, the function calls is.tie()->flush() to synchronize the output sequence with any associated external C stream. Except that this call can be suppressed if the put area of is.tie() is empty. Further an implementation is allowed to defer the call to flush until a call of is.rdbuf()->underflow() occurs. If no such call occurs before the sentry object is destroyed, the call to flush may be eliminated entirely.