Why does rand() + rand() produce negative numbers?
I observed that when the rand() library function is called just once within a loop, it almost always produces positive numbers.
for (i = 0; i < 100; i++) {
    printf("%d\n", rand());
}
But when I add the results of two rand() calls, many of the generated numbers are negative.
for (i = 0; i < 100; i++) {
    printf("%d = %d\n", rand(), (rand() + rand()));
}
Can someone explain why I am seeing negative numbers in the second case?
PS: I initialize the seed before the loop with srand(time(NULL)).
rand() is defined to return an integer between 0 and RAND_MAX, so rand() + rand() could overflow. What you observe is likely a result of undefined behaviour caused by integer overflow.
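Whether that overflow can actually happen depends on the implementation. A quick check, as a minimal sketch using only the standard limits macros, is to compare RAND_MAX against INT_MAX:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* If RAND_MAX exceeds INT_MAX / 2, the sum of two rand() results
       can exceed INT_MAX, i.e. rand() + rand() can overflow. */
    printf("RAND_MAX = %d, INT_MAX = %d\n", RAND_MAX, INT_MAX);
    if (RAND_MAX > INT_MAX / 2)
        printf("rand() + rand() can overflow here\n");
    return 0;
}

On some implementations (glibc, for example) RAND_MAX equals INT_MAX, so the sum of any two positive results overflows.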
The problem is the addition. rand() returns an int value in the range 0 to RAND_MAX. So, if you add two of them, you will get up to RAND_MAX * 2. If that exceeds INT_MAX, the result of the addition overflows the valid range an int can hold. Overflow of signed values is undefined behaviour and may lead to your keyboard talking to you in foreign tongues.
As there is no gain here in adding two random results, the simple idea is to just not do it. Alternatively, you can cast each result to unsigned int before the addition, if that type can hold the sum. Or use a larger type. Note that long is not necessarily wider than int; the same applies to long long if int is at least 64 bits!
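For illustration, here is a minimal sketch of the unsigned-cast variant. It assumes UINT_MAX is at least twice RAND_MAX, which is the common case discussed below:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Both operands are converted to unsigned int before the addition, so the
       arithmetic is defined (unsigned wraps instead of overflowing). Assumes
       UINT_MAX >= 2 * RAND_MAX so no wrap actually happens. */
    unsigned int sum = (unsigned int)rand() + (unsigned int)rand();
    printf("%u\n", sum);
    return 0;
}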
Conclusion: Just avoid the addition. It does not provide more "randomness". If you need more bits, you might concatenate the values instead: sum = a + b * (RAND_MAX + 1). But that also likely requires a data type larger than int.
As your stated reason is to avoid a zero result: that cannot be guaranteed by adding the results of two rand() calls, as both can be zero. Instead, you can just increment. If RAND_MAX == INT_MAX, this cannot be done in int. However, (unsigned int)rand() + 1 will do very, very likely. Likely (not definitively), because it does require UINT_MAX > INT_MAX, which is true on all implementations I'm aware of (and that covers quite some embedded architectures, DSPs, and all desktop, mobile and server platforms of the past 30 years).
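A sketch of that increment approach, under the UINT_MAX > INT_MAX assumption just described:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Shifts the range from [0, RAND_MAX] to [1, RAND_MAX + 1]. The cast to
       unsigned int makes the + 1 safe even when RAND_MAX == INT_MAX. */
    unsigned int nonzero = (unsigned int)rand() + 1;
    printf("%u\n", nonzero);
    return 0;
}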
Warning:
Although already sprinkled in comments here, please note that adding two random values does not give a uniform distribution, but a triangular distribution, like rolling two dice: to get 12 (two dice) both dice have to show 6. For 11 there are already two possible variants: 6 + 5 or 5 + 6, and so on.
So, the addition is also bad from this aspect.
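To make the triangular shape visible, you can tally the sums of two simulated dice. This is just a small demo; rand() % 6 is itself slightly biased, but the shape shows clearly:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    int counts[13] = {0};   /* counts[2] .. counts[12] tally each possible sum */
    int i, s;

    srand((unsigned int)time(NULL));
    for (i = 0; i < 100000; i++)
        counts[(rand() % 6 + 1) + (rand() % 6 + 1)]++;

    /* Expect 7 about six times as often as 2 or 12. */
    for (s = 2; s <= 12; s++)
        printf("%2d: %d\n", s, counts[s]);
    return 0;
}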
Also note that the results rand() generates are not independent of each other, as they are produced by a pseudorandom number generator. Note also that the standard does not specify the quality or uniformity of the distribution of the calculated values.
This is an answer to a clarification of the question, made in a comment on this answer:
"the reason i was adding was to avoid '0' as the random number in my code. rand()+rand() was the quick dirty solution which readily came to my mind."
The problem was to avoid 0. There are (at least) two problems with the proposed solution. One is, as the other answers indicate, that rand() + rand() can invoke undefined behavior. Best advice is to never invoke undefined behavior. Another issue is that there's no guarantee that rand() won't produce 0 twice in a row.
The following rejects zero, avoids undefined behavior, and in the vast majority of cases will be faster than two calls to rand():
int rnum;

/* Reject zero: keep calling rand() until it returns a nonzero value. */
for (rnum = rand(); rnum == 0; rnum = rand()) {}
/* Equivalent form: do { rnum = rand(); } while (rnum == 0); */
Basically, rand() produces numbers between 0 and RAND_MAX, and 2 * RAND_MAX > INT_MAX in your case.
You can take each result modulo half the maximum value of your data type to prevent the overflow. This will of course disrupt the distribution of the random numbers, but rand is just a way to get quick random numbers.
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

int main(void)
{
    int i;

    /* Each term is reduced modulo INT_MAX / 2, so their sum stays below INT_MAX. */
    for (i = 0; i < 100; i++)
        printf(" %d : %d \n", rand(), ((rand() % (INT_MAX / 2)) + (rand() % (INT_MAX / 2))));

    /* The same idea with long arithmetic; the int results are promoted to long. */
    for (i = 0; i < 100; i++)
        printf(" %d : %ld \n", rand(), ((rand() % (LONG_MAX / 2)) + (rand() % (LONG_MAX / 2))));

    return 0;
}