Is 1.0 a valid output from std::generate_canonical?
I always thought random numbers would lie between zero and one, without 1, i.e. they are numbers from the half-open interval [0,1). The documentation on cppreference.com of std::generate_canonical confirms this.
However, when I run the following program:
#include <iostream>
#include <limits>
#include <random>

int main()
{
    std::mt19937 rng;
    std::seed_seq sequence{0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    rng.seed(sequence);
    rng.discard(12 * 629143 + 6);
    float random = std::generate_canonical<float,
        std::numeric_limits<float>::digits>(rng);
    if (random == 1.0f)
    {
        std::cout << "Bug!\n";
    }
    return 0;
}
It gives me the following output:
Bug!
i.e. it generates a perfect 1, which causes problems in my MC integration. Is that valid behavior or is there an error on my side? This gives the same output with G++ 4.7.3
g++ -std=c++11 test.c && ./a.out
and clang 3.3
clang++ -stdlib=libc++ -std=c++11 test.c && ./a.out
If this is correct behavior, how can I avoid 1?
Edit 1: G++ from git seems to suffer from the same problem. I am on
commit baf369d7a57fb4d0d5897b02549c3517bb8800fd
Date: Mon Sep 1 08:26:51 2014 +0000
and compiling with
~/temp/prefix/bin/c++ -std=c++11 -Wl,-rpath,/home/cschwan/temp/prefix/lib64 test.c && ./a.out
gives the same output; ldd yields
linux-vdso.so.1 (0x00007fff39d0d000)
libstdc++.so.6 => /home/cschwan/temp/prefix/lib64/libstdc++.so.6 (0x00007f123d785000)
libm.so.6 => /lib64/libm.so.6 (0x000000317ea00000)
libgcc_s.so.1 => /home/cschwan/temp/prefix/lib64/libgcc_s.so.1 (0x00007f123d54e000)
libc.so.6 => /lib64/libc.so.6 (0x000000317e600000)
/lib64/ld-linux-x86-64.so.2 (0x000000317e200000)
Edit 2: I reported the behavior here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63176
Edit 3: The clang team seems to be aware of the problem: http://llvm.org/bugs/show_bug.cgi?id=18767
Solution 1:
The problem is in the mapping from the codomain of std::mt19937 (std::uint_fast32_t) to float. When precision is lost in that conversion, the algorithm described by the standard gives incorrect results (inconsistent with its own description of the algorithm's output) unless the current IEEE754 rounding mode is round-to-negative-infinity. Note that the default mode is round-to-nearest.
The 7549723rd output of mt19937 with your seed is 4294967257 (0xffffffd9u), which when rounded to 32-bit float gives 0x1p+32, which is equal to the max value of mt19937, 4294967295 (0xffffffffu), when that is also rounded to 32-bit float.
The standard could ensure correct behavior if it were to specify that when converting from the output of the URNG to the RealType of generate_canonical, rounding is to be performed towards negative infinity; this would give a correct result in this case. As a quality-of-implementation (QOI) matter, it would be good for libstdc++ to make this change.
With this change, 1.0 will no longer be generated; instead the boundary values 0x1.fffffep-N for 0 < N <= 8 will be generated more often (approximately 2^(8 - N - 32) per N, depending on the actual distribution of MT19937).
I would recommend not using float with std::generate_canonical directly; rather, generate the number in double and then round towards negative infinity:
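// needs <cmath>, <limits> and <random>
double rd = std::generate_canonical<double,
    std::numeric_limits<float>::digits>(rng);
float rf = rd;   // narrowing conversion, rounds to nearest by default
if (rf > rd) {   // rounded up: step down to the next smaller float
    rf = std::nextafter(rf, -std::numeric_limits<float>::infinity());
}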
This problem can also occur with std::uniform_real_distribution<float>; the solution is the same: specialize the distribution on double and round the result towards negative infinity in float.
Solution 2:
According to the standard, 1.0 is not valid.
C++11 §26.5.7.2 Function template generate_canonical
Each function instantiated from the template described in this section 26.5.7.2 maps the result of one or more invocations of a supplied uniform random number generator g to one member of the specified RealType such that, if the values g_i produced by g are uniformly distributed, the instantiation's results t_j, 0 ≤ t_j < 1, are distributed as uniformly as possible as specified below.