In MATLAB, are variables REALLY double-precision by default?

This question arose out of something strange that I noticed after investigating this question further...

I always understood MATLAB variables to be double-precision by default. So, if I were to do something like declare a variable with 20 digits after the decimal point:

>> num = 2.71828182845904553488;
>> class(num)  % Display the variable type
ans =
double

I would expect the last 4 digits to be ignored, since the floating-point relative accuracy is on the order of 10-16:

>> eps(num)
ans =
    4.440892098500626e-016

If I try to display the number with more than 16 digits after the decimal point (using either fprintf or sprintf), I get what I expect to see:

>> fprintf('%0.20f\n', num)
2.71828182845904550000
>> sprintf('%0.20f', num)
ans =
2.71828182845904550000

In other words, digits 17 through 20 are all 0.

But things get weird when I pass num to the variable precision arithmetic function in the Symbolic Toolbox, telling it to represent the number using 21 digits of precision:

>> vpa(num, 21)
ans =
2.71828182845904553488

WHAT?! Those last 4 digits have reappeared! Shouldn't they have been lost when the original number I entered was stored as a double-precision variable num? Since num is a double-precision variable when it is passed to vpa, how did vpa know what they were?

My best guess as to what is happening is that MATLAB internally represents num with more precision than a double since I initialized it to a number with more digits past the decimal point than a double-precision variable could handle. Is this really what is happening, or is something else going on?



BONUS: And here's an additional source of confusion if you don't already have a migraine from the above...

>> num = 2.71828182845904553488;  % Declare with 20 digits past the decimal
>> num = 2.718281828459045531;    % Re-declare with 18 digits past the decimal
>> vpa(num, 21)
ans =
2.71828182845904553488  % It's the original 20-digit number!!!

Solution 1:

They're doubles. vpa() is simply choosing to display non-significant digits beyond the floating point relative accuracy, where printf() and disp() truncate or zero them out.

You're only getting your original four digits back out because the literal you chose to initialize num with just happens to be the exact decimal expansion of a binary double value, because it was copy and pasted from the output of the expansion of an actual double value from the other question. It won't work for other nearby values, as you show in your "BONUS" addendum.

More precisely, all numeric literals in Matlab produce values of type double. They get converted to the binary double value that is nearest to the decimal value they represent. In effect, digits in a literal beyond the limit of precision of the double type are silently dropped. When you copy and paste the output of vpa to create a new variable, as the other question's poster did with the e = ... statement, you're initializing a value from a literal, instead of dealing directly with the result of a previous expression.

The differences here are just in output formatting. I think what's going on is that vpa() is taking that double precision binary double and treating it as an exact value. For a given binary mantissa-exponent value, you can calculate the decimal equivalent to arbitrarily many decimal places. If you have a limited precision ("width") in the binary value, as you do with any fixed-size data type, only so many of those decimal digits are significant. printf() and Matlab's default display handle this by truncating the output or displaying non-significant digits as 0. vpa() is ignoring the limits of precision and continuing to calculate as many decimal places as you request.

Those additional digits are bogus, in the sense that if they were replaced by other values to produce a nearby decimal value, they would all get "rounded" to the same binary double value.

Here's a way to show it. These values of x are all the same when stored in doubles, and will all be represented the same by vpa().

x = [
    2.7182818284590455348848081484902650117874145507812500
    2.7182818284590455348848081484902650117874145507819999
    2.7182818284590455348848
    2.71828182845904553488485555555555555555555555555555
    exp(1)
    ]
unique(x)

Here's another way of demonstrating it. Here are two doubles that are very close to each other.

x0 = exp(1)
x1 = x0 + eps(x0)

vpa(x0) and vpa(x1) should produce outputs that differ a lot past the 16th digit. However, you shouldn't be able to create a double value x such that vpa(x) produces a decimal representation that falls between vpa(x0) and vpa(x1).

(UPDATE: Amro points out that you can use fprintf('%bx\n', x) to display an exact representation of the underlying binary value in hex format. You can use this to confirm the literals map to the same double.)

I suspect vpa() behaves this way because it treats its inputs as exact values, and polymorphically supports other Matlab types from the Symbolic Toolbox that have more precision than doubles. Those values will need to be initialized by means other than numeric literals, which is why sym() takes a string as an input and vpa(exp(1)) differs from vpa(sym('exp(1)')).

Make sense? Sorry for the long-windedness.

(Note I don't have the Symbolic Toolbox so I can't test vpa() myself.)

Solution 2:

first :

it appears that sprintf and fprintf have different behavior on different versions of MATLAB for example in MATLAB 2018 a

num=2.7182818284590666666666;    
sprintf('%0.70f', num)
ans =
'2.7182818284590668511668809514958411455154418945312500000000000000000000'

second :

Floating-Point Numbers

MATLAB® represents floating-point numbers in either double-precision or single-precision format. The default is double precision, but you can make any number single precision with a simple conversion function.

Double-Precision Floating Point

MATLAB constructs the double-precision (or double) data type according to IEEE® Standard 754 for double precision. Any value stored as a double requires 64 bits, formatted as shown in the table below:

Bits : 63
Usage : Sign (0 = positive, 1 = negative)

Bits : 62 to 52 Usage : Exponent, biased by 1023

Bits : 51 to 0 Usage : Fraction f of the number 1.f

refer to this link for more info

Between 252=4,503,599,627,370,496 and 253=9,007,199,254,740,992 the representable numbers are exactly the integers. For the next range, from 253 to 254, everything is multiplied by 2, so the representable numbers are the even ones, etc. Conversely, for the previous range from 2^51 to 2^52, the spacing is 0.5, etc.

The spacing as a fraction of the numbers in the range from 2^n to 2^n+1 is 2^n−52. The maximum relative rounding error when rounding a number to the nearest representable one (the machine epsilon) is therefore 2^−53.

so in your case where n=1 (2^1 <= num <= 2^2) the spacing is 2^-51 ,

i think it is safe to assume that sprintf and sprintf algorithms for showing numbers are tricky and MATLAB Double type is based on IEEE standard,


about VPA:

vpa Uses Guard Digits to Maintain Precision

The value of the digits function specifies the minimum number of significant digits used. Internally, vpa can use more digits than digits specifies. These additional digits are called guard digits because they guard against round-off errors in subsequent calculations.

Numerically approximate 1/3 using four significant digits.

a = vpa(1/3, 4)
a =
0.3333

Approximate the result a using 20 digits. The result shows that the toolbox internally used more than four digits when computing a. The last digits in the result are incorrect because of the round-off error.

vpa(a, 20)
ans =
0.33333333333303016843

the problem you may encounter is because of spacing , gaurd digits algorithm and round off problem ,

for example using matlab 2018 a :

 sprintf('%0.28f', 8.0)
 ans =
 '8.0000000000000000000000000000'

but:

sprintf('%0.28f', 8.1)
ans =
'8.0999999999999996447286321199'

because the number is between 2^3 and 2^4 so the spacing is 2^-49 (= 1.77 e-15) so the number is valid till 15th decimal place and

sprintf('%0.28f', 64.1)
ans =
'64.0999999999999943156581139192'

because the number is between 2^6 and 2^7 so the spacing is 2^-46 (= 1.42 e-14) so the number is valid till 14th decimal place