Why is NaN not equal to NaN? [duplicate]

The relevant IEEE standard defines a numeric constant NaN (not a number) and prescribes that NaN should compare as not equal to itself. Why is that?

All the languages I'm familiar with implement this rule. But it often causes significant problems, for example unexpected behavior when NaN is stored in a container, when NaN is in the data that is being sorted, etc. Not to mention, the vast majority of programmers expect any object to be equal to itself (before they learn about NaN), so surprising them adds to the bugs and confusion.
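To make the container and sorting problems concrete, here is a small JavaScript sketch. The array and variable names are invented for illustration, and other languages will differ in the details.

```javascript
// Illustrative values only; none of these names come from the question.
const values = [3, NaN, 1, 2];

// indexOf compares with ===, so it can never locate a NaN element.
const foundByIndexOf = values.indexOf(NaN);    // -1

// includes uses SameValueZero, which does treat NaN as equal to NaN.
const foundByIncludes = values.includes(NaN);  // true

// Every ordered comparison involving NaN is false, so a comparison-based
// sort has no consistent place to put it.
const orderedChecks = [NaN < 1, NaN > 1, NaN === 1]; // [false, false, false]

// The comparator returns NaN whenever either argument is NaN; the resulting
// order is not guaranteed, so no expected output is shown here.
const sorted = [...values].sort((a, b) => a - b);

console.log(foundByIndexOf, foundByIncludes, orderedChecks, sorted);
```

Note that two lookups on the same array disagree about whether NaN is present, which is exactly the kind of surprise described above.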

IEEE standards are well thought out, so I am sure there is a good reason why NaN comparing as equal to itself would be bad. I just can't figure out what it is.

Edit: please refer to What is the rationale for all comparisons returning false for IEEE754 NaN values? as the authoritative answer.


Solution 1:

The accepted answer is 100% without question WRONG. Not halfway wrong or even slightly wrong. I fear this issue is going to confuse and mislead programmers for a long time to come when this question pops up in searches.

NaN is designed to propagate through all calculations, infecting them like a virus, so that if you hit a NaN somewhere in a deep, complex calculation, you don't bubble out a seemingly sensible answer. If NaN compared equal to itself, then by identity NaN/NaN should equal 1, along with all the other consequences like (NaN/NaN)==1, (NaN*1)==NaN, etc. If your calculation went wrong somewhere (for example, rounding produced 0/0, yielding NaN), you could then get wildly incorrect (or worse: subtly incorrect) results with no obvious indicator as to why.
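A minimal JavaScript sketch of that propagation, using a made-up chain of operations (the variable names are hypothetical). It only covers ordinary arithmetic, not every library function.

```javascript
// Hypothetical computation: a 0/0 early on poisons the arithmetic downstream.
const ratio = 0 / 0;              // NaN
const scaled = ratio * 1;         // still NaN
const shifted = scaled + 100;     // still NaN
const selfRatio = ratio / ratio;  // NaN, not 1

console.log(Number.isNaN(shifted)); // true
console.log(selfRatio === 1);       // false
```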

There are also really good reasons for NaNs in calculations when probing the value of a mathematical function; one of the examples given in the linked document is finding the zeros() of a function f(). It is entirely possible that in the process of probing the function with guess values that you will probe one where the function f() yields no sensible result. This allows zeros() to see the NaN and continue its work.
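As a rough illustration of the zeros() idea, here is a sketch in JavaScript. It is not the routine from the linked document: the scan-then-bisect strategy, the parameter names, and the test function are all assumptions made for the example.

```javascript
// Hypothetical helper: refine a bracket [a, b] where f changes sign.
function bisect(f, a, b, iterations = 60) {
  let fa = f(a);
  for (let i = 0; i < iterations; i++) {
    const mid = (a + b) / 2;
    const fm = f(mid);
    if (fm === 0) return mid;
    if (fa * fm < 0) {
      b = mid;            // sign change is in the left half
    } else {
      a = mid;            // otherwise continue in the right half
      fa = fm;
    }
  }
  return (a + b) / 2;
}

// Hypothetical zeros(): sample f on a grid and bisect every sign change.
function zeros(f, lo, hi, steps = 100) {
  const roots = [];
  const h = (hi - lo) / steps;
  let prevX = lo;
  let prevY = f(lo);
  if (prevY === 0) roots.push(lo);
  for (let i = 1; i <= steps; i++) {
    const x = lo + i * h;
    const y = f(x);
    if (y === 0) {
      roots.push(x);      // sample landed exactly on a root
    } else if (prevY * y < 0) {
      // NaN * anything is NaN and NaN < 0 is false, so probes where f
      // has no sensible value never form a bracket; the scan just moves on.
      roots.push(bisect(f, prevX, x));
    }
    prevX = x;
    prevY = y;
  }
  return roots;
}

// sqrt(x) - 1 is NaN for x < 0, yet the scan over [-2, 3] still finds x = 1.
const roots = zeros((x) => Math.sqrt(x) - 1, -2, 3, 7);
console.log(roots);
```

The point is that the search needs no special error handling for the region where f() is undefined; the NaN comparisons quietly fail and the search carries on.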

The alternative to NaN is to trigger an exception as soon as an illegal operation is encountered (also called a signal or a trap). Besides the massive performance penalties you might encounter, at the time there was no guarantee that the CPUs would support it in hardware or the OS/language would support it in software; everyone was their own unique snowflake in handling floating-point. IEEE decided to explicitly handle it in software as the NaN values so it would be portable across any OS or programming language. Correct floating point algorithms are generally correct across all floating point implementations, whether that be node.js or COBOL (hah).

In theory, you don't have to set specific #pragma directives, set crazy compiler flags, catch the correct exceptions, or install special signal handlers to make what appears to be the identical algorithm actually work correctly. Unfortunately some language designers and compiler writers have been really busy undoing this feature to the best of their abilities.

Please read some of the history of IEEE 754 floating point, and also this answer to a similar question, where a member of the committee responded: What is the rationale for all comparisons returning false for IEEE754 NaN values?

"An Interview with the Old Man of Floating-Point"

"History of IEEE Floating-Point Format"

What every computer scientist should know about floating point arithmetic

Solution 2:

Well, log(-1) gives NaN, and acos(2) also gives NaN. Does that mean that log(-1) == acos(2)? Clearly not. Hence it makes perfect sense that NaN is not equal to itself.

Revisiting this almost two years later, here's a "NaN-safe" comparison function:

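function compare(a, b) {
    // Equal in the usual sense, or both are NaN (Number.isNaN does not coerce its argument).
    return a === b || (Number.isNaN(a) && Number.isNaN(b));
}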
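For what it's worth, JavaScript's built-in Object.is makes a similar choice for NaN. Here is a short sketch using the two expressions from the start of this answer. One caveat: Object.is also distinguishes +0 from -0, so it is not a drop-in replacement for a comparison function like the one above.

```javascript
const a = Math.log(-1);  // NaN
const b = Math.acos(2);  // NaN

const strictlyEqual = a === b;        // false: NaN never equals NaN
const sameValue = Object.is(a, b);    // true: Object.is treats NaN as the same value
const zerosSame = Object.is(0, -0);   // false: unlike ===, signed zeros differ

console.log(strictlyEqual, sameValue, zerosSame);
```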

Solution 3:

My original answer (from 4 years ago) criticizes the decision from the modern-day perspective without understanding the context in which the decision was made. As such, it doesn't answer the question.

The correct answer is given here:

NaN != NaN originated out of two pragmatic considerations:

[...] There was no isnan( ) predicate at the time that NaN was formalized in the 8087 arithmetic; it was necessary to provide programmers with a convenient and efficient means of detecting NaN values that didn’t depend on programming languages providing something like isnan( ) which could take many years
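In other words, the comparison itself doubles as the NaN test. A minimal JavaScript sketch (the helper name is made up for illustration):

```javascript
// Hypothetical helper: detect NaN using nothing but the comparison operator.
function isNaNByComparison(x) {
  return x !== x; // true only when x is NaN
}

console.log(isNaNByComparison(0 / 0));    // true
console.log(isNaNByComparison(1.5));      // false
console.log(isNaNByComparison(Infinity)); // false
```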

There was one disadvantage to that approach: it made NaN less useful in many situations unrelated to numerical computation. For example, much later when people wanted to use NaN to represent missing values and put them in hash-based containers, they couldn't do it.

If the committee had foreseen future use cases and considered them important enough, they could have gone for the more verbose !(x<=x & x>=x) instead of x!=x as a test for NaN. However, their focus was more pragmatic and narrow: providing the best solution for numeric computation, and as such they saw no issue with their approach.

===

Original answer:

I am sorry, much as I appreciate the thought that went into the top-voted answer, I disagree with it. NaN does not mean "undefined" - see http://www.cs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF, page 7 (search for the word "undefined"). As that document confirms, NaN is a well-defined concept.

Furthermore, the IEEE approach was to follow the regular rules of mathematics as much as possible, and when they couldn't, to follow the rule of "least surprise" - see https://stackoverflow.com/a/1573715/336527. Any mathematical object is equal to itself, so the rules of mathematics would imply that NaN == NaN should be True. I cannot see any valid and powerful reason to deviate from such a major mathematical principle (not to mention the less important rules of trichotomy of comparison, etc.).

As a result, my conclusion is as follows.

The IEEE committee members did not think this through very clearly, and made a mistake. Since very few people understood the committee's approach, or cared about what exactly the standard says about NaN (to wit: most compilers' treatment of NaN violates the IEEE standard anyway), nobody raised an alarm. Hence, this mistake is now embedded in the standard. It is unlikely to be fixed, since such a fix would break a lot of existing code.

Edit: Here is one post from a very informative discussion. Note: to get an unbiased view you have to read the entire thread, as Guido takes a different view to that of some other core developers. However, Guido is not personally interested in this topic, and largely follows Tim Peters' recommendation. If anyone has Tim Peters' arguments in favor of NaN != NaN, please add them in comments; they have a good chance of changing my opinion.