Why is one string greater than the other when comparing strings in JavaScript?
I saw this code in a book:
var a = "one";
var b = "four";
a>b; // will return true
but it doesn't explain why "one" is bigger than "four". I tried c = "a"
and it is smaller than both a and b. I want to know how JavaScript compares these strings.
Because, as in many programming languages, strings are compared lexicographically.
You can think of this as a fancier version of alphabetical ordering, the difference being that alphabetic ordering only covers the 26 characters a through z.
This answer is in response to a Java question, but the logic is exactly the same. Another good one: String Compare "Logic".
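A few quick comparisons may help show where lexicographic order and dictionary order part ways. This is only an illustrative sketch; the `checks` object and its property names are made up for the example, and the numbers in the comments are the UTF-16 code unit values of the first differing characters.

```javascript
// Each property holds the result of one string comparison.
const checks = {
  oneVsFour: "one" > "four",        // 'o' (111) > 'f' (102)
  aVsOne: "a" < "one",              // 'a' (97) < 'o' (111)
  upperVsLower: "Zebra" < "apple",  // 'Z' (90) < 'a' (97)
  digits: "10" < "9",               // '1' (49) < '9' (57)
  prefix: "four" < "fourteen",      // a proper prefix sorts first
};

console.log(checks); // every property is true
```

The uppercase and digit cases are where the result differs from what a dictionary or a numeric comparison would give.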
"one" starts with 'o', "four" starts with 'f', 'o' is later in the alphabet than 'f' so "one" is greater than "four". See this page for some nice examples of JavaScript string comparisons (with explanations!).
JavaScript uses lexicographical order for the > operator. 'f' precedes 'o', so the comparison "one" > "four" returns true.
In the 11th edition of the ECMAScript Language Specification, the "Abstract Relational Comparison" clause defines how to compute x < y. When the expression is reversed (i.e. x > y), we compute the result of y < x instead.
So to solve "one" > "four" we must solve "four" < "one" instead.
The same clause says this:
The comparison of Strings uses a simple lexicographic ordering on sequences of code unit values.
And this if both operands are strings:
- If Type(px) is String and Type(py) is String, then
  - If IsStringPrefix(py, px) is true, return false.
  - If IsStringPrefix(px, py) is true, return true.
  - Let k be the smallest nonnegative integer such that the code unit at index k within px is different from the code unit at index k within py. (There must be such a k, for neither String is a prefix of the other.)
  - Let m be the integer that is the numeric value of the code unit at index k within px.
  - Let n be the integer that is the numeric value of the code unit at index k within py.
  - If m < n, return true. Otherwise, return false.
(We can safely ignore the two IsStringPrefix checks for this example, since neither string is a prefix of the other.)
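The quoted steps can be sketched as a small function. This is a rough illustration of the string-only branch, not how an engine implements it; `stringLessThan` is a hypothetical name, and `startsWith` stands in for the spec's IsStringPrefix.

```javascript
// Hypothetical helper mirroring the quoted steps for two string operands.
function stringLessThan(px, py) {
  // IsStringPrefix(py, px): py is a prefix of px (this also covers equal strings).
  if (px.startsWith(py)) return false;
  // IsStringPrefix(px, py): px is a proper prefix of py.
  if (py.startsWith(px)) return true;

  // Find the smallest index k at which the code units differ.
  // Such a k exists because neither string is a prefix of the other.
  let k = 0;
  while (px.charCodeAt(k) === py.charCodeAt(k)) k++;

  const m = px.charCodeAt(k); // code unit of px at index k
  const n = py.charCodeAt(k); // code unit of py at index k
  return m < n;
}

// x > y is evaluated as y < x, so "one" > "four" becomes:
console.log(stringLessThan("four", "one")); // true
```

For any two strings the result should agree with the built-in `<` operator.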
So let's see the code units for "four":
[..."four"].map(c => c.charCodeAt(0));
//=> [102, 111, 117, 114]
And for "one":
[..."one"].map(c => c.charCodeAt(0));
//=> [111, 110, 101]
Now we must find the smallest k (starting at 0) at which the code units differ, writing m for the code units of "four" and n for those of "one":
| | 0 | 1 | 2 | 3 |
|---|-----|-----|-----|-----|
| m | 102 | 111 | 117 | 114 |
| n | 111 | 110 | 101 | |
We can see that the code units already differ at k = 0.
Since m[0] < n[0] (102 < 111) is true, "four" < "one" is true, and thus "one" > "four" is true.
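To do the same lookup for other pairs without drawing the table by hand, a small helper can report k, m and n directly. `firstDifference` is a hypothetical name used only for this sketch; it returns null when one string is a prefix of the other, since in that case no such k exists.

```javascript
// Returns the first index k where the code units differ, plus both values.
function firstDifference(a, b) {
  const limit = Math.min(a.length, b.length);
  for (let k = 0; k < limit; k++) {
    const m = a.charCodeAt(k);
    const n = b.charCodeAt(k);
    if (m !== n) return { k, m, n };
  }
  return null; // one string is a prefix of the other, or they are equal
}

console.log(firstDifference("four", "one"));  // { k: 0, m: 102, n: 111 }
console.log(firstDifference("four", "fork")); // { k: 2, m: 117, n: 114 }
```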
What does "☂︎" < "☀︎" return?
[..."☂︎"].map(c => c.charCodeAt(0))
//=> [9730, 65038]
[..."☀︎"].map(c => c.charCodeAt(0))
//=> [9728, 65038]
| | 0 | 1 |
|---|------|-------|
| m | 9730 | 65038 |
| n | 9728 | 65038 |
Since 9730 < 9728 is false, "☂︎" < "☀︎" is false, which is nice because rain is not better than sun (obviously ;).