Accessing a Python int literal's methods [duplicate]
I have read that everything in Python is an object, so I started experimenting with different types and invoking __str__ on them. At first I was really excited, but then I got confused.
>>> "hello world".__str__()
'hello world'
>>> [].__str__()
'[]'
>>> 3.14.__str__()
'3.14'
>>> 3..__str__()
'3.0'
>>> 123.__str__()
File "<stdin>", line 1
123.__str__()
^
SyntaxError: invalid syntax
- Why does something.__str__() work for "everything" besides int?
- Is 123 not an object of type int?
You need parens:
(4).__str__()
The problem is the lexer thinks "4." is going to be a floating-point number.
Also, this works:
x = 4
x.__str__()
So you think you can dance floating-point?
123 is just as much of an object as 3.14; the "problem" lies within the grammar rules of the language. The parser thinks we are about to define a float, not an int with a trailing method call.
We will get the expected behavior if we wrap the number in parentheses, as shown below.
>>> (123).__str__()
'123'
Or if we simply add some whitespace after 123:
>>> 123 .__str__()
'123'
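If you would rather check these spellings from a script than at the prompt, the built-in compile() can tell you whether an expression is syntactically valid. This is only a rough sketch; the helper name compiles and the "<demo>" filename are made up for illustration.

```python
def compiles(source):
    """Return True if source is a syntactically valid Python expression."""
    try:
        compile(source, "<demo>", "eval")  # "<demo>" is just a placeholder filename
        return True
    except SyntaxError:
        return False

# The three spellings discussed so far.
for source in ["(123).__str__()", "123 .__str__()", "123.__str__()"]:
    print(f"{source!r:20} compiles: {compiles(source)}")
```

Only the last spelling should be reported as not compiling; the exact wording of the SyntaxError message differs between Python versions, which is why the sketch only checks whether the exception is raised.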
The reason it does not work for 123.__str__() is that the dot following the 123 is interpreted as the decimal point of a partially written floating-point literal.
>>> 123.__str__()
File "<stdin>", line 1
123.__str__()
^
SyntaxError: invalid syntax
The parser tries to interpret __str__() as a sequence of digits, but obviously fails, and we get a SyntaxError basically saying that the parser stumbled upon something that it did not expect.
Elaboration
When looking at 123.__str__(), the Python parser could either use 3 characters and interpret them as an integer, or use 4 characters and interpret them as the start of a floating-point number.
123.__str__()
^^^ - int
123.__str__()
^^^^- start of floating-point
Just as a little child would like as much cake as possible on their plate, the parser is greedy and would like to swallow as much as it can all at once, even if this isn't always the best of ideas. As such, the latter (longer) alternative is chosen.
When it later realizes that __str__() can in no way be interpreted as the decimals of a floating-point number, it is already too late: SyntaxError.
Note
123 .__str__() # works fine
In the above snippet, 123 (note the space) must be interpreted as an integer since no number can contain spaces. This means that it is semantically equivalent to (123).__str__().
Note
123..__str__() # works fine
The above also works because a number can contain at most one decimal point, meaning that it is equivalent to (123.).__str__().
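One way to convince yourself of the two equivalences claimed in these notes is to compare the syntax trees that Python builds for each spelling. The following is a small sketch using the standard ast module; the helper name tree is made up for illustration.

```python
import ast

def tree(source):
    """Dump the expression's syntax tree, without line/column information."""
    return ast.dump(ast.parse(source, mode="eval"))

# Space vs. parentheses: same tree, the literal is an int in both.
print(tree("123 .__str__()") == tree("(123).__str__()"))   # True

# Double dot vs. parenthesized float: same tree, the literal is a float in both.
print(tree("123..__str__()") == tree("(123.).__str__()"))  # True

# Int spelling vs. float spelling: different trees (123 vs 123.0).
print(tree("123 .__str__()") == tree("123..__str__()"))    # False
```

Parentheses and whitespace leave no trace in the tree, so the only difference that survives is whether the literal was read as an int or as a float.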
For the language-lawyers
This section contains the lexical definition of the relevant literals.
Lexical analysis - 2.4.5 Floating point literals
floatnumber ::= pointfloat | exponentfloat
pointfloat ::= [intpart] fraction | intpart "."
exponentfloat ::= (intpart | pointfloat) exponent
intpart ::= digit+
fraction ::= "." digit+
exponent ::= ("e" | "E") ["+" | "-"] digit+
Lexical analysis - 2.4.4 Integer literals
integer ::= decimalinteger | octinteger | hexinteger | bininteger
decimalinteger ::= nonzerodigit digit* | "0"+
nonzerodigit ::= "1"..."9"
digit ::= "0"..."9"
octinteger ::= "0" ("o" | "O") octdigit+
hexinteger ::= "0" ("x" | "X") hexdigit+
bininteger ::= "0" ("b" | "B") bindigit+
octdigit ::= "0"..."7"
hexdigit ::= digit | "a"..."f" | "A"..."F"
bindigit ::= "0" | "1"
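To see the longest-match rule applied to the failing input itself, here is a toy sketch that turns a simplified version of the grammar above into regular expressions. It is not how CPython's tokenizer is actually implemented; the names POINTFLOAT, EXPONENT, FLOATNUMBER, DECIMALINTEGER and longest_number are hypothetical, and the sketch assumes plain decimal digits only (no hex, octal or binary prefixes).

```python
import re

# Simplified from the grammar above: decimal digits only.
POINTFLOAT = r"(?:\d*\.\d+|\d+\.)"   # [intpart] fraction | intpart "."
EXPONENT = r"[eE][+-]?\d+"           # ("e" | "E") ["+" | "-"] digit+
FLOATNUMBER = re.compile(rf"(?:{POINTFLOAT}(?:{EXPONENT})?|\d+{EXPONENT})")
DECIMALINTEGER = re.compile(r"\d+")

def longest_number(source):
    """Return (kind, text) for the longest numeric literal at the start of source."""
    candidates = []
    for kind, pattern in (("float", FLOATNUMBER), ("int", DECIMALINTEGER)):
        match = pattern.match(source)
        if match:
            candidates.append((len(match.group()), kind, match.group()))
    if not candidates:
        return None
    _, kind, text = max(candidates)  # the longest match wins
    return kind, text

print(longest_number("123.__str__()"))   # ('float', '123.')
print(longest_number("123 .__str__()"))  # ('int', '123')
print(longest_number("123..__str__()"))  # ('float', '123.')
```

For 123.__str__() both rules match, but the float rule matches one character more, so the dot is consumed by the literal and nothing is left to act as the attribute-access dot.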
Add a space after the 4:
4 .__str__()
Otherwise, the lexer will split this expression into the tokens "4.", "__str__", "(" and ")", i.e. the first token is interpreted as a floating-point number. The lexer always tries to build the longest possible token.
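You can look at the token split yourself with the standard tokenize module. This is a small sketch; the helper name token_strings is made up for illustration. It sticks to spellings that are valid, because depending on the Python version the failing spelling may be rejected by the tokenizer itself rather than by the parser.

```python
import io
import tokenize

# Bookkeeping tokens that carry no source text of interest here.
SKIP = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER}

def token_strings(source):
    """Return the token texts for source, dropping end-of-line bookkeeping tokens."""
    readline = io.StringIO(source).readline
    return [tok.string
            for tok in tokenize.generate_tokens(readline)
            if tok.type not in SKIP]

print(token_strings("4 .__str__()"))  # ['4', '.', '__str__', '(', ')']
print(token_strings("4..__str__()"))  # ['4.', '.', '__str__', '(', ')']
```

With the space, the number token is just "4" and the dot becomes its own token. With the double dot, the first dot is swallowed into the number token "4." and only the second one is left for attribute access.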
Actually (to increase unreadability...):
4..hex()
is valid, too. It gives '0x1.0000000000000p+2', but then it's a float, of course...