Accessing a Python int literal's methods [duplicate]

I have read that everything in Python is an object, so I started experimenting with different types and invoking __str__ on them. At first I was really excited, but then I got confused.

>>> "hello world".__str__()
'hello world'
>>> [].__str__()
'[]'
>>> 3.14.__str__()
'3.14'
>>> 3..__str__()
'3.0'
>>> 123.__str__()
  File "<stdin>", line 1
    123.__str__()
              ^
SyntaxError: invalid syntax
  • Why does something.__str__() work for "everything" besides int?
  • Is 123 not an object of type int?

You need parens:

(4).__str__()

The problem is the lexer thinks "4." is going to be a floating-point number.

Also, this works:

x = 4
x.__str__()

So you think you can floating-point?

123 is just as much of an object as 3.14; the "problem" lies in the lexical rules of the language. The tokenizer thinks we are about to write a float literal, not an int followed by a method call.

We get the expected behavior if we wrap the number in parentheses, as below.

>>> (123).__str__()
'123'

Or if we simply add some whitespace after 123:

>>> 123 .__str__()
'123'


The reason 123.__str__() does not work is that the dot following 123 is interpreted as the decimal point of a floating-point literal.

>>> 123.__str__()
  File "<stdin>", line 1
    123.__str__()
              ^
SyntaxError: invalid syntax

The parser tries to interpret __str__() as a sequence of digits but fails, and we get a SyntaxError, which basically says that the parser stumbled upon something it did not expect.
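To check the different spellings without typing each one into the REPL, here is a minimal sketch that feeds them to the built-in compile(). The helper name is_valid_expression is made up for this illustration.

```python
def is_valid_expression(source: str) -> bool:
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<demo>", "eval")
    except SyntaxError:
        return False
    return True


# The first three spellings are accepted; only the last one is rejected.
for source in ["(123).__str__()", "123 .__str__()", "123..__str__()", "123.__str__()"]:
    print(f"{source!r}: {is_valid_expression(source)}")
```

Nothing is evaluated here; compile() only runs the tokenizer and parser, which is where the error comes from.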



Elaboration

When looking at 123.__str__(), the Python parser could either take 3 characters and interpret them as an integer, or take 4 characters and interpret them as the start of a floating-point number.

123.__str__()
^^^ - int
123.__str__()
^^^^- start of floating-point

Just as a little child would like as much cake as possible on their plate, the parser is greedy and swallows as much as it can at once, even if this isn't always the best idea. As such, the latter ("better") alternative is chosen.

When it later realizes that __str__() can in no way be interpreted as the decimals of a floating-point number, it is already too late: SyntaxError.

Note

 123 .__str__() # works fine

In the above snippet, 123 followed by a space must be interpreted as an integer, since no numeric literal can contain spaces. This means that it is semantically equivalent to (123).__str__().

Note

 123..__str__() # works fine

The above also works because a numeric literal can contain at most one decimal point, so the second dot has to be attribute access. It is equivalent to (123.).__str__(), which means the method is called on a float, not an int.
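The difference between the two notes is easy to miss, so here is a small sketch that asks the ast module which literal ends up on the left of the dot. The helper name receiver_of is hypothetical, and it assumes the source has the shape literal.method().

```python
import ast


def receiver_of(source: str):
    """Return the literal value that the method in `source` is called on."""
    call = ast.parse(source, mode="eval").body   # the Call node
    attribute = call.func                         # the Attribute node (.__str__)
    return ast.literal_eval(attribute.value)      # the literal left of the dot


for source in ["123 .__str__()", "(123).__str__()", "123..__str__()"]:
    value = receiver_of(source)
    print(f"{source!r}: {value!r} ({type(value).__name__})")
```

The first two spellings report an int, while the double-dot spelling reports a float.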



For the language-lawyers

This section contains the lexical definition of the relevant literals.

Lexical analysis - 2.4.5 Floating point literals

floatnumber   ::=  pointfloat | exponentfloat
pointfloat    ::=  [intpart] fraction | intpart "."
exponentfloat ::=  (intpart | pointfloat) exponent
intpart       ::=  digit+
fraction      ::=  "." digit+
exponent      ::=  ("e" | "E") ["+" | "-"] digit+

Lexical analysis - 2.4.4 Integer literals

integer        ::=  decimalinteger | octinteger | hexinteger | bininteger
decimalinteger ::=  nonzerodigit digit* | "0"+
nonzerodigit   ::=  "1"..."9"
digit          ::=  "0"..."9"
octinteger     ::=  "0" ("o" | "O") octdigit+
hexinteger     ::=  "0" ("x" | "X") hexdigit+
bininteger     ::=  "0" ("b" | "B") bindigit+
octdigit       ::=  "0"..."7"
hexdigit       ::=  digit | "a"..."f" | "A"..."F"
bindigit       ::=  "0" | "1"
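As a rough illustration of how these productions interact, the sketch below transcribes a simplified subset of them into regular expressions and picks the longest match at the start of the input. It is only a model, not how CPython's tokenizer is implemented: octal, hex and binary integers are left out, and the function name longest_numeric_prefix is invented for this example.

```python
import re

# Simplified transcriptions of the productions quoted above.
# Assumption: only decimal integers and floats are modelled.
POINTFLOAT = r"(?:[0-9]*\.[0-9]+|[0-9]+\.)"   # [intpart] fraction | intpart "."
EXPONENT = r"[eE][+-]?[0-9]+"                  # ("e" | "E") ["+" | "-"] digit+
FLOATNUMBER = re.compile(rf"(?:{POINTFLOAT}|[0-9]+){EXPONENT}|{POINTFLOAT}")
DECIMALINTEGER = re.compile(r"[1-9][0-9]*|0+")


def longest_numeric_prefix(source: str):
    """Return (kind, text) for the longest numeric literal starting `source`."""
    candidates = []
    for kind, pattern in (("float", FLOATNUMBER), ("int", DECIMALINTEGER)):
        match = pattern.match(source)
        if match:
            candidates.append((kind, match.group()))
    # Longest match wins; None if the source does not start with a number.
    return max(candidates, key=lambda item: len(item[1]), default=None)


for source in ["123.__str__()", "123 .__str__()", "3.14.__str__()"]:
    print(f"{source!r}: {longest_numeric_prefix(source)}")
```

For 123.__str__() both productions match, but the float match "123." is one character longer than the int match "123", so the float wins and the leftover __str__() has no dot in front of it.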

Add a space after the 4:

4 .__str__()

Otherwise, the lexer will split this expression into the tokens "4.", "__str__", "(" and ")", i.e. the first token is interpreted as a floating point number. The lexer always tries to build the longest possible token.
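The token split can be inspected with the standard tokenize module. This is a minimal sketch, and significant_tokens is a name made up here. The failing spelling 4.__str__() is deliberately left out, because how the tokenize module reports it may differ between Python versions.

```python
import io
import tokenize


def significant_tokens(source: str):
    """Return (token type name, text) pairs for numbers, names and operators."""
    keep = {tokenize.NUMBER, tokenize.NAME, tokenize.OP}
    readline = io.StringIO(source + "\n").readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type in keep
    ]


print(significant_tokens("4 .__str__()"))
print(significant_tokens("4..__str__()"))
```

With the space, the first token is the NUMBER "4" followed by a separate "." operator. With two dots, the first token is the NUMBER "4." and only the second dot is an operator.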


Actually (to increase unreadability...):

4..hex()

is valid, too. It gives '0x1.0000000000000p+2', but then it's a float, of course...