Why can't attribute names be Python keywords?
Python's attribute access syntax has a restriction (at least in the CPython 2.7.2 implementation):
>>> class C(object): pass
>>> o = C()
>>> o.x = 123 # Works
>>> o.if = 123
o.if = 123
^
SyntaxError: invalid syntax
My question is twofold:
- Is there a fundamental reason why using Python keywords as attribute names (as in o.if = 123) is forbidden?
- Is the above restriction on attribute names documented, and if so, where?
It would make sense to do o.class = … in one of my programs, and I am a little disappointed not to be able to do it (o.class_ would work, but it does not look as simple).
PS: The problem is obviously that if and class are Python keywords. The question is why using keywords as attribute names would be forbidden (I don't see any ambiguity in the expression o.class = 123), and whether this is documented.
Solution 1:
Because the parser is simpler when keywords are always keywords rather than contextual (e.g. if being a keyword at the statement level but just an identifier inside an expression). The if keyword would be doubly hard to treat this way because of the conditional expression X if C else Y, and the for keyword is also used in list comprehensions and generator expressions.
So the code never gets to the point of attribute access: it is simply rejected by the parser, just like incorrect indentation (which is why you get a SyntaxError and not an AttributeError or something similar). The parser doesn't differentiate whether you use if as an attribute name, a variable name, a function name, or a type name. It can never be an identifier, simply because the parser always labels it as a keyword, which makes it a different token from an identifier.
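A minimal sketch of the point above, assuming any reasonably recent CPython: the helper name try_compile and the sample source strings are made up for illustration. It shows that a keyword is rejected at compile time regardless of the role it plays, and that no line of the offending source gets a chance to run.

```python
def try_compile(source):
    """Hypothetical helper: 'ok' if source compiles, else the exception name."""
    try:
        compile(source, "<example>", "exec")
        return "ok"
    except SyntaxError as exc:
        return type(exc).__name__

# The same keyword is rejected as an attribute, a variable and a function name.
results = {
    "o.x = 123": try_compile("o.x = 123"),
    "o.if = 123": try_compile("o.if = 123"),
    "if = 123": try_compile("if = 123"),
    "def if(): pass": try_compile("def if(): pass"),
}
for src, outcome in results.items():
    print(src, "->", outcome)

# The whole source is parsed before anything executes, so even the valid
# first line never runs when a later line contains the keyword.
ran = []
source = "ran.append('first line')\no = object()\no.if = 123\n"
try:
    exec(source, {"ran": ran})
except SyntaxError:
    pass
print("lines executed:", ran)
```

Only the first entry compiles; the rest fail with SyntaxError, and the list ran stays empty.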
It's the same in most languages, and the language grammar (plus the lexer specification) is the documentation for it. The Python language reference states explicitly that keywords cannot be used as ordinary identifiers. This also doesn't change in Python 3.
Also, just because you can use setattr or __dict__ to create an attribute with a reserved name doesn't mean you should. Don't force yourself or your API's users to use getattr instead of natural attribute access. getattr should be reserved for cases where the attribute name is variable.
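To make the trade-off concrete, here is a small sketch (the class C comes from the question; the width and height attributes are invented for the example). It shows that setattr accepts a reserved name because the name is only a string at that point, but every later access then has to go through getattr or __dict__ as well.

```python
class C(object):
    pass

o = C()

# Allowed: the attribute name is just a string here, not a token in the source.
setattr(o, "if", 123)

# Dot syntax (o.if) is still a SyntaxError, so reads need getattr or __dict__.
value = getattr(o, "if")
in_dict = "if" in o.__dict__
print(value, in_dict)

# The case getattr is meant for: the attribute name is only known at run time.
o.width = 10
o.height = 20
sizes = [getattr(o, name) for name in ("width", "height")]
print(sizes)
```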
Solution 2:
Because if is a keyword. You have similar issues with o.while and o.for:
pax> python
>>> class C(object): pass
...
>>> o = C()
>>> o.not_a_keyword = 123
>>> o.if = 123
File "<stdin>", line 1
o.if = 123
^
SyntaxError: invalid syntax
>>> o.while = 123
File "<stdin>", line 1
o.while = 123
^
SyntaxError: invalid syntax
>>> o.for = 123
File "<stdin>", line 1
o.for = 123
^
SyntaxError: invalid syntax
The full list of Python keywords can be obtained with (output shown for Python 2):
>>> import keyword
>>> keyword.kwlist
['and', 'as', 'assert', 'break', 'class', 'continue', 'def',
'del', 'elif', 'else', 'except', 'exec', 'finally', 'for',
'from', 'global', 'if', 'import', 'in', 'is', 'lambda',
'not', 'or', 'pass', 'print', 'raise', 'return', 'try',
'while', 'with', 'yield']
You cannot use a keyword as a variable name in Python either.
I would suggest choosing a more descriptive name, such as iface if it's an interface, or infld for an input field, and so forth.
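If the names come from somewhere you don't control, one possible approach is to check them against the keyword module and fall back to the trailing-underscore form mentioned in the question (o.class_). The helper safe_attr_name below is hypothetical, a sketch rather than an established API:

```python
import keyword

def safe_attr_name(name):
    """Hypothetical helper: append '_' when name is a Python keyword."""
    return name + "_" if keyword.iskeyword(name) else name

class C(object):
    pass

o = C()
setattr(o, safe_attr_name("class"), "widget")

# The renamed attribute works with ordinary dot syntax.
print(o.class_)
print(safe_attr_name("if"), safe_attr_name("iface"))
```

Names that are not keywords, such as iface, pass through unchanged.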
As to your question edit asking why keywords aren't allowed: it greatly simplifies parsers if the lexical elements are context-free. Having to treat the lexical token if as a keyword in some places and an identifier in others would introduce complexity that isn't really needed if you choose your identifiers more wisely.
For example, the C++ statement:
long int int = char[new - int];
could (with a little difficulty) be evaluated by a complex parser based on where those lexical elements occur (and what exists on either side of them). But, at least partially in the interests of simplicity and readability, this is not done.