Sunday, May 20, 2007

Python Pitfall: Not all objects are created equal

I've been entertaining the idea of writing a series of posts about Python warts for a couple of weeks now. Overall, Python is a remarkably consistent programming language, but there are a few edge cases that people should be aware of. My hope is that, by pointing them out, I can save others a rude surprise. I've decided to call the series "Python pitfalls".

So here goes my inaugural post: not all objects are created equal in Python. New-style classes (which are only "new" in the sense that they were introduced in Python 2.2, which is quite old now) all inherit from the base object class. For example, consider the following simple class:

>>> class MyObject(object):
...     pass

This is about as simple a class as you can make in Python; MyObject inherits all of its behaviour from the object base class. Now, let's set an attribute on an instance of our new class:

>>> b = MyObject()
>>> b.myattr = 42
>>> print b.myattr
42

Nothing fancy here. But recall that our MyObject class adds nothing to the base object class, implying that the ability to set arbitrary attributes on an instance must originate with the object class's implementation. Let's give it a try:

>>> a = object()
>>> a.myattr = 42
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'object' object has no attribute 'myattr'
>>> setattr(a, 'myattr', 42)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'object' object has no attribute 'myattr'

What is going on here? We were able to set attributes on instances of MyObject, but not on instances of object itself? That's odd: MyObject inherits all of its behaviour from object, so they should be exactly the same!
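The same contrast can be reproduced outside the interactive prompt. The following is only a minimal sketch: can_set_attribute is a hypothetical helper introduced for illustration, not something from the standard library.

```python
class MyObject(object):
    pass


def can_set_attribute(obj, name="myattr", value=42):
    """Return True if an arbitrary attribute can be set on obj.

    Hypothetical helper, written only for this illustration.
    """
    try:
        setattr(obj, name, value)
    except AttributeError:
        return False
    return True


subclass_ok = can_set_attribute(MyObject())  # instance of the empty subclass
base_ok = can_set_attribute(object())        # instance of object itself

print("MyObject instance accepts new attributes: %s" % subclass_ok)
print("object instance accepts new attributes: %s" % base_ok)
```

The helper catches only AttributeError, which is the exception shown in the traceback above.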

My first guess was that the object class had a __slots__ attribute restricting which attributes could be set on it (see this article for an explanation of __slots__). One of the properties of __slots__ is that the restriction it imposes does not carry over to subclasses: a subclass that does not declare __slots__ of its own gets a per-instance __dict__ again. That would explain why we can set arbitrary attributes on instances of MyObject, which is a subclass of object, but not on instances of object itself. However, to my surprise, object does not define a __slots__ attribute:

>>> '__slots__' in dir(object)
False
>>> dir(object)
['__class__', '__delattr__', '__doc__', '__getattribute__',
'__hash__', '__init__', '__new__', '__reduce__',
'__reduce_ex__', '__repr__', '__setattr__', '__str__']

Look: no __slots__! So that isn't it.
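For comparison, here is roughly what the __slots__ hypothesis would have looked like had it been true. This is a sketch with made-up class names (Restricted, Child) and a made-up helper (try_set). Strictly speaking, the __slots__ attribute itself is still visible on a subclass; it is the restriction on new attributes that does not carry over.

```python
class Restricted(object):
    # Only 'x' may be set on instances; no per-instance __dict__ is created.
    __slots__ = ('x',)


class Child(Restricted):
    # No __slots__ declared here, so instances get a __dict__ again.
    pass


def try_set(obj, name):
    """Hypothetical helper: True if setting attribute `name` on obj works."""
    try:
        setattr(obj, name, 1)
    except AttributeError:
        return False
    return True


restricted_ok = try_set(Restricted(), 'y')   # 'y' is not listed in __slots__
child_ok = try_set(Child(), 'y')             # the subclass is unrestricted

# The class attribute is visible on the subclass, but absent from object.
slots_visible_on_child = '__slots__' in dir(Child)
slots_visible_on_object = '__slots__' in dir(object)

print("Restricted accepts 'y': %s" % restricted_ok)
print("Child accepts 'y': %s" % child_ok)
print("'__slots__' in dir(Child): %s" % slots_visible_on_child)
print("'__slots__' in dir(object): %s" % slots_visible_on_object)
```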

As far as I can tell, the fact that instances of object do not allow attributes to be set on them is simply an implementation artifact. They should, but they don't. Go figure. Luckily, there is seldom any need to create instances of object directly; the class really just exists as a base class for deriving new-style classes. But I do still find it odd that subclassing object somehow adds functionality not present in the base class.

2 comments:

Richard said...

I wanted to use an 'object' on which to hang some arbitrary attributes but like you I had to sub-class it :-(

I can't see why you shouldn't be able to use a base object in this way. Is this a bug?

Kelly Yancey said...

I think this is an implementation artifact: instances of many of the standard Python classes that are implemented in C (object included) do not have a __dict__ attribute and thus cannot have attributes added to them. As soon as you subclass them, though, instances of the subclass get a __dict__. The exception is if the subclass uses __slots__ to restrict the instance's attributes and does not include '__dict__' in the list: any attribute not explicitly named in __slots__ would have to be stored in the instance dictionary, and without '__dict__' in __slots__, Python does not create one.
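A quick way to see this is to ask each kind of instance whether it has a __dict__. The sketch below uses illustrative class names (MyObject, MyInt, Slotted, SlottedWithDict) chosen for this example only.

```python
class MyObject(object):
    pass


class MyInt(int):
    # Subclass of a C-implemented type, with no __slots__.
    pass


class Slotted(object):
    # __slots__ without '__dict__': no instance dictionary.
    __slots__ = ('x',)


class SlottedWithDict(object):
    # Listing '__dict__' asks Python to create the instance dictionary anyway.
    __slots__ = ('x', '__dict__')


has_dict = {
    'object': hasattr(object(), '__dict__'),
    'int': hasattr(1, '__dict__'),
    'MyObject': hasattr(MyObject(), '__dict__'),
    'MyInt': hasattr(MyInt(), '__dict__'),
    'Slotted': hasattr(Slotted(), '__dict__'),
    'SlottedWithDict': hasattr(SlottedWithDict(), '__dict__'),
}

for name in sorted(has_dict):
    print("%-16s has __dict__: %s" % (name, has_dict[name]))

# With '__dict__' in __slots__, arbitrary attributes work again.
extra = SlottedWithDict()
extra.anything = 42
```

Only the instances that report a __dict__ accept arbitrary new attributes.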

Anyway, I don't know the motivation (if any) for why the C-implemented classes don't have a __dict__ attribute. I would venture to guess that it is an optimization (much like __slots__) to reduce the cost of instantiating these commonly-used classes.
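One rough way to probe that guess is to compare instance sizes. This is only a sketch and makes two assumptions: it relies on sys.getsizeof, which was added in Python 2.6, and the numbers it reports are specific to CPython and vary between versions and platforms.

```python
import sys


class MyObject(object):
    pass


# Size of a bare object instance versus an instance of an empty subclass.
base_size = sys.getsizeof(object())
plain_size = sys.getsizeof(MyObject())

# The instance dictionary is a separate allocation on top of the instance.
dict_size = sys.getsizeof(MyObject().__dict__)

print("object():            %d bytes" % base_size)
print("MyObject():          %d bytes" % plain_size)
print("MyObject().__dict__: %d bytes" % dict_size)
```

On CPython the bare object instance comes out smaller than the subclass instance, which is at least consistent with the optimization guess, though it does not prove the motivation.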

As for whether or not it is a bug: I think it'll depend on who you ask. From a consistency point of view, I would agree with you that the current behavior is inconsistent and hence should be considered a bug.