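At the end of the previous article I gave a piece of code that involves descriptors and attribute lookup. The get-family functions (__get__, __getattr__, __getattribute__) are also easy to confuse with one another. This article briefly summarizes these topics.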
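- In Python everything is an object ("everything is object"), including classes, instances of classes, numbers, and modules
- Every object is an instance of some class (class or type)
- If a descriptor implements only __get__, we call it a non-data descriptor; if it implements both __get__ and __set__, we call it a data descriptor. (More precisely, defining __set__ or __delete__ is what makes a descriptor a data descriptor.)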
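The distinction in the last bullet can be checked mechanically. The following is a minimal sketch written for Python 3 (the listings in this article use Python 2); `classify`, `Data`, and `NonData` are hypothetical names introduced only for illustration.

```python
class NonData:
    """Defines only __get__, so it is a non-data descriptor."""
    def __get__(self, instance, owner):
        return "non-data"


class Data:
    """Defines __get__ and __set__, so it is a data descriptor."""
    def __get__(self, instance, owner):
        return "data"

    def __set__(self, instance, value):
        raise AttributeError("read-only")


def classify(obj):
    """Hypothetical helper: classify obj by the methods its type defines."""
    tp = type(obj)
    if hasattr(tp, "__set__") or hasattr(tp, "__delete__"):
        return "data descriptor"
    if hasattr(tp, "__get__"):
        return "non-data descriptor"
    return "not a descriptor"


print(classify(NonData()))        # a class with only __get__
print(classify(Data()))           # a class with __get__ and __set__
print(classify(property()))       # property defines __set__ and __delete__
print(classify(lambda: None))     # plain functions define only __get__
print(classify(42))               # int defines neither
```

The helper looks at the type rather than the instance because Python resolves these special methods on the type.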
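Instance attribute lookup
The implementation works through a precedence chain that gives data descriptors priority over instance variables, instance variables priority over non-data descriptors, and assigns lowest priority to __getattr__() if provided.
For obj = Clz(), the expression obj.attr is resolved in the following order:
(1) If "attr" appears in the __dict__ of Clz or one of its base classes, and attr is a data descriptor, call its __get__ method; otherwise
(2) If "attr" appears in obj's __dict__, return obj.__dict__['attr'] directly; otherwise
(3) If "attr" appears in the __dict__ of Clz or one of its base classes:
(3.1) if attr is a non-data descriptor, call its __get__ method; otherwise
(3.2) return Clz.__dict__['attr'] (the entry found on the class)
(4) If Clz has a __getattr__ method, call it; otherwise
(5) Raise AttributeError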
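The steps above can be sketched in pure Python. This is only an approximation of what the default lookup does for ordinary instances, written for Python 3; `lookup` and `find_in_mro` are hypothetical helpers, not part of any library.

```python
_MISSING = object()


def find_in_mro(cls, name):
    """Return the first entry called `name` in the __dict__ of cls or its bases."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    return _MISSING


def lookup(obj, name):
    """Hypothetical helper imitating `obj.<name>` for ordinary instances."""
    cls = type(obj)
    attr = find_in_mro(cls, name)
    attr_type = type(attr)
    has_get = attr is not _MISSING and hasattr(attr_type, "__get__")
    is_data = hasattr(attr_type, "__set__") or hasattr(attr_type, "__delete__")

    # (1) a data descriptor found on the class wins
    if has_get and is_data:
        return attr_type.__get__(attr, obj, cls)
    # (2) then the instance __dict__
    if name in obj.__dict__:
        return obj.__dict__[name]
    # (3) then the class entry: non-data descriptor (3.1) or plain value (3.2)
    if attr is not _MISSING:
        return attr_type.__get__(attr, obj, cls) if has_get else attr
    # (4) then __getattr__, if the class defines one
    hook = find_in_mro(cls, "__getattr__")
    if hook is not _MISSING:
        return hook(obj, name)
    # (5) otherwise fail
    raise AttributeError(name)


class Data:
    def __get__(self, instance, owner):
        return "from data descriptor"

    def __set__(self, instance, value):
        pass


class NonData:
    def __get__(self, instance, owner):
        return "from non-data descriptor"


class Clz:
    dd = Data()
    ndd = NonData()
    plain = "class value"

    def __getattr__(self, name):
        return "fallback for " + name


obj = Clz()
obj.__dict__.update(dd="instance dd", ndd="instance ndd")
results = {name: lookup(obj, name) for name in ("dd", "ndd", "plain", "nope")}
print(results)
```

For each of the four names, `lookup(obj, name)` should agree with the built-in `getattr(obj, name)`.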
 1 #coding=utf-8
 2 class DataDescriptor(object):
 3     def __init__(self, init_value):
 4         self.value = init_value
 5
 6     def __get__(self, instance, typ):
 7         return 'DataDescriptor __get__'
 8
 9     def __set__(self, instance, value):
10         print ('DataDescriptor __set__')
11         self.value = value
12
13 class NonDataDescriptor(object):
14     def __init__(self, init_value):
15         self.value = init_value
16
17     def __get__(self, instance, typ):
18         return('NonDataDescriptor __get__')
19
20 class Base(object):
21     dd_base = DataDescriptor(0)
22     ndd_base = NonDataDescriptor(0)
23
24
25 class Derive(Base):
26     dd_derive = DataDescriptor(0)
27     ndd_derive = NonDataDescriptor(0)
28     same_name_attr = 'attr in class'
29
30     def __init__(self):
31         self.not_des_attr = 'I am not descriptor attr'
32         self.same_name_attr = 'attr in object'
33
34     def __getattr__(self, key):
35         return '__getattr__ with key %s' % key
36
37     def change_attr(self):
38         self.__dict__['dd_base'] = 'dd_base now in object dict '
39         self.__dict__['ndd_derive'] = 'ndd_derive now in object dict '
40
41 def main():
42     b = Base()
43     d = Derive()
44     print 'Derive object dict', d.__dict__
45     assert d.dd_base == "DataDescriptor __get__"
46     assert d.ndd_derive == 'NonDataDescriptor __get__'
47     assert d.not_des_attr == 'I am not descriptor attr'
48     assert d.no_exists_key == '__getattr__ with key no_exists_key'
49     assert d.same_name_attr == 'attr in object'
50     d.change_attr()
51     print 'Derive object dict', d.__dict__
52     assert d.dd_base != 'dd_base now in object dict '
53     assert d.ndd_derive == 'ndd_derive now in object dict '
54
55     try:
56         b.no_exists_key
57     except Exception, e:
58         assert isinstance(e, AttributeError)
59
60 if __name__ == '__main__':
61     main()
Derive object dict {'same_name_attr': 'attr in object', 'not_des_attr': 'I am not descriptor attr'}
Derive object dict {'same_name_attr': 'attr in object', 'ndd_derive': 'ndd_derive now in object dict ', 'not_des_attr': 'I am not descriptor attr', 'dd_base': 'dd_base now in object dict '}
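After change_attr is called, dd_base appears both in the class __dict__ (as a data descriptor) and in the instance __dict__. Because of the attribute lookup order, the class entry Clz.__dict__['dd_base'] still takes priority, so its __get__ is called. ndd_derive also appears in the class __dict__, but because it is a non-data descriptor, obj.__dict__['ndd_derive'] is returned first. In addition, line 48 and line 56 show the effect of __getattr__: Derive defines it, so a missing attribute falls back to it, while Base does not, so AttributeError is raised. Line 49 shows that obj.__dict__ takes priority over an ordinary (non-descriptor) entry in Clz.__dict__.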
cached_property example
Let's look again at this code from the previous article.
import functools, time
class cached_property(object):
    """ A property that is only computed once per instance and then replaces
        itself with an ordinary attribute. Deleting the attribute resets the
        property. """

    def __init__(self, func):
        functools.update_wrapper(self, func)
        self.func = func

    def __get__(self, obj, cls):
        if obj is None: return self
        value = obj.__dict__[self.func.__name__] = self.func(obj)
        return value

class TestClz(object):
    @cached_property
    def complex_calc(self):
        print 'very complex_calc'
        return sum(range(100))

if __name__=='__main__':
    t = TestClz()
    print '>>> first call'
    print t.complex_calc
    print '>>> second call'
    print t.complex_calc
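cached_property is a non-data descriptor. In TestClz, the method complex_calc is decorated with cached_property, and the result of the decoration is a descriptor instance, which is why the attribute is accessed without parentheses. On the first access the instance __dict__ has no 'complex_calc' entry, so the descriptor's __get__ runs, computes the value, and stores it in obj.__dict__. On the second access the instance __dict__ entry takes priority over the non-data descriptor, so the function is not called again.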
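To make the caching visible without relying on printed output, here is a small Python 3 sketch of the same descriptor with a call counter added. The `calls` attribute is an assumption introduced for illustration and is not in the original code.

```python
import functools


class cached_property:
    """Same non-data descriptor as above, ported to Python 3."""

    def __init__(self, func):
        functools.update_wrapper(self, func)
        self.func = func

    def __get__(self, obj, cls):
        if obj is None:
            return self
        # Store the result in the instance __dict__ under the function's name,
        # so later lookups find it before they reach this descriptor.
        value = obj.__dict__[self.func.__name__] = self.func(obj)
        return value


class TestClz:
    def __init__(self):
        self.calls = 0  # illustrative counter, not in the original listing

    @cached_property
    def complex_calc(self):
        self.calls += 1
        return sum(range(100))


t = TestClz()
first = t.complex_calc            # runs the function, caches the result
second = t.complex_calc           # served from t.__dict__
calls_after_two_reads = t.calls

del t.complex_calc                # drops the cached entry from t.__dict__
third = t.complex_calc            # runs the function again
calls_after_reset = t.calls

print(first, second, third, calls_after_two_reads, calls_after_reset)
```

Deleting the attribute works because the descriptor defines no __delete__, so the deletion falls through to the instance __dict__. Python 3.8 and later also ship `functools.cached_property`, which offers similar behavior.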
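Class attribute lookup
As mentioned earlier, a class is also an object: a class is an instance of its metaclass, so class attribute lookup follows essentially the same order as above. The difference lies in step (2): because Clz may have base classes, "attr" is searched for in the __dict__ of Clz and of its base classes, and note that this search does not simply return Clz.__dict__['attr']. Specifically, step (2) splits into two cases:
(2.1) If Clz.__dict__['attr'] is a descriptor (whether a data descriptor or a non-data descriptor), call its __get__ method
(2.2) Otherwise return Clz.__dict__['attr']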
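The two cases can be observed directly. A minimal Python 3 sketch follows; `Tracer`, `Base`, and `Child` are hypothetical names used only for this illustration.

```python
class Tracer:
    """Non-data descriptor that reports how __get__ was invoked."""

    def __get__(self, instance, owner):
        return ("__get__", instance, owner)


class Base:
    traced = Tracer()   # a descriptor stored in Base.__dict__
    plain = 1           # an ordinary value stored in Base.__dict__


class Child(Base):
    pass


via_class = Child.traced          # case (2.1): __get__(None, Child) is called
raw = Base.__dict__["traced"]     # direct dict access bypasses __get__
plain_value = Child.plain         # case (2.2): the stored value is returned

print(via_class)
print(raw)
print(plain_value)
```

Note that the entry is found in a base class's __dict__, yet __get__ receives the class that was actually accessed (`Child`) as the owner, and `None` as the instance.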
This explains an interesting puzzle: the difference between a method and a function.
>>> class Widget(object):
... def func(self):
... pass
...
>>> w = Widget()
>>> Widget.__dict__
dict_proxy({'__dict__': <attribute '__dict__' of 'Widget' objects>, '__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'Widget' objects>, '__doc__': None, 'func': <function func at 0x7fdc7d0d1668>})
>>> w.__dict__
{}
>>> Widget.__dict__['func']
<function func at 0x7fdc7d0d1668>
>>> Widget.func
<unbound method Widget.func>
>>>
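Widget is a class that defines only a single function, func. func is an attribute of the class, which can be seen from Widget.__dict__ and w.__dict__. Widget.__dict__['func'] returns a function, but Widget.func is an unbound method, so Widget.func is not the same thing as Widget.__dict__['func']. Given the class attribute lookup order described above, we can suspect that func is a descriptor, because otherwise case (2.2) would apply and the plain function would be returned. This can be verified as follows (note '__get__' in the list):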
>>> dir(Widget.__dict__['func'])
['__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__doc__', '__format__', '__get__', '__getattribute__', '__globals__', '__hash__', '__init__', '__module__', '__name__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'func_closure', 'func_code', 'func_defaults', 'func_dict', 'func_doc', 'func_globals', 'func_name']
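The same check can be written as a short Python 3 sketch. One difference from the Python 2 session above is that Python 3 has no unbound methods, so accessing the function through the class returns the plain function, while accessing it through an instance still goes through the function's __get__ and yields a bound method.

```python
class Widget:
    def func(self):
        return "called on %s" % type(self).__name__


w = Widget()
raw = Widget.__dict__["func"]     # the plain function stored on the class

# Functions define __get__ but not __set__, so they are non-data descriptors.
is_non_data = hasattr(raw, "__get__") and not hasattr(raw, "__set__")

# Calling __get__ by hand produces what `w.func` produces: a bound method.
bound = raw.__get__(w, Widget)

print(is_non_data)
print(bound())
print(bound.__self__ is w, bound.__func__ is raw)
```

Because functions are non-data descriptors, an entry with the same name in the instance __dict__ would shadow the method, in line with step (2) of the instance lookup order.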
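Attribute assignment
Attribute assignment in Python is also affected by descriptors (data descriptors), and by the __setattr__ method as well. Python also has the built-in setattr: setattr(x, 'foobar', 123) is equivalent to x.foobar = 123, and both are called attribute assignment.
First, let's look at __setattr__: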
object.__setattr__(self, name, value)
Called when an attribute assignment is attempted. This is called instead of the normal mechanism
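So what is the normal mechanism? Put simply, it is x.__dict__['foobar'] = 123, regardless of whether 'foobar' was already an attribute of x (after the assignment it certainly is). However, if 'foobar' is a class attribute and a data descriptor, its __set__ method is called first. Let's look at an example: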
 1 class MaxValDes(object):
 2     def __init__(self, attr, max_val):
 3         self.attr = attr
 4         self.max_val = max_val
 5
 6     def __get__(self, instance, typ):
 7         return instance.__dict__[self.attr]
 8
 9     def __set__(self, instance, value):
10         instance.__dict__[self.attr] = min(self.max_val, value)
11         print 'MaxValDes __set__', self.attr, instance.__dict__[self.attr]
12
13 class Widget(object):
14     a = MaxValDes('a', 10)
15     def __init__(self):
16         self.a = 0
17
18     # def __setattr__(self, name, value):
19     #     self.__dict__[name] = value
20     #     print 'Widget __setattr__', name, self.__dict__[name]
21
22 if __name__ == '__main__':
23     w0 = Widget()
24     w0.a = 123
The output is:
MaxValDes __set__ a 0
MaxValDes __set__ a 10
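As you can see, even though the Widget instance also has an 'a' entry of its own, assigning to w0.a calls the __set__ method of the class attribute 'a' (a descriptor). If lines 18 to 20 (the __setattr__ method of Widget) are uncommented, the output becomes: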
Widget __setattr__ a 0
Widget __setattr__ a 123
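As you can see, the __setattr__ method of Widget is called first. Therefore, for attribute assignment with obj = Clz(), the statement obj.attr = var is handled in this order:
(1) If Clz defines a __setattr__ method, call that method; otherwise
(2) If "attr" appears in the __dict__ of Clz or one of its base classes, and attr is a data descriptor, call its __set__ method; otherwise
(3) Do the equivalent of obj.__dict__['attr'] = var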
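The three branches can be exercised side by side. This is a sketch for Python 3 that records calls in a list instead of printing; `Plain`, `Hooked`, `Delegating`, and `log` are hypothetical names added for illustration. The `Delegating` class is an addition that is not in the original example: it shows that a custom __setattr__ can hand control back to the default mechanism, which then applies steps (2) and (3).

```python
log = []


class MaxValDes:
    """Data descriptor that clamps the stored value to max_val."""

    def __init__(self, attr, max_val):
        self.attr = attr
        self.max_val = max_val

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__[self.attr]

    def __set__(self, instance, value):
        log.append("__set__")
        instance.__dict__[self.attr] = min(self.max_val, value)


class Plain:
    a = MaxValDes("a", 10)


class Hooked:
    a = MaxValDes("a", 10)

    def __setattr__(self, name, value):
        log.append("__setattr__")
        self.__dict__[name] = value            # writes directly, skipping __set__


class Delegating:
    a = MaxValDes("a", 10)

    def __setattr__(self, name, value):
        log.append("__setattr__")
        super().__setattr__(name, value)       # default mechanism honors __set__


p = Plain()
p.a = 123        # step (2): the data descriptor's __set__ clamps to 10
p.b = 5          # step (3): stored in p.__dict__
plain_log = list(log)

del log[:]
h = Hooked()
h.a = 123        # step (1): only __setattr__ runs, value is not clamped
hooked_log = list(log)

del log[:]
g = Delegating()
g.a = 123        # __setattr__ runs first, then delegates to __set__
delegating_log = list(log)

print(p.a, p.__dict__, plain_log)
print(h.a, hooked_log)
print(g.a, delegating_log)
```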