"""
=============================
Subclassing ndarray in python
=============================

Credits
-------

This page is based with thanks on the wiki page on subclassing by Pierre
Gerard-Marchant - http://www.scipy.org/Subclasses.

Introduction
------------

Subclassing ndarray is relatively simple, but it has some complications
compared to other Python objects. On this page we explain the machinery
that allows you to subclass ndarray, and the implications for
implementing a subclass.

ndarrays and object creation
============================

Subclassing ndarray is complicated by the fact that new instances of
ndarray classes can come about in three different ways. These are:

#. Explicit constructor call - as in ``MySubClass(params)``. This is
   the usual route to Python instance creation.
#. View casting - casting an existing ndarray as a given subclass.
#. New from template - creating a new instance from a template
   instance. Examples include returning slices from a subclassed array,
   creating return types from ufuncs, and copying arrays. See
   :ref:`new-from-template` for more details.

The last two are characteristic of ndarrays; they exist in order to
support things like array slicing. The complications of subclassing
ndarray are due to the mechanisms numpy has to support these latter two
routes of instance creation.

.. _view-casting:

View casting
------------

*View casting* is the standard ndarray mechanism by which you take an
ndarray of any subclass, and return a view of the array as another
(specified) subclass:

>>> import numpy as np
>>> # create a completely useless ndarray subclass
>>> class C(np.ndarray): pass
>>> # create a standard ndarray
>>> arr = np.zeros((3,))
>>> # take a view of it, as our useless subclass
>>> c_arr = arr.view(C)
>>> type(c_arr)
<class 'C'>

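As a rough illustration of what view casting does and does not do, the
sketch below checks that the cast array is a new Python object of the
requested subclass that still shares its data buffer with the original.
The class name ``Plain`` is made up for this example.

```python
import numpy as np

# Hypothetical do-nothing subclass, used only for this illustration.
class Plain(np.ndarray):
    pass

arr = np.zeros((3,))
p_arr = arr.view(Plain)        # view casting: no data is copied

# The view is a distinct object of the requested subclass ...
is_new_object = p_arr is not arr
is_subclass_instance = isinstance(p_arr, Plain)

# ... but it looks at the same memory, so a write through the view
# is visible through the original array.
p_arr[0] = 42.0
shares_buffer = np.shares_memory(arr, p_arr)
first_value = arr[0]
```

Because the data is shared, view casting is cheap even for large arrays,
but any in-place change made through the view also changes ``arr``.
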
.. _new-from-template:

Creating new from template
--------------------------

New instances of an ndarray subclass can also come about by a very
similar mechanism to :ref:`view-casting`, when numpy finds it needs to
create a new instance from a template instance. The most obvious place
this has to happen is when you are taking slices of subclassed arrays.
For example:

>>> v = c_arr[1:]
>>> type(v) # the view is of type 'C'
<class 'C'>
>>> v is c_arr # but it's a new instance
False

The slice is a *view* onto the original ``c_arr`` data. So, when we
take a view from the ndarray, we return a new ndarray, of the same
class, that points to the data in the original.

There are other points in the use of ndarrays where we need such views,
such as copying arrays (``c_arr.copy()``), creating ufunc output arrays
(see also :ref:`array-wrap`), and reducing methods (like
``c_arr.mean()``).

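The routes listed above can be checked with a short sketch. It assumes a
do-nothing subclass with the made-up name ``Plain`` and only looks at
slicing, copying and ufunc output.

```python
import numpy as np

class Plain(np.ndarray):       # hypothetical name, not part of NumPy
    pass

p_arr = np.arange(4.0).view(Plain)

sliced = p_arr[1:]             # slicing
copied = p_arr.copy()          # copying
added = np.add(p_arr, 1.0)     # ufunc output

# Each result is a new instance of the subclass.
result_types = [type(x) for x in (sliced, copied, added)]

# Only the slice is a view onto the original data; the copy has its
# own buffer.
slice_shares = np.shares_memory(sliced, p_arr)
copy_shares = np.shares_memory(copied, p_arr)
```

In every case numpy uses the existing instance as the template for the
class of the result, whether or not the data itself is shared.
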
Relationship of view casting and new-from-template
--------------------------------------------------

These paths both use the same machinery. We make the distinction here,
because they result in different input to your methods. Specifically,
:ref:`view-casting` means you have created a new instance of your array
type from any potential subclass of ndarray. :ref:`new-from-template`
means you have created a new instance of your class from a pre-existing
instance, allowing you - for example - to copy across attributes that
are particular to your subclass.

Implications for subclassing
----------------------------

If we subclass ndarray, we need to deal not only with explicit
construction of our array type, but also :ref:`view-casting` or
:ref:`new-from-template`. Numpy has the machinery to do this, and it is
this machinery that makes subclassing slightly non-standard.

There are two aspects to the machinery that ndarray uses to support
views and new-from-template in subclasses.

The first is the use of the ``ndarray.__new__`` method for the main work
of object initialization, rather than the more usual ``__init__``
method. The second is the use of the ``__array_finalize__`` method to
allow subclasses to clean up after the creation of views and new
instances from templates.

A brief Python primer on ``__new__`` and ``__init__``
=====================================================

``__new__`` is a standard Python method, and, if present, is called
before ``__init__`` when we create a class instance. See the `python
__new__ documentation
<http://docs.python.org/reference/datamodel.html#object.__new__>`_ for more detail.

For example, consider the following Python code:

.. testcode::

  class C(object):
      def __new__(cls, *args):
          print('Cls in __new__:', cls)
          print('Args in __new__:', args)
          # object.__new__ accepts only the class itself here
          return object.__new__(cls)

      def __init__(self, *args):
          print('type(self) in __init__:', type(self))
          print('Args in __init__:', args)

meaning that we get:

>>> c = C('hello')
Cls in __new__: <class 'C'>
Args in __new__: ('hello',)
type(self) in __init__: <class 'C'>
Args in __init__: ('hello',)

When we call ``C('hello')``, the ``__new__`` method gets its own class
as first argument, and the passed argument, which is the string
``'hello'``. After python calls ``__new__``, it usually (see below)
calls our ``__init__`` method, with the output of ``__new__`` as the
first argument (now a class instance), and the passed arguments
following.

As you can see, the object can be initialized in the ``__new__``
method or the ``__init__`` method, or both, and in fact ndarray does
not have an ``__init__`` method, because all the initialization is
done in the ``__new__`` method.

Why use ``__new__`` rather than just the usual ``__init__``? Because
in some cases, as for ndarray, we want to be able to return an object
of some other class. Consider the following:

.. testcode::

  class D(C):
      def __new__(cls, *args):
          print('D cls is:', cls)
          print('D args in __new__:', args)
          return C.__new__(C, *args)

      def __init__(self, *args):
          # we never get here
          print('In D __init__')

meaning that:

>>> obj = D('hello')
D cls is: <class 'D'>
D args in __new__: ('hello',)
Cls in __new__: <class 'C'>
Args in __new__: ('hello',)
>>> type(obj)
<class 'C'>

The definition of ``C`` is the same as before, but for ``D``, the
``__new__`` method returns an instance of class ``C`` rather than
``D``. Note that the ``__init__`` method of ``D`` does not get
called. In general, when the ``__new__`` method returns an object of
a class other than the class in which it is defined, the ``__init__``
method of that class is not called.

This is how subclasses of the ndarray class are able to return views
that preserve the class type. When taking a view, the standard
ndarray machinery creates the new ndarray object with something
like::

  obj = ndarray.__new__(subtype, shape, ...

where ``subtype`` is the subclass. Thus the returned view is of the
same class as the subclass, rather than being of class ``ndarray``.

That solves the problem of returning views of the same type, but now
we have a new problem. The machinery of ndarray can set the class
this way, in its standard methods for taking views, but the ndarray
``__new__`` method knows nothing of what we have done in our own
``__new__`` method in order to set attributes, and so on. (Aside -
why not call ``obj = subtype.__new__(...`` then? Because we may not
have a ``__new__`` method with the same call signature).

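A small sketch can make the consequence concrete. It assumes a
hypothetical subclass ``Logged`` whose ``__new__`` records each call in a
module-level list named ``calls``; both names are invented for this
illustration.

```python
import numpy as np

calls = []                     # records every call to Logged.__new__

class Logged(np.ndarray):      # hypothetical subclass for this sketch
    def __new__(cls, shape):
        calls.append(shape)
        return super().__new__(cls, shape)

# Explicit construction goes through Logged.__new__ ...
a = Logged((2,))
calls_after_constructor = list(calls)

# ... but calling ndarray.__new__ directly with the subclass as first
# argument, or view casting, builds a Logged instance without it.
b = np.ndarray.__new__(Logged, (2,))
c = np.zeros((2,)).view(Logged)
calls_after_views = list(calls)
```

The log holds a single entry after all three creations, so anything that
``Logged.__new__`` would have set up is missing from ``b`` and ``c``.
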
The role of ``__array_finalize__``
==================================

``__array_finalize__`` is the mechanism that numpy provides to allow
subclasses to handle the various ways that new instances get created.

Remember that subclass instances can come about in these three ways:

#. explicit constructor call (``obj = MySubClass(params)``). This will
   call the usual sequence of ``MySubClass.__new__`` then (if it exists)
   ``MySubClass.__init__``.
#. :ref:`view-casting`
#. :ref:`new-from-template`

Our ``MySubClass.__new__`` method only gets called in the case of the
explicit constructor call, so we can't rely on ``MySubClass.__new__`` or
``MySubClass.__init__`` to deal with the view casting and
new-from-template. It turns out that ``MySubClass.__array_finalize__``
*does* get called for all three methods of object creation, so this is
where our object creation housekeeping usually goes.

* For the explicit constructor call, our subclass will need to create a
  new ndarray instance of its own class. In practice this means that
  we, the authors of the code, will need to make a call to
  ``ndarray.__new__(MySubClass,...)``, or do view casting of an existing
  array (see below).
* For view casting and new-from-template, the equivalent of
  ``ndarray.__new__(MySubClass,...)`` is called, at the C level.

The arguments that ``__array_finalize__`` receives differ for the three
methods of instance creation above.

The following code allows us to look at the call sequences and arguments:

.. testcode::

  import numpy as np

  class C(np.ndarray):
      def __new__(cls, *args, **kwargs):
          print('In __new__ with class %s' % cls)
          return np.ndarray.__new__(cls, *args, **kwargs)

      def __init__(self, *args, **kwargs):
          # in practice you probably will not need or want an __init__
          # method for your subclass
          print('In __init__ with class %s' % self.__class__)

      def __array_finalize__(self, obj):
          print('In array_finalize:')
          print('   self type is %s' % type(self))
          print('   obj type is %s' % type(obj))


Now:

>>> # Explicit constructor
>>> c = C((10,))
In __new__ with class <class 'C'>
In array_finalize:
   self type is <class 'C'>
   obj type is <class 'NoneType'>
In __init__ with class <class 'C'>
>>> # View casting
>>> a = np.arange(10)
>>> cast_a = a.view(C)
In array_finalize:
   self type is <class 'C'>
   obj type is <class 'numpy.ndarray'>
>>> # Slicing (example of new-from-template)
>>> cv = c[:1]
In array_finalize:
   self type is <class 'C'>
   obj type is <class 'C'>

The signature of ``__array_finalize__`` is::

  def __array_finalize__(self, obj):

``ndarray.__new__`` passes ``__array_finalize__`` the new object, of our
own class (``self``) as well as the object from which the view has been
taken (``obj``). As you can see from the output above, the ``self`` is
always a newly created instance of our subclass, and the type of ``obj``
differs for the three instance creation methods:

* When called from the explicit constructor, ``obj`` is ``None``
* When called from view casting, ``obj`` can be an instance of any
  subclass of ndarray, including our own.
* When called in new-from-template, ``obj`` is another instance of our
  own subclass, that we might use to update the new ``self`` instance.

Because ``__array_finalize__`` is the only method that always sees new
instances being created, it is the sensible place to fill in instance
defaults for new object attributes, among other tasks.

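The three cases can be observed without any printing by recording
``type(obj)`` on each call. The sketch below assumes a hypothetical
subclass ``Traced`` with a class-level list ``seen``; neither name comes
from numpy.

```python
import numpy as np

class Traced(np.ndarray):      # hypothetical subclass for this sketch
    seen = []                  # class-level log of type(obj) per call

    def __array_finalize__(self, obj):
        Traced.seen.append(type(obj))

t = Traced((3,))               # explicit constructor -> obj is None
v = np.arange(3).view(Traced)  # view casting         -> obj is an ndarray
s = t[:2]                      # new-from-template    -> obj is a Traced

seen_types = list(Traced.seen)
```

Note that view casting from an existing ``Traced`` instance would also
record ``Traced``, so the type of ``obj`` alone does not always tell the
last two routes apart.
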
This may be clearer with an example.

Simple example - adding an extra attribute to ndarray
-----------------------------------------------------

.. testcode::

  import numpy as np

  class InfoArray(np.ndarray):

      def __new__(subtype, shape, dtype=float, buffer=None, offset=0,
                  strides=None, order=None, info=None):
          # Create the ndarray instance of our type, given the usual
          # ndarray input arguments. This will call the standard
          # ndarray constructor, but return an object of our type.
          # It also triggers a call to InfoArray.__array_finalize__
          obj = np.ndarray.__new__(subtype, shape, dtype, buffer, offset, strides,
                                   order)
          # set the new 'info' attribute to the value passed
          obj.info = info
          # Finally, we must return the newly created object:
          return obj

      def __array_finalize__(self, obj):
          # ``self`` is a new object resulting from
          # ndarray.__new__(InfoArray, ...), therefore it only has
          # attributes that the ndarray.__new__ constructor gave it -
          # i.e. those of a standard ndarray.
          #
          # We could have got to the ndarray.__new__ call in 3 ways:
          # From an explicit constructor - e.g. InfoArray():
          #    obj is None
          #    (we're in the middle of the InfoArray.__new__
          #    constructor, and self.info will be set when we return to
          #    InfoArray.__new__)
          if obj is None: return
          # From view casting - e.g. arr.view(InfoArray):
          #    obj is arr
          #    (type(obj) can be InfoArray)
          # From new-from-template - e.g. infoarr[:3]
          #    type(obj) is InfoArray
          #
          # Note that it is here, rather than in the __new__ method,
          # that we set the default value for 'info', because this
          # method sees all creation of default objects - with the
          # InfoArray.__new__ constructor, but also with
          # arr.view(InfoArray).
          self.info = getattr(obj, 'info', None)
          # We do not need to return anything


Using the object looks like this:

>>> obj = InfoArray(shape=(3,)) # explicit constructor
>>> type(obj)
<class 'InfoArray'>
>>> obj.info is None
True
>>> obj = InfoArray(shape=(3,), info='information')
>>> obj.info
'information'
>>> v = obj[1:] # new-from-template - here - slicing
>>> type(v)
<class 'InfoArray'>
>>> v.info
'information'
>>> arr = np.arange(10)
>>> cast_arr = arr.view(InfoArray) # view casting
>>> type(cast_arr)
<class 'InfoArray'>
>>> cast_arr.info is None
True

This class isn't very useful, because it has the same constructor as the
bare ndarray object, including passing in buffers and shapes and so on.
We would probably prefer the constructor to be able to take an already
formed ndarray from the usual numpy calls to ``np.array`` and return an
object.

Slightly more realistic example - attribute added to existing array
-------------------------------------------------------------------

Here is a class that takes a standard ndarray that already exists, casts
as our type, and adds an extra attribute.

.. testcode::

  import numpy as np

  class RealisticInfoArray(np.ndarray):

      def __new__(cls, input_array, info=None):
          # Input array is an already formed ndarray instance
          # We first cast to be our class type
          obj = np.asarray(input_array).view(cls)
          # add the new attribute to the created instance
          obj.info = info
          # Finally, we must return the newly created object:
          return obj

      def __array_finalize__(self, obj):
          # see InfoArray.__array_finalize__ for comments
          if obj is None: return
          self.info = getattr(obj, 'info', None)


So:

>>> arr = np.arange(5)
>>> obj = RealisticInfoArray(arr, info='information')
>>> type(obj)
<class 'RealisticInfoArray'>
>>> obj.info
'information'
>>> v = obj[1:]
>>> type(v)
<class 'RealisticInfoArray'>
>>> v.info
'information'

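The same pattern carries the attribute through more than slicing. The
sketch below is an assumption-laden variation: the class ``UnitArray``
and its ``unit`` attribute are invented here, and only a ufunc, a
reshape and a cast back to ``ndarray`` are checked.

```python
import numpy as np

class UnitArray(np.ndarray):   # hypothetical class, same pattern as above
    def __new__(cls, input_array, unit=None):
        obj = np.asarray(input_array).view(cls)
        obj.unit = unit
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        self.unit = getattr(obj, 'unit', None)

lengths = UnitArray([1.0, 2.0, 3.0], unit='m')

doubled = lengths * 2              # ufunc output
reshaped = lengths.reshape(3, 1)   # new-from-template
plain = lengths.view(np.ndarray)   # cast back to a base ndarray

units = (doubled.unit, reshaped.unit)
plain_has_unit = hasattr(plain, 'unit')
```

The attribute is copied blindly, so ``lengths * lengths`` would still
report ``'m'``; deciding what the attribute should be after an operation
is left to the subclass author.
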
.. _array-wrap:

``__array_wrap__`` for ufuncs
-----------------------------

``__array_wrap__`` gets called at the end of numpy ufuncs and other numpy
functions, to allow a subclass to set the type of the return value
and update attributes and metadata. Let's show how this works with an example.
First we make the same subclass as above, but with a different name and
some print statements:

.. testcode::

  import numpy as np

  class MySubClass(np.ndarray):

      def __new__(cls, input_array, info=None):
          obj = np.asarray(input_array).view(cls)
          obj.info = info
          return obj

      def __array_finalize__(self, obj):
          print('In __array_finalize__:')
          print('   self is %s' % repr(self))
          print('   obj is %s' % repr(obj))
          if obj is None: return
          self.info = getattr(obj, 'info', None)

      def __array_wrap__(self, out_arr, context=None):
          print('In __array_wrap__:')
          print('   self is %s' % repr(self))
          print('   arr is %s' % repr(out_arr))
          # then just call the parent
          return np.ndarray.__array_wrap__(self, out_arr, context)

We run a ufunc on an instance of our new array:

>>> obj = MySubClass(np.arange(5), info='spam')
In __array_finalize__:
   self is MySubClass([0, 1, 2, 3, 4])
   obj is array([0, 1, 2, 3, 4])
>>> arr2 = np.arange(5)+1
>>> ret = np.add(arr2, obj)
In __array_wrap__:
   self is MySubClass([0, 1, 2, 3, 4])
   arr is array([1, 3, 5, 7, 9])
In __array_finalize__:
   self is MySubClass([1, 3, 5, 7, 9])
   obj is MySubClass([0, 1, 2, 3, 4])
>>> ret
MySubClass([1, 3, 5, 7, 9])
>>> ret.info
'spam'

Note that the ufunc (``np.add``) has called the ``__array_wrap__`` method of the
input with the highest ``__array_priority__`` value, in this case
``MySubClass.__array_wrap__``, with arguments ``self`` as ``obj``, and
``out_arr`` as the (ndarray) result of the addition. In turn, the
default ``__array_wrap__`` (``ndarray.__array_wrap__``) has cast the
result to class ``MySubClass``, and called ``__array_finalize__`` -
hence the copying of the ``info`` attribute. This has all happened at the C level.

But, we could do anything we wanted:

.. testcode::

  class SillySubClass(np.ndarray):

      def __array_wrap__(self, arr, context=None):
          return 'I lost your data'

>>> arr1 = np.arange(5)
>>> obj = arr1.view(SillySubClass)
>>> arr2 = np.arange(5)
>>> ret = np.multiply(obj, arr2)
>>> ret
'I lost your data'

So, by defining a specific ``__array_wrap__`` method for our subclass,
we can tweak the output from ufuncs. The ``__array_wrap__`` method
requires ``self``, then an argument - which is the result of the ufunc -
and an optional parameter *context*. This parameter is returned by some
ufuncs as a 3-element tuple: (name of the ufunc, arguments of the ufunc,
domain of the ufunc). ``__array_wrap__`` should return an instance of
its containing class. See the masked array subclass for an
implementation.

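A more useful override keeps the data and only adds metadata. The sketch
below is one possible shape for that, under stated assumptions: the
class ``Tagged`` and its ``made_by`` attribute are invented, the trailing
``*extra`` parameter is a guess at how to tolerate additional positional
arguments that newer numpy releases may pass, and the first element of
*context* is read defensively because its exact form is not guaranteed
here.

```python
import numpy as np

class Tagged(np.ndarray):      # hypothetical subclass for this sketch
    def __array_finalize__(self, obj):
        # Default for every creation route; overwritten in __array_wrap__.
        self.made_by = getattr(obj, 'made_by', None)

    def __array_wrap__(self, out_arr, context=None, *extra):
        # *extra is an assumption: it absorbs any further positional
        # arguments that newer NumPy releases may pass.
        result = super().__array_wrap__(out_arr, context, *extra)
        if context is not None:
            func = context[0]
            result.made_by = getattr(func, '__name__', str(func))
        return result

t = np.arange(3.0).view(Tagged)
total = np.add(t, 1.0)
tag = total.made_by
```

Calling the parent ``__array_wrap__`` first means the result has already
been cast to the subclass and passed through ``__array_finalize__``
before the extra attribute is set.
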
In addition to ``__array_wrap__``, which is called on the way out of the
ufunc, there is also an ``__array_prepare__`` method which is called on
the way into the ufunc, after the output arrays are created but before any
computation has been performed. The default implementation does nothing
but pass through the array. ``__array_prepare__`` should not attempt to
access the array data or resize the array; it is intended for setting the
output array type, updating attributes and metadata, and performing any
checks based on the input that may be desired before computation begins.
Like ``__array_wrap__``, ``__array_prepare__`` must return an ndarray or
subclass thereof or raise an error.

Extra gotchas - custom ``__del__`` methods and ndarray.base
-----------------------------------------------------------

One of the problems that ndarray solves is keeping track of memory
ownership of ndarrays and their views. Consider the case where we have
created an ndarray, ``arr`` and have taken a slice with ``v = arr[1:]``.
The two objects are looking at the same memory. Numpy keeps track of
where the data came from for a particular array or view, with the
``base`` attribute:

>>> # A normal ndarray, that owns its own data
>>> arr = np.zeros((4,))
>>> # In this case, base is None
>>> arr.base is None
True
>>> # We take a view
>>> v1 = arr[1:]
>>> # base now points to the array that it derived from
>>> v1.base is arr
True
>>> # Take a view of a view
>>> v2 = v1[1:]
>>> # base points to the original array that owns the data
>>> v2.base is arr
True

In general, if the array owns its own memory, as for ``arr`` in this
case, then ``arr.base`` will be None - there are some exceptions to this
- see the numpy book for more details.

The ``base`` attribute is useful in being able to tell whether we have
a view or the original array. This in turn can be useful if we need
to know whether or not to do some specific cleanup when the subclassed
array is deleted. For example, we may only want to do the cleanup if
the original array is deleted, but not the views. For an example of
how this can work, have a look at the ``memmap`` class in
``numpy.core``.


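The idea of cleaning up only for the owning array can be sketched as
follows. This is not how ``memmap`` is implemented; the class
``Resource``, its ``close`` method and the ``released`` list are invented
for the illustration, and an explicit method is used instead of
``__del__`` so that the behaviour does not depend on garbage collection
timing.

```python
import numpy as np

released = []                  # stands in for real cleanup work

class Resource(np.ndarray):    # hypothetical subclass for this sketch
    def close(self):
        # Only the array that owns the data performs the cleanup;
        # views (whose ``base`` is not None) leave it alone.
        if self.base is None:
            released.append('owner')

owner = Resource((4,))
view = owner[1:]

view.close()                   # no effect: this is a view
after_view = list(released)
owner.close()                  # the owner runs the cleanup
after_owner = list(released)
```

The same ``self.base is None`` test could guard the body of a
``__del__`` method, with the usual caveats about when Python runs
finalizers.
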
"""
from __future__ import division, absolute_import, print_function