annotate DEPENDENCIES/mingw32/Python27/Lib/site-packages/numpy/doc/structured_arrays.py @ 133:4acb5d8d80b6 tip

Don't fail environmental check if README.md exists (but .txt and no-suffix don't)
author Chris Cannam
date Tue, 30 Jul 2019 12:25:44 +0100
parents 2a2c65a20a8b
rev   line source
Chris@87 1 """
Chris@87 2 =====================================
Chris@87 3 Structured Arrays (and Record Arrays)
Chris@87 4 =====================================
Chris@87 5
Chris@87 6 Introduction
Chris@87 7 ============
Chris@87 8
Chris@87 9 Numpy provides powerful capabilities to create arrays of structs or records.
Chris@87 10 These arrays permit one to manipulate the data by the structs or by fields of
Chris@87 11 the struct. A simple example will show what is meant: ::
Chris@87 12
Chris@87 13     >>> x = np.zeros((2,),dtype=('i4,f4,a10'))
Chris@87 14     >>> x[:] = [(1,2.,'Hello'),(2,3.,"World")]
Chris@87 15     >>> x
Chris@87 16     array([(1, 2.0, 'Hello'), (2, 3.0, 'World')],
Chris@87 17           dtype=[('f0', '>i4'), ('f1', '>f4'), ('f2', '|S10')])
Chris@87 18
Chris@87 19 Here we have created a one-dimensional array of length 2. Each element of
Chris@87 20 this array is a record that contains three items, a 32-bit integer, a 32-bit
Chris@87 21 float, and a string of length 10 or less. If we index this array at the second
Chris@87 22 position we get the second record: ::
Chris@87 23
Chris@87 24     >>> x[1]
Chris@87 25     (2, 3.0, 'World')
Chris@87 26
Chris@87 27 Conveniently, one can access any field of the array by indexing using the
Chris@87 28 string that names that field. In this case the fields have received the
Chris@87 29 default names 'f0', 'f1' and 'f2'. ::
Chris@87 30
Chris@87 31     >>> y = x['f1']
Chris@87 32     >>> y
Chris@87 33     array([ 2., 3.], dtype=float32)
Chris@87 34     >>> y[:] = 2*y
Chris@87 35     >>> y
Chris@87 36     array([ 4., 6.], dtype=float32)
Chris@87 37     >>> x
Chris@87 38     array([(1, 4.0, 'Hello'), (2, 6.0, 'World')],
Chris@87 39           dtype=[('f0', '>i4'), ('f1', '>f4'), ('f2', '|S10')])
Chris@87 40
Chris@87 41 In these examples, y is a simple float array consisting of the 2nd field
Chris@87 42 in the record. But, rather than being a copy of the data in the structured
Chris@87 43 array, it is a view, i.e., it shares exactly the same memory locations.
Chris@87 44 Thus, when we updated this array by doubling its values, the structured
Chris@87 45 array shows the corresponding values as doubled as well. Likewise, if one
Chris@87 46 changes the record, the field view also changes: ::
Chris@87 47
Chris@87 48     >>> x[1] = (-1,-1.,"Master")
Chris@87 49     >>> x
Chris@87 50     array([(1, 4.0, 'Hello'), (-1, -1.0, 'Master')],
Chris@87 51           dtype=[('f0', '>i4'), ('f1', '>f4'), ('f2', '|S10')])
Chris@87 52     >>> y
Chris@87 53     array([ 4., -1.], dtype=float32)
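The view behaviour described above can be checked directly. The following is a minimal sketch, not part of the original text; it assumes a NumPy version that provides `np.shares_memory`, and it uses the `S10` type code and byte-string literals so that it also runs on Python 3. The variable names are illustrative only.

```python
import numpy as np

# A two-record structured array; 'S10' is a fixed-length byte string
# (assumption: used here in place of the older 'a10' spelling).
x = np.zeros((2,), dtype='i4,f4,S10')
x[:] = [(1, 2.0, b'Hello'), (2, 3.0, b'World')]

# Indexing with a field name returns a view, not a copy.
y = x['f1']
shares = np.shares_memory(x, y)

# Writing through the view changes the parent array.
y[:] = 2 * y
doubled = x['f1'].tolist()

# Writing a whole record changes what the view sees.
x[1] = (-1, -1.0, b'Master')
after_record_write = y.tolist()
```

Here `shares` records whether the two arrays overlap in memory, while `doubled` and `after_record_write` capture the field values after each write.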
Chris@87 54
Chris@87 55 Defining Structured Arrays
Chris@87 56 ==========================
Chris@87 57
Chris@87 58 A structured array is defined through its dtype object. There are
Chris@87 59 **several** alternative ways to define the fields of a record. Some of
Chris@87 60 these variants exist only for backward compatibility with Numeric,
Chris@87 61 numarray, or another module, and should not be used for any other
Chris@87 62 purpose; they are noted as such below. The record structure is specified
Chris@87 63 by a single argument, supplied either to the dtype keyword of a function
Chris@87 64 or to the dtype constructor itself. This argument must be one of the
Chris@87 65 following: 1) string, 2) tuple, 3) list, or 4) dictionary. Each of these
Chris@87 66 is briefly described below.
Chris@87 67
Chris@87 68 1) String argument (as used in the above examples).
Chris@87 69 In this case, the constructor expects a comma-separated list of type
Chris@87 70 specifiers, optionally with extra shape information.
Chris@87 71 The type specifiers can take 4 different forms: ::
Chris@87 72
Chris@87 73     a) b1, i1, i2, i4, i8, u1, u2, u4, u8, f2, f4, f8, c8, c16, a<n>
Chris@87 74        (representing booleans, ints, unsigned ints, floats, complex and
Chris@87 75         fixed length strings of specified byte lengths)
Chris@87 76     b) int8,...,uint8,...,float16, float32, float64, complex64, complex128
Chris@87 77        (this time with bit sizes)
Chris@87 78     c) older Numeric/numarray type specifications (e.g. Float32).
Chris@87 79        Don't use these in new code!
Chris@87 80     d) Single character type specifiers (e.g. H for unsigned short ints).
Chris@87 81        Avoid using these unless you must. Details can be found in the
Chris@87 82        Numpy book.
Chris@87 83
Chris@87 84 These different styles can be mixed within the same string (but why would you
Chris@87 85 want to do that?). Furthermore, each type specifier can be prefixed
Chris@87 86 with a repetition number, or a shape. In these cases an array
Chris@87 87 element is created, i.e., an array within a record. That array
Chris@87 88 is still referred to as a single field. An example: ::
Chris@87 89
Chris@87 90     >>> x = np.zeros(3, dtype='3int8, float32, (2,3)float64')
Chris@87 91     >>> x
Chris@87 92     array([([0, 0, 0], 0.0, [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]),
Chris@87 93            ([0, 0, 0], 0.0, [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]),
Chris@87 94            ([0, 0, 0], 0.0, [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])],
Chris@87 95           dtype=[('f0', '|i1', 3), ('f1', '>f4'), ('f2', '>f8', (2, 3))])
Chris@87 96
Chris@87 97 Defining the record structure with a string precludes naming the
Chris@87 98 fields in the original definition. The names can be changed
Chris@87 99 afterwards, however, as shown later.
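As a rough illustration of the string form, the sketch below builds the same dtype as the example above and inspects the default field names and the sub-array shapes. It is not taken from the original text, and the variable names are illustrative.

```python
import numpy as np

# Three fields: a length-3 int8 sub-array, a float32 scalar,
# and a (2, 3) float64 sub-array.
dt = np.dtype('3int8, float32, (2,3)float64')

# Default names are assigned because the string form cannot name fields.
names = dt.names

x = np.zeros(3, dtype=dt)

# Indexing a sub-array field appends the field shape to the array shape.
f0_shape = x['f0'].shape
f2_shape = x['f2'].shape
```

Each sub-array is still a single field, which is why `names` has three entries rather than one per scalar.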
Chris@87 100
Chris@87 101 2) Tuple argument: The only relevant tuple case that applies to record
Chris@87 102 structures is when a structure is mapped to an existing data type. This
Chris@87 103 is done by pairing, in a tuple, the existing data type with a matching
Chris@87 104 dtype definition (using any of the variants described here). As
Chris@87 105 an example (the definition here uses a list; see 3) for further
Chris@87 106 details): ::
Chris@87 107
Chris@87 108     >>> x = np.zeros(3, dtype=('i4',[('r','u1'), ('g','u1'), ('b','u1'), ('a','u1')]))
Chris@87 109     >>> x
Chris@87 110     array([0, 0, 0])
Chris@87 111     >>> x['r']
Chris@87 112     array([0, 0, 0], dtype=uint8)
Chris@87 113
Chris@87 114 In this case, an array is produced that looks and acts like a simple int32 array,
Chris@87 115 but also has definitions for fields that use only one byte of the int32 (a bit
Chris@87 116 like Fortran equivalencing).
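The byte-level overlay can be made visible by storing a recognisable value in the integer and reading the fields back. This is only a sketch under stated assumptions: the base type is written as `'<i4'` (explicit little-endian, not in the original example) so that the result does not depend on the host byte order, and it assumes the installed NumPy still accepts this tuple form.

```python
import numpy as np

# Assumption: explicit little-endian base type so the byte layout is fixed.
dt = np.dtype(('<i4', [('r', 'u1'), ('g', 'u1'), ('b', 'u1'), ('a', 'u1')]))
x = np.zeros(3, dtype=dt)

# The array behaves like a plain 32-bit integer array...
x[0] = 0x04030201

# ...while each named field exposes one byte of every element.
channels = [int(x[name][0]) for name in ('r', 'g', 'b', 'a')]
```

With a little-endian layout the least significant byte comes first, so the fields read back in the order the bytes were packed.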
Chris@87 117
Chris@87 118 3) List argument: In this case the record structure is defined with a list of
Chris@87 119 tuples. Each tuple has 2 or 3 elements specifying: 1) The name of the field
Chris@87 120 ('' is permitted), 2) the type of the field, and 3) the shape (optional).
Chris@87 121 For example::
Chris@87 122
Chris@87 123     >>> x = np.zeros(3, dtype=[('x','f4'),('y',np.float32),('value','f4',(2,2))])
Chris@87 124     >>> x
Chris@87 125     array([(0.0, 0.0, [[0.0, 0.0], [0.0, 0.0]]),
Chris@87 126            (0.0, 0.0, [[0.0, 0.0], [0.0, 0.0]]),
Chris@87 127            (0.0, 0.0, [[0.0, 0.0], [0.0, 0.0]])],
Chris@87 128           dtype=[('x', '>f4'), ('y', '>f4'), ('value', '>f4', (2, 2))])
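A short sketch of working with a dtype defined this way follows. It reuses the field names from the example above and writes a 2x2 block into the sub-array field of one record; the variable names are illustrative and not from the original.

```python
import numpy as np

# Each tuple is (name, type) or (name, type, shape).
dt = np.dtype([('x', 'f4'), ('y', np.float32), ('value', 'f4', (2, 2))])
x = np.zeros(3, dtype=dt)

# Assign a 2x2 block to the sub-array field of the first record.
x['value'][0] = [[1.0, 2.0], [3.0, 4.0]]
first = x[0]['value'].tolist()

# Two float32 scalars plus four float32 values per record.
record_bytes = dt.itemsize
```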
Chris@87 129
Chris@87 130 4) Dictionary argument: two different forms are permitted. The first consists
Chris@87 131 of a dictionary with two required keys ('names' and 'formats'), each having an
Chris@87 132 equal sized list of values. The format list contains any type/shape specifier
Chris@87 133 allowed in other contexts. The names must be strings. There are two optional
Chris@87 134 keys: 'offsets' and 'titles'. Each must be a list of the same length as the
Chris@87 135 two required lists: 'offsets' holds the integer byte offset of each field, and
Chris@87 136 'titles' holds an object carrying metadata for each field (these do not have
Chris@87 137 to be strings, and None is permitted). As an example: ::
Chris@87 138
Chris@87 139     >>> x = np.zeros(3, dtype={'names':['col1', 'col2'], 'formats':['i4','f4']})
Chris@87 140     >>> x
Chris@87 141     array([(0, 0.0), (0, 0.0), (0, 0.0)],
Chris@87 142           dtype=[('col1', '>i4'), ('col2', '>f4')])
Chris@87 143
Chris@87 144 The other dictionary form permitted is a dictionary of name keys with tuple
Chris@87 145 values specifying type, offset, and an optional title. ::
Chris@87 146
Chris@87 147     >>> x = np.zeros(3, dtype={'col1':('i1',0,'title 1'), 'col2':('f4',1,'title 2')})
Chris@87 148     >>> x
Chris@87 149     array([(0, 0.0), (0, 0.0), (0, 0.0)],
Chris@87 150           dtype=[(('title 1', 'col1'), '|i1'), (('title 2', 'col2'), '>f4')])
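The optional 'offsets' and 'titles' keys of the first dictionary form are not exercised in the examples above. The sketch below shows one way they might be used; the offset values and title strings are illustrative assumptions chosen for this example.

```python
import numpy as np

dt = np.dtype({
    'names':   ['col1', 'col2'],
    'formats': ['i4', 'f4'],
    # Illustrative choice: leave 4 unused bytes between the two fields.
    'offsets': [0, 8],
    'titles':  ['first column', 'second column'],
})

# dtype.fields maps each name to (dtype, byte offset, title).
col2_dtype, col2_offset, col2_title = dt.fields['col2']

# The record is as large as the last field's offset plus its size.
record_bytes = dt.itemsize
```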
Chris@87 151
Chris@87 152 Accessing and modifying field names
Chris@87 153 ===================================
Chris@87 154
Chris@87 155 The field names are an attribute of the dtype object defining the record structure.
Chris@87 156 For the last example: ::
Chris@87 157
Chris@87 158     >>> x.dtype.names
Chris@87 159     ('col1', 'col2')
Chris@87 160     >>> x.dtype.names = ('x', 'y')
Chris@87 161     >>> x
Chris@87 162     array([(0, 0.0), (0, 0.0), (0, 0.0)],
Chris@87 163           dtype=[(('title 1', 'x'), '|i1'), (('title 2', 'y'), '>f4')])
Chris@87 164     >>> x.dtype.names = ('x', 'y', 'z') # wrong number of names
Chris@87 165     <type 'exceptions.ValueError'>: must replace all names at once with a sequence of length 2
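The renaming behaviour can also be written as a script that catches the error instead of displaying it. This is a minimal sketch that assumes assignment to `dtype.names` is still permitted by the installed NumPy version.

```python
import numpy as np

x = np.zeros(3, dtype={'names': ['col1', 'col2'], 'formats': ['i4', 'f4']})

# All names must be replaced in a single assignment.
x.dtype.names = ('x', 'y')
renamed = x.dtype.names

# Supplying the wrong number of names is rejected.
try:
    x.dtype.names = ('x', 'y', 'z')
    error = None
except ValueError as exc:
    error = exc
```

After the failed assignment the dtype keeps the names from the last successful one.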
Chris@87 166
Chris@87 167 Accessing field titles
Chris@87 168 ======================
Chris@87 169
Chris@87 170 The field titles provide a standard place to put associated info for fields.
Chris@87 171 They do not have to be strings. ::
Chris@87 172
Chris@87 173     >>> x.dtype.fields['x'][2]
Chris@87 174     'title 1'
Chris@87 175
Chris@87 176 Accessing multiple fields at once
Chris@87 177 =================================
Chris@87 178
Chris@87 179 You can access multiple fields at once using a list of field names: ::
Chris@87 180
Chris@87 181     >>> x = np.array([(1.5,2.5,(1.0,2.0)),(3.,4.,(4.,5.)),(1.,3.,(2.,6.))],
Chris@87 182     ...         dtype=[('x','f4'),('y',np.float32),('value','f4',(2,2))])
Chris@87 183
Chris@87 184 Notice that `x` is created with a list of tuples. ::
Chris@87 185
Chris@87 186     >>> x[['x','y']]
Chris@87 187     array([(1.5, 2.5), (3.0, 4.0), (1.0, 3.0)],
Chris@87 188           dtype=[('x', '<f4'), ('y', '<f4')])
Chris@87 189     >>> x[['x','value']]
Chris@87 190     array([(1.5, [[1.0, 2.0], [1.0, 2.0]]), (3.0, [[4.0, 5.0], [4.0, 5.0]]),
Chris@87 191            (1.0, [[2.0, 6.0], [2.0, 6.0]])],
Chris@87 192           dtype=[('x', '<f4'), ('value', '<f4', (2, 2))])
Chris@87 193
Chris@87 194 The fields are returned in the order in which they are requested: ::
Chris@87 195
Chris@87 196     >>> x[['y','x']]
Chris@87 197     array([(2.5, 1.5), (4.0, 3.0), (3.0, 1.0)],
Chris@87 198           dtype=[('y', '<f4'), ('x', '<f4')])
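A small sketch of the ordering rule follows. It uses the same field names as above but writes the 2x2 sub-arrays out in full; the variable names are illustrative. Whether the result is a copy or a view differs between NumPy versions, so the sketch only reads from it.

```python
import numpy as np

x = np.array([(1.5, 2.5, [[1.0, 2.0], [1.0, 2.0]]),
              (3.0, 4.0, [[4.0, 5.0], [4.0, 5.0]]),
              (1.0, 3.0, [[2.0, 6.0], [2.0, 6.0]])],
             dtype=[('x', 'f4'), ('y', np.float32), ('value', 'f4', (2, 2))])

# Index with a list of names; fields come back in the requested order.
sub = x[['y', 'x']]
sub_names = sub.dtype.names

# Read the selected fields record by record.
sub_rows = [(float(r['y']), float(r['x'])) for r in sub]
```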
Chris@87 199
Chris@87 200 Filling structured arrays
Chris@87 201 =========================
Chris@87 202
Chris@87 203 Structured arrays can be filled by field or row by row. ::
Chris@87 204
Chris@87 205     >>> arr = np.zeros((5,), dtype=[('var1','f8'),('var2','f8')])
Chris@87 206     >>> arr['var1'] = np.arange(5)
Chris@87 207
Chris@87 208 If you fill it in row by row, each row must be given as a tuple
Chris@87 209 (but not a list or array!)::
Chris@87 210
Chris@87 211     >>> arr[0] = (10,20)
Chris@87 212     >>> arr
Chris@87 213     array([(10.0, 20.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)],
Chris@87 214           dtype=[('var1', '<f8'), ('var2', '<f8')])
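When row data arrives as a list, one workable approach is to convert it to a tuple before assignment. The sketch below combines this with the by-field fill shown above; the `row` variable is a hypothetical input introduced for illustration.

```python
import numpy as np

arr = np.zeros((5,), dtype=[('var1', 'f8'), ('var2', 'f8')])

# Fill one field for every record at once.
arr['var1'] = np.arange(5)

# Fill one record; the value is given as a tuple.
arr[0] = (10, 20)

# Hypothetical row held in a list: convert it to a tuple first.
row = [30, 40]
arr[1] = tuple(row)

var1 = arr['var1'].tolist()
var2 = arr['var2'].tolist()
```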
Chris@87 215
Chris@87 216 More information
Chris@87 217 ================
Chris@87 218 You can find some more information on recarrays and structured arrays
Chris@87 219 (including the difference between the two) `here
Chris@87 220 <http://www.scipy.org/Cookbook/Recarray>`_.
Chris@87 221
Chris@87 222 """
Chris@87 223 from __future__ import division, absolute_import, print_function