annotate DEPENDENCIES/mingw32/Python27/Lib/site-packages/numpy/doc/byteswapping.py @ 133:4acb5d8d80b6 tip

Don't fail environmental check if README.md exists (but .txt and no-suffix don't)
author Chris Cannam
date Tue, 30 Jul 2019 12:25:44 +0100
parents 2a2c65a20a8b
children
rev   line source
Chris@87 1 """
Chris@87 2
Chris@87 3 =============================
Chris@87 4 Byteswapping and byte order
Chris@87 5 =============================
Chris@87 6
Chris@87 7 Introduction to byte ordering and ndarrays
Chris@87 8 ==========================================
Chris@87 9
Chris@87 10 The ``ndarray`` is an object that provides a Python array interface to data
Chris@87 11 in memory.
Chris@87 12
Chris@87 13 It often happens that the memory that you want to view with an array is
Chris@87 14 not of the same byte ordering as the computer on which you are running
Chris@87 15 Python.
Chris@87 16
Chris@87 17 For example, I might be working on a computer with a little-endian CPU,
Chris@87 18 such as an Intel Pentium, but I have loaded some data from a file
Chris@87 19 written by a computer that is big-endian. Let's say I have loaded 4
Chris@87 20 bytes from a file written by a Sun (big-endian) computer. I know that
Chris@87 21 these 4 bytes represent two 16-bit integers. On a big-endian machine, a
Chris@87 22 two-byte integer is stored with the Most Significant Byte (MSB) first,
Chris@87 23 and then the Least Significant Byte (LSB). Thus the bytes are, in memory order:
Chris@87 24
Chris@87 25 #. MSB integer 1
Chris@87 26 #. LSB integer 1
Chris@87 27 #. MSB integer 2
Chris@87 28 #. LSB integer 2
Chris@87 29
Chris@87 30 Let's say the two integers were in fact 1 and 770. Because 770 = 256 *
Chris@87 31 3 + 2, the 4 bytes in memory would contain respectively: 0, 1, 3, 2.
Chris@87 32 The bytes I have loaded from the file would have these contents:
Chris@87 33
Chris@87 34 >>> big_end_str = chr(0) + chr(1) + chr(3) + chr(2)
Chris@87 35 >>> big_end_str
Chris@87 36 '\\x00\\x01\\x03\\x02'
Chris@87 37
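The arithmetic above can be checked without numpy. The following is only a
minimal sketch using the standard library ``struct`` module; the names
``raw``, ``first``, ``second``, ``unpacked`` and ``misread`` are illustrative
and not part of the original text, and it assumes Python 3, where ``bytes``
can be built from a list of integers.

```python
import struct

# The four bytes from the text, in memory order: MSB, LSB, MSB, LSB.
raw = bytes([0, 1, 3, 2])

# Manual decoding: in big-endian order the first byte of each pair is the MSB.
first = raw[0] * 256 + raw[1]
second = raw[2] * 256 + raw[3]

# The same decoding with struct: '>' means big-endian, 'h' a signed 16-bit int.
unpacked = struct.unpack('>2h', raw)

# Reading the very same bytes as little-endian gives different numbers.
misread = struct.unpack('<2h', raw)

print(first, second)  # 1 770
print(unpacked)       # (1, 770)
print(misread)        # (256, 515)
```

The last line shows why the byte order has to be known: the bytes are
unchanged, but the little-endian reading yields 256 and 515.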
Chris@87 38 We might want to use an ``ndarray`` to access these integers. In that
Chris@87 39 case, we can create an array around this memory, and tell numpy that
Chris@87 40 there are two integers, and that they are 16 bit and big-endian:
Chris@87 41
Chris@87 42 >>> import numpy as np
Chris@87 43 >>> big_end_arr = np.ndarray(shape=(2,), dtype='>i2', buffer=big_end_str)
Chris@87 44 >>> big_end_arr[0]
Chris@87 45 1
Chris@87 46 >>> big_end_arr[1]
Chris@87 47 770
Chris@87 48
Chris@87 49 Note the array ``dtype`` above of ``>i2``. The ``>`` means 'big-endian'
Chris@87 50 (``<`` is little-endian) and ``i2`` means 'signed 2-byte integer'. For
Chris@87 51 example, if our data represented a single unsigned 4-byte little-endian
Chris@87 52 integer, the dtype string would be ``<u4``.
Chris@87 53
Chris@87 54 In fact, why don't we try that?
Chris@87 55
Chris@87 56 >>> little_end_u4 = np.ndarray(shape=(1,), dtype='<u4', buffer=big_end_str)
Chris@87 57 >>> little_end_u4[0] == 1 * 256**1 + 3 * 256**2 + 2 * 256**3
Chris@87 58 True
Chris@87 59
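To make the dtype strings more concrete, here is a small sketch that
inspects the two dtypes used so far and spells out the ``<u4`` value. It
assumes Python 3 and uses ``np.frombuffer`` rather than the ``np.ndarray``
constructor shown above; the variable names are illustrative only.

```python
import numpy as np

big_i2 = np.dtype('>i2')     # big-endian, signed, 2 bytes
little_u4 = np.dtype('<u4')  # little-endian, unsigned, 4 bytes

# kind is 'i' for signed and 'u' for unsigned integers; itemsize is in bytes.
print(big_i2.kind, big_i2.itemsize)        # i 2
print(little_u4.kind, little_u4.itemsize)  # u 4

# The same four bytes as before, read as one little-endian unsigned integer:
# byte k contributes raw[k] * 256**k.
raw = bytes([0, 1, 3, 2])
value = int(np.frombuffer(raw, dtype=little_u4)[0])
expected = sum(byte * 256 ** k for k, byte in enumerate(raw))
print(value, expected)  # 33751296 33751296
```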
Chris@87 60 Returning to our ``big_end_arr``: in this case our underlying data is
Chris@87 61 big-endian (data endianness) and we've set the dtype to match (the dtype
Chris@87 62 is also big-endian). However, sometimes you need to flip these around.
Chris@87 63
Chris@87 64 Changing byte ordering
Chris@87 65 ======================
Chris@87 66
Chris@87 67 As you can imagine from the introduction, there are two ways you can
Chris@87 68 affect the relationship between the byte ordering of the array and the
Chris@87 69 underlying memory it is looking at:
Chris@87 70
Chris@87 71 * Change the byte-ordering information in the array dtype so that it
Chris@87 72 interprets the underlying data as being in a different byte order.
Chris@87 73 This is the role of ``arr.newbyteorder()``.
Chris@87 74 * Change the byte-ordering of the underlying data, leaving the dtype
Chris@87 75 interpretation as it was. This is what ``arr.byteswap()`` does.
Chris@87 76
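The two options can be put side by side in a short sketch. It assumes
Python 3, and it relabels the dtype with ``arr.view(arr.dtype.newbyteorder())``
instead of calling ``newbyteorder`` on the array, because as far as I know the
array-level method is not available in every numpy release, while the
dtype-level one is. The variable names are illustrative.

```python
import numpy as np

raw = bytes([0, 1, 3, 2])

# Big-endian data viewed with a little-endian dtype: the values come out wrong.
wrong = np.frombuffer(raw, dtype='<i2')
print(wrong.tolist())  # [256, 515]

# Option 1: change only the dtype. The bytes in memory stay as they are.
relabelled = wrong.view(wrong.dtype.newbyteorder())

# Option 2: change only the bytes. byteswap() returns a swapped copy and
# leaves the dtype as it was.
swapped = wrong.byteswap()

print(relabelled.tolist(), relabelled.tobytes() == raw)  # [1, 770] True
print(swapped.tolist(), swapped.tobytes() == raw)        # [1, 770] False
```

Both options give the values 1 and 770; they differ in whether the dtype or
the memory was changed to get there.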
Chris@87 77 The common situations in which you need to change byte ordering are:
Chris@87 78
Chris@87 79 #. Your data and dtype endianness don't match, and you want to change
Chris@87 80 the dtype so that it matches the data.
Chris@87 81 #. Your data and dtype endianness don't match, and you want to swap the
Chris@87 82 data so that they match the dtype.
Chris@87 83 #. Your data and dtype endianness match, but you want the data swapped
Chris@87 84 and the dtype to reflect this.
Chris@87 85
Chris@87 86 Data and dtype endianness don't match, change dtype to match data
Chris@87 87 -----------------------------------------------------------------
Chris@87 88
Chris@87 89 First we create an array whose dtype does not match the data:
Chris@87 90
Chris@87 91 >>> wrong_end_dtype_arr = np.ndarray(shape=(2,), dtype='<i2', buffer=big_end_str)
Chris@87 92 >>> wrong_end_dtype_arr[0]
Chris@87 93 256
Chris@87 94
Chris@87 95 The obvious fix for this situation is to change the dtype so it gives
Chris@87 96 the correct endianness:
Chris@87 97
Chris@87 98 >>> fixed_end_dtype_arr = wrong_end_dtype_arr.newbyteorder()
Chris@87 99 >>> fixed_end_dtype_arr[0]
Chris@87 100 1
Chris@87 101
Chris@87 102 Note that the array has not changed in memory:
Chris@87 103
Chris@87 104 >>> fixed_end_dtype_arr.tobytes() == big_end_str
Chris@87 105 True
Chris@87 106
Chris@87 107 Data and dtype endianness don't match, change data to match dtype
Chris@87 108 -----------------------------------------------------------------
Chris@87 109
Chris@87 110 You might want to do this if you need the data in memory to be a certain
Chris@87 111 ordering. For example you might be writing the memory out to a file
Chris@87 112 that needs a certain byte ordering.
Chris@87 113
Chris@87 114 >>> fixed_end_mem_arr = wrong_end_dtype_arr.byteswap()
Chris@87 115 >>> fixed_end_mem_arr[0]
Chris@87 116 1
Chris@87 117
Chris@87 118 Now the array *has* changed in memory:
Chris@87 119
Chris@87 120 >>> fixed_end_mem_arr.tobytes() == big_end_str
Chris@87 121 False
Chris@87 122
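As an illustration of the file-writing case, the sketch below prepares
big-endian bytes for a hypothetical file format that requires them. It
assumes Python 3, uses an in-memory ``io.BytesIO`` object in place of a real
file, and converts with ``astype``, which is described at the end of this
document; none of these names come from the original example.

```python
import io

import numpy as np

# Values held in whatever byte order this machine uses natively.
values = np.array([1, 770], dtype=np.int16)

# Convert to big-endian 16-bit integers before writing. The values are
# unchanged; only their layout in memory is fixed to big-endian.
big_endian = values.astype('>i2')

# Stand-in for a real file opened in binary mode.
out = io.BytesIO()
out.write(big_endian.tobytes())
written = out.getvalue()
print(list(written))  # [0, 1, 3, 2]

# Reading the bytes back with the matching dtype recovers the values.
read_back = np.frombuffer(written, dtype='>i2')
print(read_back.tolist())  # [1, 770]
```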
Chris@87 123 Data and dtype endianness match, swap data and dtype
Chris@87 124 ----------------------------------------------------
Chris@87 125
Chris@87 126 You may have a correctly specified array dtype, but you need the array
Chris@87 127 to have the opposite byte order in memory, and you want the dtype to
Chris@87 128 match so the array values make sense. In this case you just do both of
Chris@87 129 the previous operations:
Chris@87 130
Chris@87 131 >>> swapped_end_arr = big_end_arr.byteswap().newbyteorder()
Chris@87 132 >>> swapped_end_arr[0]
Chris@87 133 1
Chris@87 134 >>> swapped_end_arr.tobytes() == big_end_str
Chris@87 135 False
Chris@87 136
Chris@87 137 An easier way to cast the data to a specific dtype and byte ordering
Chris@87 138 is to use the ndarray ``astype`` method:
Chris@87 139
Chris@87 140 >>> swapped_end_arr = big_end_arr.astype('<i2')
Chris@87 141 >>> swapped_end_arr[0]
Chris@87 142 1
Chris@87 143 >>> swapped_end_arr.tobytes() == big_end_str
Chris@87 144 False
Chris@87 145
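The two routes in this section should give the same result, which the sketch
below checks. It also converts to the byte order of the current machine
without hard-coding ``<`` or ``>``. It assumes Python 3 and again relabels the
dtype through ``view`` and the dtype-level ``newbyteorder``; the variable
names are illustrative.

```python
import numpy as np

raw = bytes([0, 1, 3, 2])
big = np.frombuffer(raw, dtype='>i2')

# Route 1: swap the bytes, then relabel the dtype to match.
via_swap = big.byteswap().view(big.dtype.newbyteorder())

# Route 2: let astype do both steps at once.
via_astype = big.astype('<i2')

print(via_swap.tolist(), via_astype.tolist())        # [1, 770] [1, 770]
print(via_swap.tobytes() == via_astype.tobytes())    # True

# '=' asks for the native byte order of the machine running the code.
native = big.astype(big.dtype.newbyteorder('='))
print(native.tolist(), native.dtype.isnative)        # [1, 770] True
```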
Chris@87 146 """
Chris@87 147 from __future__ import division, absolute_import, print_function