comparison DEPENDENCIES/mingw32/Python27/Lib/site-packages/numpy/lib/format.py @ 87:2a2c65a20a8b

Add Python libs and headers
author Chris Cannam
date Wed, 25 Feb 2015 14:05:22 +0000
1 """
2 Define a simple format for saving numpy arrays to disk with the full
3 information about them.
4
5 The ``.npy`` format is the standard binary file format in NumPy for
6 persisting a *single* arbitrary NumPy array on disk. The format stores all
7 of the shape and dtype information necessary to reconstruct the array
8 correctly even on another machine with a different architecture.
9 The format is designed to be as simple as possible while achieving
10 its limited goals.
11
12 The ``.npz`` format is the standard format for persisting *multiple* NumPy
13 arrays on disk. A ``.npz`` file is a zip file containing multiple ``.npy``
14 files, one for each array.
15
16 Capabilities
17 ------------
18
19 - Can represent all NumPy arrays including nested record arrays and
20 object arrays.
21
22 - Represents the data in its native binary form.
23
24 - Supports Fortran-contiguous arrays directly.
25
26 - Stores all of the necessary information to reconstruct the array
27 including shape and dtype on a machine of a different
28 architecture. Both little-endian and big-endian arrays are
29 supported, and a file with little-endian numbers will yield
30 a little-endian array on any machine reading the file. The
31 types are described in terms of their actual sizes. For example,
32 if a machine with a 64-bit C "long int" writes out an array with
33 "long ints", a reading machine with 32-bit C "long ints" will yield
34 an array with 64-bit integers.
35
36 - Is straightforward to reverse engineer. Datasets often live longer than
37 the programs that created them. A competent developer should be
38 able to create a solution in their preferred programming language to
39 read most ``.npy`` files that they have been given without much
40 documentation.
41
42 - Allows memory-mapping of the data. See `open_memmap`.
43
44 - Can be read from a filelike stream object instead of an actual file.
45
46 - Stores object arrays, i.e. arrays containing elements that are arbitrary
47 Python objects. Files with object arrays cannot be memory-mapped, but
48 they can still be read from and written to disk.
49
50 Limitations
51 -----------
52
53 - Arbitrary subclasses of numpy.ndarray are not completely preserved.
54 Subclasses will be accepted for writing, but only the array data will
55 be written out. A regular numpy.ndarray object will be created
56 upon reading the file.
57
58 .. warning::
59
60 Due to limitations in the interpretation of structured dtypes, dtypes
61 with fields with empty names will have the names replaced by 'f0', 'f1',
62 etc. Such arrays will not round-trip through the format entirely
63 accurately. The data is intact; only the field names will differ. We are
64 working on a fix for this. This fix will not require a change in the
65 file format. The arrays with such structures can still be saved and
66 restored, and the correct dtype may be restored by using the
67 ``loadedarray.view(correct_dtype)`` method.
68
69 File extensions
70 ---------------
71
72 We recommend using the ``.npy`` and ``.npz`` extensions for files saved
73 in this format. This is by no means a requirement; applications may wish
74 to use these file formats but use an extension specific to the
75 application. In the absence of an obvious alternative, however,
76 we suggest using ``.npy`` and ``.npz``.
77
78 Version numbering
79 -----------------
80
81 The version numbering of these formats is independent of NumPy version
82 numbering. If the format is upgraded, the code in `numpy.lib.format` will still
83 be able to read and write Version 1.0 files.
84
85 Format Version 1.0
86 ------------------
87
88 The first 6 bytes are a magic string: exactly ``\\x93NUMPY``.
89
90 The next 1 byte is an unsigned byte: the major version number of the file
91 format, e.g. ``\\x01``.
92
93 The next 1 byte is an unsigned byte: the minor version number of the file
94 format, e.g. ``\\x00``. Note: the version of the file format is not tied
95 to the version of the numpy package.
96
97 The next 2 bytes form a little-endian unsigned short int: the length of
98 the header data HEADER_LEN.
99
100 The next HEADER_LEN bytes form the header data describing the array's
101 format. It is an ASCII string which contains a Python literal expression
102 of a dictionary. It is terminated by a newline (``\\n``) and padded with
103 spaces (``\\x20``) to make the total length of
104 ``magic string + 4 + HEADER_LEN`` be evenly divisible by 16 for alignment
105 purposes.
106
107 The dictionary contains three keys:
108
109 "descr" : dtype.descr
110 An object that can be passed as an argument to the `numpy.dtype`
111 constructor to create the array's dtype.
112 "fortran_order" : bool
113 Whether the array data is Fortran-contiguous or not. Since
114 Fortran-contiguous arrays are a common form of non-C-contiguity,
115 we allow them to be written directly to disk for efficiency.
116 "shape" : tuple of int
117 The shape of the array.
118
119 For repeatability and readability, the dictionary keys are sorted in
120 alphabetic order. This is for convenience only. A writer SHOULD implement
121 this if possible. A reader MUST NOT depend on this.
122
123 Following the header comes the array data. If the dtype contains Python
124 objects (i.e. ``dtype.hasobject is True``), then the data is a Python
125 pickle of the array. Otherwise the data is the contiguous (either C-
126 or Fortran-, depending on ``fortran_order``) bytes of the array.
127 Consumers can figure out the number of bytes by multiplying the number
128 of elements given by the shape (noting that ``shape=()`` means there is
129 1 element) by ``dtype.itemsize``.
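
For illustration, a minimal Version 1.0 reader could be written with nothing
but the standard library and ``numpy.frombuffer``. This is only a sketch: it
assumes Python 3, a non-object dtype, and a header free of Python 2 ``L``
literals::

    import ast
    import struct
    import numpy as np

    def sketch_read_npy_v1(fp):
        assert fp.read(6) == b'\\x93NUMPY'            # magic string
        major, minor = fp.read(1)[0], fp.read(1)[0]  # version bytes
        (header_len,) = struct.unpack('<H', fp.read(2))
        header = ast.literal_eval(fp.read(header_len).decode('latin1'))
        dtype = np.dtype(header['descr'])
        shape = header['shape']
        count = int(np.prod(shape)) if shape else 1
        data = np.frombuffer(fp.read(count * dtype.itemsize), dtype=dtype)
        order = 'F' if header['fortran_order'] else 'C'
        return data.reshape(shape, order=order)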
130
131 Notes
132 -----
133 The ``.npy`` format, including reasons for creating it and a comparison of
134 alternatives, is described fully in the "npy-format" NEP.
135
136 """
137 from __future__ import division, absolute_import, print_function
138
139 import numpy
140 import sys
141 import io
142 import warnings
143 from numpy.lib.utils import safe_eval
144 from numpy.compat import asbytes, asstr, isfileobj, long, basestring
145
146 if sys.version_info[0] >= 3:
147 import pickle
148 else:
149 import cPickle as pickle
150
151 MAGIC_PREFIX = asbytes('\x93NUMPY')
152 MAGIC_LEN = len(MAGIC_PREFIX) + 2
153 BUFFER_SIZE = 2**18 # size of buffer for reading npz files in bytes
154
155 # difference between version 1.0 and 2.0 is a 4 byte (I) header length
156 # instead of 2 bytes (H) allowing storage of large structured arrays
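# (Sketch: a 1.0 header length is written as struct.pack('<H', n), capping it
# at 2**16 - 1 bytes, whereas a 2.0 header length is written as
# struct.pack('<I', n), allowing up to 2**32 - 1 bytes; see
# _write_array_header below.)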
157
158 def _check_version(version):
159 if version not in [(1, 0), (2, 0), None]:
160 msg = "we only support format versions (1, 0) and (2, 0), not %s"
161 raise ValueError(msg % (version,))
162
163 def magic(major, minor):
164 """ Return the magic string for the given file format version.
165
166 Parameters
167 ----------
168 major : int in [0, 255]
169 minor : int in [0, 255]
170
171 Returns
172 -------
173 magic : str
174
175 Raises
176 ------
177 ValueError if the version cannot be formatted.
178 """
179 if major < 0 or major > 255:
180 raise ValueError("major version must be 0 <= major < 256")
181 if minor < 0 or minor > 255:
182 raise ValueError("minor version must be 0 <= minor < 256")
183 if sys.version_info[0] < 3:
184 return MAGIC_PREFIX + chr(major) + chr(minor)
185 else:
186 return MAGIC_PREFIX + bytes([major, minor])
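# Illustrative usage (assuming Python 3 bytes semantics):
#
#     >>> magic(1, 0)
#     b'\x93NUMPY\x01\x00'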
187
188 def read_magic(fp):
189 """ Read the magic string to get the version of the file format.
190
191 Parameters
192 ----------
193 fp : filelike object
194
195 Returns
196 -------
197 major : int
198 minor : int
199 """
200 magic_str = _read_bytes(fp, MAGIC_LEN, "magic string")
201 if magic_str[:-2] != MAGIC_PREFIX:
202 msg = "the magic string is not correct; expected %r, got %r"
203 raise ValueError(msg % (MAGIC_PREFIX, magic_str[:-2]))
204 if sys.version_info[0] < 3:
205 major, minor = map(ord, magic_str[-2:])
206 else:
207 major, minor = magic_str[-2:]
208 return major, minor
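# Illustrative usage sketch with an in-memory stream:
#
#     >>> import io
#     >>> read_magic(io.BytesIO(b'\x93NUMPY\x01\x00'))
#     (1, 0)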
209
210 def dtype_to_descr(dtype):
211 """
212 Get a serializable descriptor from the dtype.
213
214 The .descr attribute of a dtype object cannot be round-tripped through
215 the dtype() constructor. Simple types, like dtype('float32'), have
216 a descr which looks like a record array with one field with '' as
217 a name. The dtype() constructor interprets this as a request to give
218 a default name. Instead, we construct a descriptor that can be passed to
219 dtype().
220
221 Parameters
222 ----------
223 dtype : dtype
224 The dtype of the array that will be written to disk.
225
226 Returns
227 -------
228 descr : object
229 An object that can be passed to `numpy.dtype()` in order to
230 replicate the input dtype.
231
232 """
233 if dtype.names is not None:
234 # This is a record array. The .descr is fine. XXX: parts of the
235 # record array with an empty name, like padding bytes, still get
236 # fiddled with. This needs to be fixed in the C implementation of
237 # dtype().
238 return dtype.descr
239 else:
240 return dtype.str
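# Illustrative usage sketch:
#
#     >>> dtype_to_descr(numpy.dtype('<f4'))
#     '<f4'
#     >>> dtype_to_descr(numpy.dtype([('a', '<i4'), ('b', '<f8')]))
#     [('a', '<i4'), ('b', '<f8')]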
241
242 def header_data_from_array_1_0(array):
243 """ Get the dictionary of header metadata from a numpy.ndarray.
244
245 Parameters
246 ----------
247 array : numpy.ndarray
248
249 Returns
250 -------
251 d : dict
252 This has the appropriate entries for writing its string representation
253 to the header of the file.
254 """
255 d = {}
256 d['shape'] = array.shape
257 if array.flags.c_contiguous:
258 d['fortran_order'] = False
259 elif array.flags.f_contiguous:
260 d['fortran_order'] = True
261 else:
262 # Totally non-contiguous data. We will have to make it C-contiguous
263 # before writing. Note that we need to test for C_CONTIGUOUS first
264 # because a 1-D array is both C_CONTIGUOUS and F_CONTIGUOUS.
265 d['fortran_order'] = False
266
267 d['descr'] = dtype_to_descr(array.dtype)
268 return d
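# Illustrative usage sketch (dictionary key order may vary):
#
#     >>> header_data_from_array_1_0(numpy.zeros((3, 4), dtype='<f8'))
#     {'shape': (3, 4), 'fortran_order': False, 'descr': '<f8'}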
269
270 def _write_array_header(fp, d, version=None):
271 """ Write the header for an array and returns the version used
272
273 Parameters
274 ----------
275 fp : filelike object
276 d : dict
277 This has the appropriate entries for writing its string representation
278 to the header of the file.
279 version : tuple or None
280 None means use the oldest version that works; an explicit version
281 will raise a ValueError if the format does not allow saving this
282 data. Default: None.
283 Returns
284 -------
285 version : tuple of int
286 the file version which needs to be used to store the data
287 """
288 import struct
289 header = ["{"]
290 for key, value in sorted(d.items()):
291 # Need to use repr here, since we eval these when reading
292 header.append("'%s': %s, " % (key, repr(value)))
293 header.append("}")
294 header = "".join(header)
295 # Pad the header with spaces and a final newline such that the magic
296 # string, the header-length short and the header are aligned on a
297 # 16-byte boundary. Hopefully, some system, possibly memory-mapping,
298 # can take advantage of our premature optimization.
299 current_header_len = MAGIC_LEN + 2 + len(header) + 1 # 1 for the newline
300 topad = 16 - (current_header_len % 16)
301 header = header + ' '*topad + '\n'
302 header = asbytes(_filter_header(header))
303
304 if len(header) >= (256*256) and version == (1, 0):
305 raise ValueError("header does not fit inside %s bytes required by the"
306 " 1.0 format" % (256*256))
307 if len(header) < (256*256):
308 header_len_str = struct.pack('<H', len(header))
309 version = (1, 0)
310 elif len(header) < (2**32):
311 header_len_str = struct.pack('<I', len(header))
312 version = (2, 0)
313 else:
314 raise ValueError("header does not fit inside 4 GiB required by "
315 "the 2.0 format")
316
317 fp.write(magic(*version))
318 fp.write(header_len_str)
319 fp.write(header)
320 return version
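# Illustrative usage sketch with an in-memory stream; a small header fits the
# 1.0 format, and the whole header block is padded to a multiple of 16 bytes:
#
#     >>> import io
#     >>> buf = io.BytesIO()
#     >>> _write_array_header(buf, {'descr': '<i8', 'fortran_order': False,
#     ...                           'shape': (3,)})
#     (1, 0)
#     >>> len(buf.getvalue()) % 16
#     0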
321
322 def write_array_header_1_0(fp, d):
323 """ Write the header for an array using the 1.0 format.
324
325 Parameters
326 ----------
327 fp : filelike object
328 d : dict
329 This has the appropriate entries for writing its string
330 representation to the header of the file.
331 """
332 _write_array_header(fp, d, (1, 0))
333
334
335 def write_array_header_2_0(fp, d):
336 """ Write the header for an array using the 2.0 format.
337 The 2.0 format allows storing very large structured arrays.
338
339 .. versionadded:: 1.9.0
340
341 Parameters
342 ----------
343 fp : filelike object
344 d : dict
345 This has the appropriate entries for writing its string
346 representation to the header of the file.
347 """
348 _write_array_header(fp, d, (2, 0))
349
350 def read_array_header_1_0(fp):
351 """
352 Read an array header from a filelike object using the 1.0 file format
353 version.
354
355 This will leave the file object located just after the header.
356
357 Parameters
358 ----------
359 fp : filelike object
360 A file object or something with a `.read()` method like a file.
361
362 Returns
363 -------
364 shape : tuple of int
365 The shape of the array.
366 fortran_order : bool
367 The array data will be written out directly if it is either
368 C-contiguous or Fortran-contiguous. Otherwise, it will be made
369 contiguous before writing it out.
370 dtype : dtype
371 The dtype of the file's data.
372
373 Raises
374 ------
375 ValueError
376 If the data is invalid.
377
378 """
379 return _read_array_header(fp, version=(1, 0))
380
381 def read_array_header_2_0(fp):
382 """
383 Read an array header from a filelike object using the 2.0 file format
384 version.
385
386 This will leave the file object located just after the header.
387
388 .. versionadded:: 1.9.0
389
390 Parameters
391 ----------
392 fp : filelike object
393 A file object or something with a `.read()` method like a file.
394
395 Returns
396 -------
397 shape : tuple of int
398 The shape of the array.
399 fortran_order : bool
400 The array data will be written out directly if it is either
401 C-contiguous or Fortran-contiguous. Otherwise, it will be made
402 contiguous before writing it out.
403 dtype : dtype
404 The dtype of the file's data.
405
406 Raises
407 ------
408 ValueError
409 If the data is invalid.
410
411 """
412 return _read_array_header(fp, version=(2, 0))
413
414
415 def _filter_header(s):
416 """Clean up 'L' in npz header ints.
417
418 Cleans up the 'L' in strings representing integers. Needed to allow npz
419 headers produced in Python2 to be read in Python3.
420
421 Parameters
422 ----------
423 s : byte string
424 Npy file header.
425
426 Returns
427 -------
428 header : str
429 Cleaned up header.
430
431 """
432 import tokenize
433 if sys.version_info[0] >= 3:
434 from io import StringIO
435 else:
436 from StringIO import StringIO
437
438 tokens = []
439 last_token_was_number = False
440 for token in tokenize.generate_tokens(StringIO(asstr(s)).read):
441 token_type = token[0]
442 token_string = token[1]
443 if (last_token_was_number and
444 token_type == tokenize.NAME and
445 token_string == "L"):
446 continue
447 else:
448 tokens.append(token)
449 last_token_was_number = (token_type == tokenize.NUMBER)
450 return tokenize.untokenize(tokens)
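# Illustrative sketch (Python 3): a header produced by Python 2 may contain
# long literals such as 10L; after filtering, safe_eval can parse it.
#
#     >>> s = "{'descr': '<i8', 'fortran_order': False, 'shape': (10L, 3L), }"
#     >>> safe_eval(_filter_header(s))['shape']
#     (10, 3)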
451
452
453 def _read_array_header(fp, version):
454 """
455 see read_array_header_1_0
456 """
457 # Read an unsigned, little-endian short int which has the length of the
458 # header.
459 import struct
460 if version == (1, 0):
461 hlength_str = _read_bytes(fp, 2, "array header length")
462 header_length = struct.unpack('<H', hlength_str)[0]
463 header = _read_bytes(fp, header_length, "array header")
464 elif version == (2, 0):
465 hlength_str = _read_bytes(fp, 4, "array header length")
466 header_length = struct.unpack('<I', hlength_str)[0]
467 header = _read_bytes(fp, header_length, "array header")
468 else:
469 raise ValueError("Invalid version %r" % version)
470
471 # The header is a pretty-printed string representation of a literal
472 # Python dictionary with trailing newlines padded to a 16-byte
473 # boundary. The keys are strings.
474 # "shape" : tuple of int
475 # "fortran_order" : bool
476 # "descr" : dtype.descr
477 header = _filter_header(header)
478 try:
479 d = safe_eval(header)
480 except SyntaxError as e:
481 msg = "Cannot parse header: %r\nException: %r"
482 raise ValueError(msg % (header, e))
483 if not isinstance(d, dict):
484 msg = "Header is not a dictionary: %r"
485 raise ValueError(msg % d)
486 keys = sorted(d.keys())
487 if keys != ['descr', 'fortran_order', 'shape']:
488 msg = "Header does not contain the correct keys: %r"
489 raise ValueError(msg % (keys,))
490
491 # Sanity-check the values.
492 if (not isinstance(d['shape'], tuple) or
493 not numpy.all([isinstance(x, (int, long)) for x in d['shape']])):
494 msg = "shape is not valid: %r"
495 raise ValueError(msg % (d['shape'],))
496 if not isinstance(d['fortran_order'], bool):
497 msg = "fortran_order is not a valid bool: %r"
498 raise ValueError(msg % (d['fortran_order'],))
499 try:
500 dtype = numpy.dtype(d['descr'])
501 except TypeError as e:
502 msg = "descr is not a valid dtype descriptor: %r"
503 raise ValueError(msg % (d['descr'],))
504
505 return d['shape'], d['fortran_order'], dtype
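# Illustrative round-trip sketch using _write_array_header above; on a typical
# 64-bit platform '<i8' reads back as dtype('int64'):
#
#     >>> import io
#     >>> buf = io.BytesIO()
#     >>> _write_array_header(buf, {'descr': '<i8', 'fortran_order': False,
#     ...                           'shape': (3, 4)})
#     (1, 0)
#     >>> buf.seek(MAGIC_LEN)    # skip the magic string and version bytes
#     8
#     >>> _read_array_header(buf, version=(1, 0))
#     ((3, 4), False, dtype('int64'))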
506
507 def write_array(fp, array, version=None):
508 """
509 Write an array to an NPY file, including a header.
510
511 If the array is neither C-contiguous nor Fortran-contiguous AND the
512 file_like object is not a real file object, this function will have to
513 copy data in memory.
514
515 Parameters
516 ----------
517 fp : file_like object
518 An open, writable file object, or similar object with a
519 ``.write()`` method.
520 array : ndarray
521 The array to write to disk.
522 version : (int, int) or None, optional
523 The version number of the format. None means use the oldest
524 supported version that is able to store the data. Default: None
525
526 Raises
527 ------
528 ValueError
529 If the array cannot be persisted.
530 Various other errors
531 If the array contains Python objects as part of its dtype, the
532 process of pickling them may raise various errors if the objects
533 are not picklable.
534
535 """
536 _check_version(version)
537 used_ver = _write_array_header(fp, header_data_from_array_1_0(array),
538 version)
539 # this warning can be removed when 1.9 has aged enough
540 if version != (2, 0) and used_ver == (2, 0):
541 warnings.warn("Stored array in format 2.0. It can only be "
542 "read by NumPy >= 1.9", UserWarning)
543
544 # Set buffer size to 16 MiB to hide the Python loop overhead.
545 buffersize = max(16 * 1024 ** 2 // array.itemsize, 1)
546
547 if array.dtype.hasobject:
548 # We contain Python objects so we cannot write out the data
549 # directly. Instead, we will pickle it out with version 2 of the
550 # pickle protocol.
551 pickle.dump(array, fp, protocol=2)
552 elif array.flags.f_contiguous and not array.flags.c_contiguous:
553 if isfileobj(fp):
554 array.T.tofile(fp)
555 else:
556 for chunk in numpy.nditer(
557 array, flags=['external_loop', 'buffered', 'zerosize_ok'],
558 buffersize=buffersize, order='F'):
559 fp.write(chunk.tobytes('C'))
560 else:
561 if isfileobj(fp):
562 array.tofile(fp)
563 else:
564 for chunk in numpy.nditer(
565 array, flags=['external_loop', 'buffered', 'zerosize_ok'],
566 buffersize=buffersize, order='C'):
567 fp.write(chunk.tobytes('C'))
568
569
570 def read_array(fp):
571 """
572 Read an array from an NPY file.
573
574 Parameters
575 ----------
576 fp : file_like object
577 If this is not a real file object, then this may take extra memory
578 and time.
579
580 Returns
581 -------
582 array : ndarray
583 The array from the data on disk.
584
585 Raises
586 ------
587 ValueError
588 If the data is invalid.
589
590 """
591 version = read_magic(fp)
592 _check_version(version)
593 shape, fortran_order, dtype = _read_array_header(fp, version)
594 if len(shape) == 0:
595 count = 1
596 else:
597 count = numpy.multiply.reduce(shape)
598
599 # Now read the actual data.
600 if dtype.hasobject:
601 # The array contained Python objects. We need to unpickle the data.
602 array = pickle.load(fp)
603 else:
604 if isfileobj(fp):
605 # We can use the fast fromfile() function.
606 array = numpy.fromfile(fp, dtype=dtype, count=count)
607 else:
608 # This is not a real file. We have to read it the
609 # memory-intensive way.
610 # crc32 module fails on reads greater than 2 ** 32 bytes,
611 # breaking large reads from gzip streams. Chunk reads to
612 # BUFFER_SIZE bytes to avoid issue and reduce memory overhead
613 # of the read. In non-chunked case count < max_read_count, so
614 # only one read is performed.
615
616 max_read_count = BUFFER_SIZE // min(BUFFER_SIZE, dtype.itemsize)
617
618 array = numpy.empty(count, dtype=dtype)
619 for i in range(0, count, max_read_count):
620 read_count = min(max_read_count, count - i)
621 read_size = int(read_count * dtype.itemsize)
622 data = _read_bytes(fp, read_size, "array data")
623 array[i:i+read_count] = numpy.frombuffer(data, dtype=dtype,
624 count=read_count)
625
626 if fortran_order:
627 array.shape = shape[::-1]
628 array = array.transpose()
629 else:
630 array.shape = shape
631
632 return array
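# Illustrative round-trip sketch with an in-memory stream:
#
#     >>> import io
#     >>> buf = io.BytesIO()
#     >>> write_array(buf, numpy.arange(6).reshape(2, 3))
#     >>> buf.seek(0)
#     0
#     >>> read_array(buf)
#     array([[0, 1, 2],
#            [3, 4, 5]])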
633
634
635 def open_memmap(filename, mode='r+', dtype=None, shape=None,
636 fortran_order=False, version=None):
637 """
638 Open a .npy file as a memory-mapped array.
639
640 This may be used to read an existing file or create a new one.
641
642 Parameters
643 ----------
644 filename : str
645 The name of the file on disk. This may *not* be a file-like
646 object.
647 mode : str, optional
648 The mode in which to open the file; the default is 'r+'. In
649 addition to the standard file modes, 'c' is also accepted to mean
650 "copy on write." See `memmap` for the available mode strings.
651 dtype : data-type, optional
652 The data type of the array if we are creating a new file in "write"
653 mode; otherwise, `dtype` is ignored. The default value is None, which
654 results in a data-type of `float64`.
655 shape : tuple of int
656 The shape of the array if we are creating a new file in "write"
657 mode, in which case this parameter is required. Otherwise, this
658 parameter is ignored and is thus optional.
659 fortran_order : bool, optional
660 Whether the array should be Fortran-contiguous (True) or
661 C-contiguous (False, the default) if we are creating a new file in
662 "write" mode.
663 version : tuple of int (major, minor) or None
664 If the mode is a "write" mode, then this is the version of the file
665 format used to create the file. None means use the oldest
666 supported version that is able to store the data. Default: None
667
668 Returns
669 -------
670 marray : memmap
671 The memory-mapped array.
672
673 Raises
674 ------
675 ValueError
676 If the data or the mode is invalid.
677 IOError
678 If the file is not found or cannot be opened correctly.
679
680 See Also
681 --------
682 memmap
683
684 """
685 if not isinstance(filename, basestring):
686 raise ValueError("Filename must be a string. Memmap cannot use"
687 " existing file handles.")
688
689 if 'w' in mode:
690 # We are creating the file, not reading it.
691 # Check if we ought to create the file.
692 _check_version(version)
693 # Ensure that the given dtype is an authentic dtype object rather
694 # than just something that can be interpreted as a dtype object.
695 dtype = numpy.dtype(dtype)
696 if dtype.hasobject:
697 msg = "Array can't be memory-mapped: Python objects in dtype."
698 raise ValueError(msg)
699 d = dict(
700 descr=dtype_to_descr(dtype),
701 fortran_order=fortran_order,
702 shape=shape,
703 )
704 # If we got here, then it should be safe to create the file.
705 fp = open(filename, mode+'b')
706 try:
707 used_ver = _write_array_header(fp, d, version)
708 # this warning can be removed when 1.9 has aged enough
709 if version != (2, 0) and used_ver == (2, 0):
710 warnings.warn("Stored array in format 2.0. It can only be "
711 "read by NumPy >= 1.9", UserWarning)
712 offset = fp.tell()
713 finally:
714 fp.close()
715 else:
716 # Read the header of the file first.
717 fp = open(filename, 'rb')
718 try:
719 version = read_magic(fp)
720 _check_version(version)
721
722 shape, fortran_order, dtype = _read_array_header(fp, version)
723 if dtype.hasobject:
724 msg = "Array can't be memory-mapped: Python objects in dtype."
725 raise ValueError(msg)
726 offset = fp.tell()
727 finally:
728 fp.close()
729
730 if fortran_order:
731 order = 'F'
732 else:
733 order = 'C'
734
735 # We need to change a write-only mode to a read-write mode since we've
736 # already written data to the file.
737 if mode == 'w+':
738 mode = 'r+'
739
740 marray = numpy.memmap(filename, dtype=dtype, shape=shape, order=order,
741 mode=mode, offset=offset)
742
743 return marray
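# Illustrative usage sketch; 'example.npy' is a hypothetical file name created
# in the current directory:
#
#     >>> fp = open_memmap('example.npy', mode='w+', dtype='<f8',
#     ...                   shape=(1000, 10))
#     >>> fp[0, :] = 1.0          # modifications are backed by the file
#     >>> fp.flush()              # push pending changes to disk
#     >>> open_memmap('example.npy', mode='r').shape
#     (1000, 10)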
744
745
746 def _read_bytes(fp, size, error_template="ran out of data"):
747 """
748 Read from a file-like object until size bytes are read.
749 Raises ValueError if EOF is encountered before size bytes are read.
750 Non-blocking objects are only supported if they derive from io objects.
751
752 Required as e.g. ZipExtFile in python 2.6 can return less data than
753 requested.
754 """
755 data = bytes()
756 while True:
757 # io files (default in python3) return None or raise on
758 # would-block, python2 file will truncate, probably nothing can be
759 # done about that. note that regular files can't be non-blocking
760 try:
761 r = fp.read(size - len(data))
762 data += r
763 if len(r) == 0 or len(data) == size:
764 break
765 except io.BlockingIOError:
766 pass
767 if len(data) != size:
768 msg = "EOF: reading %s, expected %d bytes got %d"
769 raise ValueError(msg % (error_template, size, len(data)))
770 else:
771 return data
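# Illustrative sketch: _read_bytes insists on exactly `size` bytes, so a
# truncated stream raises ValueError instead of returning short data.
#
#     >>> import io
#     >>> _read_bytes(io.BytesIO(b'abcdef'), 4, "example data")
#     b'abcd'
#     >>> _read_bytes(io.BytesIO(b'ab'), 4, "example data")
#     Traceback (most recent call last):
#         ...
#     ValueError: EOF: reading example data, expected 4 bytes got 2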