Network Working Group                                         P. Deutsch
Request for Comments: 1952                           Aladdin Enterprises
Category: Informational                                         May 1996


               GZIP file format specification version 4.3

Status of This Memo

   This memo provides information for the Internet community. This memo
   does not specify an Internet standard of any kind. Distribution of
   this memo is unlimited.

IESG Note:

   The IESG takes no position on the validity of any Intellectual
   Property Rights statements contained in this document.

Notices

   Copyright (c) 1996 L. Peter Deutsch

   Permission is granted to copy and distribute this document for any
   purpose and without charge, including translations into other
   languages and incorporation into compilations, provided that the
   copyright notice and this notice are preserved, and that any
   substantive changes or deletions from the original are clearly
   marked.

   A pointer to the latest version of this and related documentation in
   HTML format can be found at the URL
   .

Abstract

   This specification defines a lossless compressed data format that is
   compatible with the widely used GZIP utility. The format includes a
   cyclic redundancy check value for detecting data corruption. The
   format presently uses the DEFLATE method of compression but can be
   easily extended to use other compression methods. The format can be
   implemented readily in a manner not covered by patents.

Deutsch                      Informational                      [Page 1]

RFC 1952             GZIP File Format Specification             May 1996


Table of Contents

   1. Introduction
      1.1. Purpose
      1.2. Intended audience
      1.3. Scope
      1.4. Compliance
      1.5. Definitions of terms and conventions used
      1.6. Changes from previous versions
   2. Detailed specification
      2.1. Overall conventions
      2.2. File format
      2.3. Member format
         2.3.1. Member header and trailer
            2.3.1.1. Extra field
            2.3.1.2. Compliance
   3. References
   4. Security Considerations
   5. Acknowledgements
   6. Author's Address
   7. Appendix: Jean-Loup Gailly's gzip utility
   8. Appendix: Sample CRC Code

1. Introduction

   1.1. Purpose

      The purpose of this specification is to define a lossless
      compressed data format that:

          * Is independent of CPU type, operating system, file system,
            and character set, and hence can be used for interchange;
          * Can compress or decompress a data stream (as opposed to a
            randomly accessible file) to produce another data stream,
            using only an a priori bounded amount of intermediate
            storage, and hence can be used in data communications or
            similar structures such as Unix filters;
          * Compresses data with efficiency comparable to the best
            currently available general-purpose compression methods,
            and in particular considerably better than the "compress"
            program;
          * Can be implemented readily in a manner not covered by
            patents, and hence can be practiced freely;
          * Is compatible with the file format produced by the current
            widely used gzip utility, in that conforming decompressors
            will be able to read data produced by the existing gzip
            compressor.

      The data format defined by this specification does not attempt to:

          * Provide random access to compressed data;
          * Compress specialized data (e.g., raster graphics) as well as
            the best currently available specialized algorithms.

   1.2. Intended audience

      This specification is intended for use by implementors of software
      to compress data into gzip format and/or decompress data from gzip
      format.

      The text of the specification assumes a basic background in
      programming at the level of bits and other primitive data
      representations.

   1.3. Scope

      The specification specifies a compression method and a file format
      (the latter assuming only that a file can store a sequence of
      arbitrary bytes). It does not specify any particular interface to
      a file system or anything about character sets or encodings
      (except for file names and comments, which are optional).

   1.4. Compliance

      Unless otherwise indicated below, a compliant decompressor must be
      able to accept and decompress any file that conforms to all the
      specifications presented here; a compliant compressor must produce
      files that conform to all the specifications presented here. The
      material in the appendices is not part of the specification per se
      and is not relevant to compliance.

   1.5. Definitions of terms and conventions used

      byte: 8 bits stored or transmitted as a unit (same as an octet).
      (For this specification, a byte is exactly 8 bits, even on
      machines which store a character on a number of bits different
      from 8.) See below for the numbering of bits within a byte.

   1.6. Changes from previous versions

      There have been no technical changes to the gzip format since
      version 4.1 of this specification. In version 4.2, some
      terminology was changed, and the sample CRC code was rewritten for
      clarity and to eliminate the requirement for the caller to do pre-
      and post-conditioning. Version 4.3 is a conversion of the
      specification to RFC style.

2. Detailed specification

   2.1. Overall conventions

      In the diagrams below, a box like this:

         +---+
         |   | <-- the vertical bars might be missing
         +---+

      represents one byte; a box like this:

         +==============+
         |              |
         +==============+

      represents a variable number of bytes.

      Bytes stored within a computer do not have a "bit order", since
      they are always treated as a unit. However, a byte considered as
      an integer between 0 and 255 does have a most- and least-
      significant bit, and since we write numbers with the most-
      significant digit on the left, we also write bytes with the most-
      significant bit on the left. In the diagrams below, we number the
      bits of a byte so that bit 0 is the least-significant bit, i.e.,
      the bits are numbered:

         +--------+
         |76543210|
         +--------+

      This document does not address the issue of the order in which
      bits of a byte are transmitted on a bit-sequential medium, since
      the data format described here is byte- rather than bit-oriented.

      Within a computer, a number may occupy multiple bytes. All
      multi-byte numbers in the format described here are stored with
      the least-significant byte first (at the lower memory address).
      For example, the decimal number 520 is stored as:

             0        1
         +--------+--------+
         |00001000|00000010|
         +--------+--------+
          ^        ^
          |        |
          |        + more significant byte = 2 x 256
          + less significant byte = 8

   2.2. File format

      A gzip file consists of a series of "members" (compressed data
      sets). The format of each member is specified in the following
      section. The members simply appear one after another in the file,
      with no additional information before, between, or after them.

   2.3. Member format

      Each member has the following structure:

         +---+---+---+---+---+---+---+---+---+---+
         |ID1|ID2|CM |FLG|     MTIME     |XFL|OS | (more-->)
         +---+---+---+---+---+---+---+---+---+---+

      (if FLG.FEXTRA set)

         +---+---+=================================+
         | XLEN  |...XLEN bytes of "extra field"...| (more-->)
         +---+---+=================================+

      (if FLG.FNAME set)

         +=========================================+
         |...original file name, zero-terminated...| (more-->)
         +=========================================+

      (if FLG.FCOMMENT set)

         +===================================+
         |...file comment, zero-terminated...| (more-->)
         +===================================+

      (if FLG.FHCRC set)

         +---+---+
         | CRC16 |
         +---+---+

         +=======================+
         |...compressed blocks...| (more-->)
         +=======================+

           0   1   2   3   4   5   6   7
         +---+---+---+---+---+---+---+---+
         |     CRC32     |     ISIZE     |
         +---+---+---+---+---+---+---+---+

      2.3.1. Member header and trailer

         ID1 (IDentification 1)
         ID2 (IDentification 2)
            These have the fixed values ID1 = 31 (0x1f, \037), ID2 = 139
            (0x8b, \213), to identify the file as being in gzip format.

         CM (Compression Method)
            This identifies the compression method used in the file. CM
            = 0-7 are reserved. CM = 8 denotes the "deflate"
            compression method, which is the one customarily used by
            gzip and which is documented elsewhere.

         FLG (FLaGs)
            This flag byte is divided into individual bits as follows:

               bit 0   FTEXT
               bit 1   FHCRC
               bit 2   FEXTRA
               bit 3   FNAME
               bit 4   FCOMMENT
               bit 5   reserved
               bit 6   reserved
               bit 7   reserved

            If FTEXT is set, the file is probably ASCII text. This is
            an optional indication, which the compressor may set by
            checking a small amount of the input data to see whether any
            non-ASCII characters are present. In case of doubt, FTEXT
            is cleared, indicating binary data. For systems which have
            different file formats for ASCII text and binary data, the
            decompressor can use FTEXT to choose the appropriate format.
            We deliberately do not specify the algorithm used to set
            this bit, since a compressor always has the option of
            leaving it cleared and a decompressor always has the option
            of ignoring it and letting some other program handle issues
            of data conversion.

            If FHCRC is set, a CRC16 for the gzip header is present,
            immediately before the compressed data. The CRC16 consists
            of the two least significant bytes of the CRC32 for all
            bytes of the gzip header up to and not including the CRC16.
            [The FHCRC bit was never set by versions of gzip up to
            1.2.4, even though it was documented with a different
            meaning in gzip 1.2.4.]

            If FEXTRA is set, optional extra fields are present, as
            described in a following section.

            If FNAME is set, an original file name is present,
            terminated by a zero byte. The name must consist of ISO
            8859-1 (LATIN-1) characters; on operating systems using
            EBCDIC or any other character set for file names, the name
            must be translated to the ISO LATIN-1 character set. This
            is the original name of the file being compressed, with any
            directory components removed, and, if the file being
            compressed is on a file system with case insensitive names,
            forced to lower case. There is no original file name if the
            data was compressed from a source other than a named file;
            for example, if the source was stdin on a Unix system, there
            is no file name.

            If FCOMMENT is set, a zero-terminated file comment is
            present.
            This comment is not interpreted; it is only
            intended for human consumption. The comment must consist of
            ISO 8859-1 (LATIN-1) characters. Line breaks should be
            denoted by a single line feed character (10 decimal).

            Reserved FLG bits must be zero.

         MTIME (Modification TIME)
            This gives the most recent modification time of the original
            file being compressed. The time is in Unix format, i.e.,
            seconds since 00:00:00 GMT, Jan. 1, 1970. (Note that this
            may cause problems for MS-DOS and other systems that use
            local rather than Universal time.) If the compressed data
            did not come from a file, MTIME is set to the time at which
            compression started. MTIME = 0 means no time stamp is
            available.

         XFL (eXtra FLags)
            These flags are available for use by specific compression
            methods. The "deflate" method (CM = 8) sets these flags as
            follows:

               XFL = 2 - compressor used maximum compression,
                         slowest algorithm
               XFL = 4 - compressor used fastest algorithm

         OS (Operating System)
            This identifies the type of file system on which compression
            took place. This may be useful in determining end-of-line
            convention for text files.
            The currently defined values are
            as follows:

                 0 - FAT filesystem (MS-DOS, OS/2, NT/Win32)
                 1 - Amiga
                 2 - VMS (or OpenVMS)
                 3 - Unix
                 4 - VM/CMS
                 5 - Atari TOS
                 6 - HPFS filesystem (OS/2, NT)
                 7 - Macintosh
                 8 - Z-System
                 9 - CP/M
                10 - TOPS-20
                11 - NTFS filesystem (NT)
                12 - QDOS
                13 - Acorn RISCOS
               255 - unknown

         XLEN (eXtra LENgth)
            If FLG.FEXTRA is set, this gives the length of the optional
            extra field. See below for details.

         CRC32 (CRC-32)
            This contains a Cyclic Redundancy Check value of the
            uncompressed data computed according to the CRC-32 algorithm
            used in the ISO 3309 standard and in section 8.1.1.6.2 of
            ITU-T recommendation V.42. (See http://www.iso.ch for
            ordering ISO documents. See gopher://info.itu.ch for an
            online version of ITU-T V.42.)

         ISIZE (Input SIZE)
            This contains the size of the original (uncompressed) input
            data modulo 2^32.

      2.3.1.1. Extra field

         If the FLG.FEXTRA bit is set, an "extra field" is present in
         the header, with total length XLEN bytes. It consists of a
         series of subfields, each of the form:

            +---+---+---+---+==================================+
            |SI1|SI2|  LEN  |... LEN bytes of subfield data ...|
            +---+---+---+---+==================================+

         SI1 and SI2 provide a subfield ID, typically two ASCII letters
         with some mnemonic value. Jean-Loup Gailly
         is maintaining a registry of subfield
         IDs; please send him any subfield ID you wish to use. Subfield
         IDs with SI2 = 0 are reserved for future use. The following
         IDs are currently defined:

            SI1         SI2         Data
            ----------  ----------  ----
            0x41 ('A')  0x70 ('p')  Apollo file type information

         LEN gives the length of the subfield data, excluding the 4
         initial bytes.

      2.3.1.2. Compliance

         A compliant compressor must produce files with correct ID1,
         ID2, CM, CRC32, and ISIZE, but may set all the other fields in
         the fixed-length part of the header to default values (255 for
         OS, 0 for all others). The compressor must set all reserved
         bits to zero.

         A compliant decompressor must check ID1, ID2, and CM, and
         provide an error indication if any of these have incorrect
         values. It must examine FEXTRA/XLEN, FNAME, FCOMMENT and FHCRC
         at least so it can skip over the optional fields if they are
         present. It need not examine any other part of the header or
         trailer; in particular, a decompressor may ignore FTEXT and OS
         and always produce binary output, and still be compliant.
         A compliant decompressor must give an error indication if any
         reserved bit is non-zero, since such a bit could indicate the
         presence of a new field that would cause subsequent data to be
         interpreted incorrectly.

3. References

   [1] "Information Processing - 8-bit single-byte coded graphic
       character sets - Part 1: Latin alphabet No.1" (ISO 8859-1:1987).
       The ISO 8859-1 (Latin-1) character set is a superset of 7-bit
       ASCII. Files defining this character set are available as
       iso_8859-1.* in ftp://ftp.uu.net/graphics/png/documents/

   [2] ISO 3309

   [3] ITU-T recommendation V.42

   [4] Deutsch, L.P., "DEFLATE Compressed Data Format Specification",
       available in ftp://ftp.uu.net/pub/archiving/zip/doc/

   [5] Gailly, J.-L., GZIP documentation, available as gzip-*.tar in
       ftp://prep.ai.mit.edu/pub/gnu/

   [6] Sarwate, D.V., "Computation of Cyclic Redundancy Checks via Table
       Look-Up", Communications of the ACM, 31(8), pp.1008-1013.

   [7] Schwaderer, W.D., "CRC Calculation", April 85 PC Tech Journal,
       pp.118-133.

   [8] ftp://ftp.adelaide.edu.au/pub/rocksoft/papers/crc_v3.txt,
       describing the CRC concept.

4. Security Considerations

   Any data compression method involves the reduction of redundancy in
   the data. Consequently, any corruption of the data is likely to have
   severe effects and be difficult to correct.
   Uncompressed text, on
   the other hand, will probably still be readable despite the presence
   of some corrupted bytes.

   It is recommended that systems using this data format provide some
   means of validating the integrity of the compressed data, such as by
   setting and checking the CRC-32 check value.

5. Acknowledgements

   Trademarks cited in this document are the property of their
   respective owners.

   Jean-Loup Gailly designed the gzip format and wrote, with Mark Adler,
   the related software described in this specification. Glenn
   Randers-Pehrson converted this document to RFC and HTML format.

6. Author's Address

   L. Peter Deutsch
   Aladdin Enterprises
   203 Santa Margarita Ave.
   Menlo Park, CA 94025

   Phone: (415) 322-0103 (AM only)
   FAX:   (415) 322-1734
   EMail:

   Questions about the technical content of this specification can be
   sent by email to:

   Jean-Loup Gailly and
   Mark Adler

   Editorial comments on this specification can be sent by email to:

   L. Peter Deutsch and
   Glenn Randers-Pehrson

7. Appendix: Jean-Loup Gailly's gzip utility

   The most widely used implementation of gzip compression, and the
   original documentation on which this specification is based, were
   created by Jean-Loup Gailly. Since this
   implementation is a de facto standard, we mention some more of its
   features here.
   Again, the material in this section is not part of
   the specification per se, and implementations need not follow it to
   be compliant.

   When compressing or decompressing a file, gzip preserves the
   protection, ownership, and modification time attributes on the local
   file system, since there is no provision for representing protection
   attributes in the gzip file format itself. Since the file format
   includes a modification time, the gzip decompressor provides a
   command line switch that assigns the modification time from the file,
   rather than the local modification time of the compressed input, to
   the decompressed output.

8. Appendix: Sample CRC Code

   The following sample code represents a practical implementation of
   the CRC (Cyclic Redundancy Check). (See also ISO 3309 and ITU-T V.42
   for a formal specification.)

   The sample code is in the ANSI C programming language. Non-C users
   may find it easier to read with these hints:

      &      Bitwise AND operator.
      ^      Bitwise exclusive-OR operator.
      >>     Bitwise right shift operator. When applied to an
             unsigned quantity, as here, right shift inserts zero
             bit(s) at the left.
      !      Logical NOT operator.
      ++     "n++" increments the variable n.
      0xNNN  0x introduces a hexadecimal (base 16) constant.
             Suffix L indicates a long value (at least 32 bits).

      /* Table of CRCs of all 8-bit messages. */
      unsigned long crc_table[256];

      /* Flag: has the table been computed? Initially false. */
      int crc_table_computed = 0;

      /* Make the table for a fast CRC. */
      void make_crc_table(void)
      {
        unsigned long c;
        int n, k;

        for (n = 0; n < 256; n++) {
          c = (unsigned long) n;
          for (k = 0; k < 8; k++) {
            if (c & 1) {
              c = 0xedb88320L ^ (c >> 1);
            } else {
              c = c >> 1;
            }
          }
          crc_table[n] = c;
        }
        crc_table_computed = 1;
      }

      /*
         Update a running crc with the bytes buf[0..len-1] and return
       the updated crc. The crc should be initialized to zero. Pre- and
       post-conditioning (one's complement) is performed within this
       function so it shouldn't be done by the caller. Usage example:

         unsigned long crc = 0L;

         while (read_buffer(buffer, length) != EOF) {
           crc = update_crc(crc, buffer, length);
         }
         if (crc != original_crc) error();
      */
      unsigned long update_crc(unsigned long crc,
                      unsigned char *buf, int len)
      {
        unsigned long c = crc ^ 0xffffffffL;
        int n;

        if (!crc_table_computed)
          make_crc_table();
        for (n = 0; n < len; n++) {
          c = crc_table[(c ^ buf[n]) & 0xff] ^ (c >> 8);
        }
        return c ^ 0xffffffffL;
      }

      /* Return the CRC of the bytes buf[0..len-1]. */
      unsigned long crc(unsigned char *buf, int len)
      {
        return update_crc(0L, buf, len);
      }
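
   As a further non-normative illustration, the header rules of
   sections 2.1, 2.3 and 2.3.1.2 can be sketched in C as well. The
   sketch below is not part of the specification or of the gzip
   utility; the function names (le16, le32, skip_zstring, gzip_mtime,
   gzip_header_size) are hypothetical and chosen only for this example.
   It assumes the whole header is already in memory, checks ID1, ID2,
   CM and the reserved FLG bits, and skips the optional fields in the
   order given by the member format diagram. It does not verify the
   optional CRC16.

```c
#include <stddef.h>

/* FLG bits from section 2.3.1. All function names below are
   hypothetical; they are not defined by the specification. */
#define FHCRC     0x02
#define FEXTRA    0x04
#define FNAME     0x08
#define FCOMMENT  0x10
#define FRESERVED 0xe0

/* Read a 2-byte number stored least-significant byte first. */
static unsigned le16(const unsigned char *p)
{
  return (unsigned) p[0] | ((unsigned) p[1] << 8);
}

/* Read a 4-byte number stored least-significant byte first. */
static unsigned long le32(const unsigned char *p)
{
  return (unsigned long) p[0] | ((unsigned long) p[1] << 8) |
         ((unsigned long) p[2] << 16) | ((unsigned long) p[3] << 24);
}

/* Skip a zero-terminated string starting at buf[pos]. Returns the
   offset just past the terminator, or -1 if none is found. */
static long skip_zstring(const unsigned char *buf, size_t len, size_t pos)
{
  while (pos < len) {
    if (buf[pos++] == 0)
      return (long) pos;
  }
  return -1;
}

/* MTIME occupies bytes 4..7 of the fixed header. The caller must
   supply at least 10 bytes. */
unsigned long gzip_mtime(const unsigned char *buf)
{
  return le32(buf + 4);
}

/* Return the offset of the compressed blocks within buf[0..len-1],
   or -1 if the header is invalid or truncated. */
long gzip_header_size(const unsigned char *buf, size_t len)
{
  size_t pos = 10;              /* ID1 ID2 CM FLG MTIME(4) XFL OS */
  unsigned flg;
  long next;

  if (len < 10) return -1;
  if (buf[0] != 0x1f || buf[1] != 0x8b) return -1;   /* ID1, ID2 */
  if (buf[2] != 8) return -1;                        /* CM = deflate */
  flg = buf[3];
  if (flg & FRESERVED) return -1;  /* reserved bits must be zero */

  if (flg & FEXTRA) {
    unsigned xlen;
    if (len - pos < 2) return -1;
    xlen = le16(buf + pos);
    pos += 2;
    if (len - pos < xlen) return -1;
    pos += xlen;
  }
  if (flg & FNAME) {
    next = skip_zstring(buf, len, pos);
    if (next < 0) return -1;
    pos = (size_t) next;
  }
  if (flg & FCOMMENT) {
    next = skip_zstring(buf, len, pos);
    if (next < 0) return -1;
    pos = (size_t) next;
  }
  if (flg & FHCRC) {
    if (len - pos < 2) return -1;
    pos += 2;                   /* CRC16 is skipped, not verified */
  }
  return (long) pos;
}
```

   A caller would pass the first bytes of a member to
   gzip_header_size() and, on a non-negative result, hand the data
   from that offset onward to a DEFLATE decoder. The 8 bytes that
   follow the compressed blocks (CRC32, then ISIZE) can be read with
   le32() in the same way.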