zlib Usage Example


We often get questions about how the deflate() and inflate() functions should be used. Users wonder when they should provide more input, when they should use more output, what to do with a Z_BUF_ERROR, how to make sure the process terminates properly, and so on. So for those who have read zlib.h (a few times), and would like further edification, below is an annotated example in C of simple routines to compress and decompress from an input file to an output file using deflate() and inflate() respectively. The annotations are interspersed between lines of the code. So please read between the lines. We hope this helps explain some of the intricacies of zlib.

Without further ado, here is the program zpipe.c:


/* zpipe.c: example of proper use of zlib's inflate() and deflate()
   Not copyrighted -- provided to the public domain
   Version 1.4  11 December 2005  Mark Adler */

/* Version history:
   1.0  30 Oct 2004  First version
   1.1   8 Nov 2004  Add void casting for unused return values
                     Use switch statement for inflate() return values
   1.2   9 Nov 2004  Add assertions to document zlib guarantees
   1.3   6 Apr 2005  Remove incorrect assertion in inf()
   1.4  11 Dec 2005  Add hack to avoid MSDOS end-of-line conversions
                     Avoid some compiler warnings for input and output buffers
 */

We now include the header files for the required definitions. From stdio.h we use fopen(), fread(), fwrite(), feof(), ferror(), and fclose() for file i/o, and fputs() for error messages. From string.h we use strcmp() for command line argument processing. From assert.h we use the assert() macro. From zlib.h we use the basic compression functions deflateInit(), deflate(), and deflateEnd(), and the basic decompression functions inflateInit(), inflate(), and inflateEnd().

#include <stdio.h>
#include <string.h>
#include <assert.h>
#include "zlib.h"

This is an ugly hack required to avoid corruption of the input and output data on Windows/MS-DOS systems. Without this, those systems would assume that the input and output files are text, and try to convert the end-of-line characters from one standard to another. That would corrupt binary data, and in particular would render the compressed data unusable. This sets the input and output to binary which suppresses the end-of-line conversions. SET_BINARY_MODE() will be used later on stdin and stdout, at the beginning of main().

#if defined(MSDOS) || defined(OS2) || defined(WIN32) || defined(__CYGWIN__)
#  include <fcntl.h>
#  include <io.h>
#  define SET_BINARY_MODE(file) setmode(fileno(file), O_BINARY)
#else
#  define SET_BINARY_MODE(file)
#endif

CHUNK is simply the buffer size for feeding data to and pulling data from the zlib routines. Larger buffer sizes would be more efficient, especially for inflate(). If the memory is available, buffer sizes on the order of 128K or 256K bytes should be used.

#define CHUNK 16384

The def() routine compresses data from an input file to an output file. The output data will be in the zlib format, which is different from the gzip or zip formats. The zlib format has a very small header of only two bytes to identify it as a zlib stream and to provide decoding information, and a four-byte trailer with a fast check value to verify the integrity of the uncompressed data after decoding.
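
To make the two-byte header and four-byte trailer more concrete, here is a minimal sketch based on our reading of RFC 1950, which defines the zlib format. The names zlib_header_ok() and simple_adler32() are made up for this illustration and are not part of zlib; the library provides its own, much faster adler32().

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if the two bytes look like a zlib (RFC 1950) header, else 0.
   Hypothetical helper, not a zlib function. */
int zlib_header_ok(unsigned char cmf, unsigned char flg)
{
    if ((cmf & 0x0f) != 8)          /* compression method must be 8 (deflate) */
        return 0;
    if ((cmf >> 4) > 7)             /* window size is at most 2^(7+8) = 32K */
        return 0;
    return ((cmf << 8) | flg) % 31 == 0;    /* header check bits */
}

/* Straightforward, unoptimized Adler-32 of buf[0..len-1], the check value
   stored in the four-byte trailer. */
uint32_t simple_adler32(const unsigned char *buf, size_t len)
{
    uint32_t a = 1, b = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        a = (a + buf[i]) % 65521;   /* 65521 is the largest prime below 2^16 */
        b = (b + a) % 65521;
    }
    return (b << 16) | a;
}
```

For example, the byte pair 0x78 0x9c passes zlib_header_ok(), whereas the gzip magic bytes 0x1f 0x8b do not. The trailer holds the Adler-32 value of the uncompressed data, most significant byte first.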


/* Compress from file source to file dest until EOF on source.
   def() returns Z_OK on success, Z_MEM_ERROR if memory could not be
   allocated for processing, Z_STREAM_ERROR if an invalid compression
   level is supplied, Z_VERSION_ERROR if the version of zlib.h and the
   version of the library linked do not match, or Z_ERRNO if there is
   an error reading or writing the files. */
int def(FILE *source, FILE *dest, int level)
{

Here are the local variables for def(). ret will be used for zlib return codes. flush will keep track of the current flushing state for deflate(), which is either no flushing, or flush to completion after the end of the input file is reached. have is the amount of data returned from deflate(). The strm structure is used to pass information to and from the zlib routines, and to maintain the deflate() state. in and out are the input and output buffers for deflate().

    int ret, flush;
    unsigned have;
    z_stream strm;
    unsigned char in[CHUNK];
    unsigned char out[CHUNK];

The first thing we do is to initialize the zlib state for compression using deflateInit(). This must be done before the first use of deflate(). The zalloc, zfree, and opaque fields in the strm structure must be initialized before calling deflateInit(). Here they are set to the zlib constant Z_NULL to request that zlib use the default memory allocation routines. An application may also choose to provide custom memory allocation routines here. deflateInit() will allocate on the order of 256K bytes for the internal state. (See zlib Technical Details.)

deflateInit() is called with a pointer to the structure to be initialized and the compression level, which is an integer in the range of -1 to 9. Lower compression levels result in faster execution, but less compression. Higher levels result in greater compression, but slower execution. The zlib constant Z_DEFAULT_COMPRESSION, equal to -1, provides a good compromise between compression and speed and is equivalent to level 6. Level 0 actually does no compression at all, and in fact expands the data slightly to produce the zlib format (it is not a byte-for-byte copy of the input). More advanced applications of zlib may use deflateInit2() here instead. Such an application may want to reduce how much memory will be used, at some price in compression. Or it may need to request a gzip header and trailer instead of a zlib header and trailer, or raw encoding with no header or trailer at all.

We must check the return value of deflateInit() against the zlib constant Z_OK to make sure that it was able to allocate memory for the internal state, and that the provided arguments were valid. deflateInit() will also check that the version of zlib that the zlib.h file came from matches the version of zlib actually linked with the program. This is especially important for environments in which zlib is a shared library.

Note that an application can initialize multiple, independent zlib streams, which can operate in parallel. The state information maintained in the structure allows the zlib routines to be reentrant.

    /* allocate deflate state */
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    ret = deflateInit(&strm, level);
    if (ret != Z_OK)
        return ret;

With the pleasantries out of the way, now we can get down to business. The outer do-loop reads all of the input file and exits at the bottom of the loop once end-of-file is reached. This loop contains the only call of deflate(). So we must make sure that all of the input data has been processed and that all of the output data has been generated and consumed before we fall out of the loop at the bottom.

    /* compress until end of file */
    do {

We start off by reading data from the input file. The number of bytes read is put directly into avail_in, and a pointer to those bytes is put into next_in. We also check to see if end-of-file on the input has been reached. If we are at the end of file, then flush is set to the zlib constant Z_FINISH, which is later passed to deflate() to indicate that this is the last chunk of input data to compress. We need to use feof() to check for end-of-file as opposed to seeing if fewer than CHUNK bytes have been read. The reason is that if the input file length is an exact multiple of CHUNK, we will miss the fact that we got to the end-of-file, and not know to tell deflate() to finish up the compressed stream. If we are not yet at the end of the input, then the zlib constant Z_NO_FLUSH will be passed to deflate() to indicate that we are still in the middle of the uncompressed data.
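
The end-of-file subtlety can be seen without zlib at all. This sketch, which uses a hypothetical helper reads_until_eof() and the standard tmpfile() function, counts how many fread() calls are needed before feof() becomes true.

```c
#include <stdio.h>

/* Write file_size bytes to a temporary file, then read it back in pieces of
   at most chunk bytes (1 <= chunk <= 64), counting the fread() calls made
   until feof() reports end-of-file. Returns -1 on any i/o problem. */
int reads_until_eof(size_t file_size, size_t chunk)
{
    unsigned char buf[64];
    size_t i;
    int calls = 0;
    FILE *f;

    if (chunk == 0 || chunk > sizeof(buf))
        return -1;
    f = tmpfile();
    if (f == NULL)
        return -1;
    for (i = 0; i < file_size; i++) {
        if (putc('x', f) == EOF) {
            fclose(f);
            return -1;
        }
    }
    rewind(f);
    do {
        (void)fread(buf, 1, chunk, f);  /* byte count ignored, as in the text */
        calls++;
        if (ferror(f)) {
            fclose(f);
            return -1;
        }
    } while (!feof(f));                 /* same end-of-file test as def() */
    fclose(f);
    return calls;
}
```

With a 32-byte file and a 16-byte chunk, the first two calls each return a full chunk without setting end-of-file, and only a third call that returns zero bytes does. In def(), that third read is the one that pairs an avail_in of zero with Z_FINISH. With a 31-byte file, the second, short read already sets end-of-file.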

If there is an error in reading from the input file, the process is aborted with deflateEnd() being called to free the allocated zlib state before returning the error. We wouldn't want a memory leak, now would we? deflateEnd() can be called at any time after the state has been initialized. Once that's done, deflateInit() (or deflateInit2()) would have to be called to start a new compression process. There is no point here in checking the deflateEnd() return code. The deallocation can't fail.

        strm.avail_in = fread(in, 1, CHUNK, source);
        if (ferror(source)) {
            (void)deflateEnd(&strm);
            return Z_ERRNO;
        }
        flush = feof(source) ? Z_FINISH : Z_NO_FLUSH;
        strm.next_in = in;

The inner do-loop passes our chunk of input data to deflate(), and then keeps calling deflate() until it is done producing output. Once there is no more new output, deflate() is guaranteed to have consumed all of the input, i.e., avail_in will be zero.

        /* run deflate() on input until output buffer not full, finish
           compression if all of source has been read in */
        do {

Output space is provided to deflate() by setting avail_out to the number of available output bytes and next_out to a pointer to that space.

            strm.avail_out = CHUNK;
            strm.next_out = out;

Now we call the compression engine itself, deflate(). It takes as many of the avail_in bytes at next_in as it can process, and writes as many as avail_out bytes to next_out. Those counters and pointers are then updated past the input data consumed and the output data written. It is the amount of output space available that may limit how much input is consumed. Hence the inner loop to make sure that all of the input is consumed by providing more output space each time. Since avail_in and next_in are updated by deflate(), we don't have to mess with those between deflate() calls until it's all used up.

The parameters to deflate() are a pointer to the strm structure containing the input and output information and the internal compression engine state, and a parameter indicating whether and how to flush data to the output. Normally deflate() will consume several K bytes of input data before producing any output (except for the header), in order to accumulate statistics on the data for optimum compression. It will then put out a burst of compressed data, and proceed to consume more input before the next burst. Eventually, deflate() must be told to terminate the stream, complete the compression with provided input data, and write out the trailer check value. deflate() will continue to compress normally as long as the flush parameter is Z_NO_FLUSH. Once the Z_FINISH parameter is provided, deflate() will begin to complete the compressed output stream. However, depending on how much output space is provided, deflate() may have to be called several times until it has provided the complete compressed stream, even after it has consumed all of the input. The flush parameter must continue to be Z_FINISH for those subsequent calls.

There are other values of the flush parameter that are used in more advanced applications. You can force deflate() to produce a burst of output that encodes all of the input data provided so far, even if it wouldn't have otherwise, for example to control data latency on a link with compressed data. You can also ask that deflate() do that as well as erase any history up to that point so that what follows can be decompressed independently, for example for random access applications. Both requests will degrade compression by an amount depending on how often such requests are made.

deflate() has a return value that can indicate errors, yet we do not check it here. Why not? Well, it turns out that deflate() can do no wrong here. Let's go through deflate()'s return values and dispense with them one by one. The possible values are Z_OK, Z_STREAM_END, Z_STREAM_ERROR, or Z_BUF_ERROR. Z_OK is, well, ok. Z_STREAM_END is also ok and will be returned for the last call of deflate(). This is already guaranteed by calling deflate() with Z_FINISH until it has no more output. Z_STREAM_ERROR is only possible if the stream is not initialized properly, but we did initialize it properly. There is no harm in checking for Z_STREAM_ERROR here, for example to check for the possibility that some other part of the application inadvertently clobbered the memory containing the zlib state. Z_BUF_ERROR will be explained further below, but suffice it to say that this is simply an indication that deflate() could not consume more input or produce more output. deflate() can be called again with more output space or more available input, which it will be in this code.

            ret = deflate(&strm, flush);    /* no bad return value */
            assert(ret != Z_STREAM_ERROR);  /* state not clobbered */

Now we compute how much output deflate() provided on the last call, which is the difference between how much space was provided before the call, and how much output space is still available after the call. Then that data, if any, is written to the output file. We can then reuse the output buffer for the next call of deflate(). Again if there is a file i/o error, we call deflateEnd() before returning to avoid a memory leak.

            have = CHUNK - strm.avail_out;
            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
                (void)deflateEnd(&strm);
                return Z_ERRNO;
            }

The inner do-loop is repeated until the last deflate() call fails to fill the provided output buffer. Then we know that deflate() has done as much as it can with the provided input, and that all of that input has been consumed. We can then fall out of this loop and reuse the input buffer.

The way we tell that deflate() has no more output is by seeing that it did not fill the output buffer, leaving avail_out greater than zero. However, suppose that deflate() has no more output, but just so happened to exactly fill the output buffer! avail_out is zero, and we can't tell that deflate() has done all it can. As far as we know, deflate() has more output for us. So we call it again. But now deflate() produces no output at all, and avail_out remains unchanged as CHUNK. That deflate() call wasn't able to do anything, either consume input or produce output, and so it returns Z_BUF_ERROR. (See, I told you I'd cover this later.) However, this is not a problem at all. Now we finally have the desired indication that deflate() is really done, and so we drop out of the inner loop to provide more input to deflate().

With flush set to Z_FINISH, this final set of deflate() calls will complete the output stream. Once that is done, subsequent calls of deflate() would return Z_STREAM_ERROR if the flush parameter is not Z_FINISH, and do no more processing until the state is reinitialized.

Some applications of zlib have two loops that call deflate() instead of the single inner loop we have here. The first loop would call without flushing and feed all of the data to deflate(). The second loop would call deflate() with no more data and the Z_FINISH parameter to complete the process. As you can see from this example, that can be avoided by simply keeping track of the current flush state.

        } while (strm.avail_out == 0);
        assert(strm.avail_in == 0);     /* all input will be used */

Now we check to see if we have already processed all of the input file. That information was saved in the flush variable, so we see if that was set to Z_FINISH. If so, then we're done and we fall out of the outer loop. We're guaranteed to get Z_STREAM_END from the last deflate() call, since we ran it until the last chunk of input was consumed and all of the output was generated.

        /* done when last data in file processed */
    } while (flush != Z_FINISH);
    assert(ret == Z_STREAM_END);        /* stream will be complete */

The process is complete, but we still need to deallocate the state to avoid a memory leak (or rather more like a memory hemorrhage if you didn't do this). Then finally we can return with a happy return value.

    /* clean up and return */
    (void)deflateEnd(&strm);
    return Z_OK;
}

Now we do the same thing for decompression in the inf() routine. inf() decompresses what is hopefully a valid zlib stream from the input file and writes the uncompressed data to the output file. Much of the discussion above for def() applies to inf() as well, so the discussion here will focus on the differences between the two.

/* Decompress from file source to file dest until stream ends or EOF.
   inf() returns Z_OK on success, Z_MEM_ERROR if memory could not be
   allocated for processing, Z_DATA_ERROR if the deflate data is
   invalid or incomplete, Z_VERSION_ERROR if the version of zlib.h and
   the version of the library linked do not match, or Z_ERRNO if there
   is an error reading or writing the files. */
int inf(FILE *source, FILE *dest)
{

The local variables have the same functionality as they do for def(). The only difference is that there is no flush variable, since inflate() can tell from the zlib stream itself when the stream is complete.

    int ret;
    unsigned have;
    z_stream strm;
    unsigned char in[CHUNK];
    unsigned char out[CHUNK];

The initialization of the state is the same, except that there is no compression level, of course, and two more elements of the structure are initialized. avail_in and next_in must be initialized before calling inflateInit(). This is because the application has the option to provide the start of the zlib stream in order for inflateInit() to have access to information about the compression method to aid in memory allocation. In the current implementation of zlib (up through versions 1.2.x), the method-dependent memory allocations are deferred to the first call of inflate() anyway. However, those fields must be initialized since later versions of zlib that provide more compression methods may take advantage of this interface. In any case, no decompression is performed by inflateInit(), so the avail_out and next_out fields do not need to be initialized before calling.

Here avail_in is set to zero and next_in is set to Z_NULL to indicate that no input data is being provided.

    /* allocate inflate state */
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    strm.avail_in = 0;
    strm.next_in = Z_NULL;
    ret = inflateInit(&strm);
    if (ret != Z_OK)
        return ret;

The outer do-loop decompresses input until inflate() indicates that it has reached the end of the compressed data and has produced all of the uncompressed output. This is in contrast to def() which processes all of the input file. If end-of-file is reached before the compressed data self-terminates, then the compressed data is incomplete and an error is returned.

    /* decompress until deflate stream ends or end of file */
    do {

We read input data and set the strm structure accordingly. If we've reached the end of the input file, then we leave the outer loop and report an error, since the compressed data is incomplete. Note that we may read more data than is eventually consumed by inflate(), if the input file continues past the zlib stream. For applications where zlib streams are embedded in other data, this routine would need to be modified to return the unused data, or at least indicate how much of the input data was not used, so the application would know where to pick up after the zlib stream.


        strm.avail_in = fread(in, 1, CHUNK, source);
        if (ferror(source)) {
            (void)inflateEnd(&strm);
            return Z_ERRNO;
        }
        if (strm.avail_in == 0)
            break;
        strm.next_in = in;

The inner do-loop has the same function it did in def(), which is to keep calling inflate() until it has generated all of the output it can with the provided input.

        /* run inflate() on input until output buffer not full */
        do {

Just like in def(), the same output space is provided for each call of inflate().

            strm.avail_out = CHUNK;
            strm.next_out = out;

Now we run the decompression engine itself. There is no need to adjust the flush parameter, since the zlib format is self-terminating. The main difference here is that there are return values that we need to pay attention to. Z_DATA_ERROR indicates that inflate() detected an error in the zlib compressed data format, which means that either the data is not a zlib stream to begin with, or that the data was corrupted somewhere along the way since it was compressed. The other error to be processed is Z_MEM_ERROR, which can occur since memory allocation is deferred until inflate() needs it, unlike deflate(), whose memory is allocated at the start by deflateInit().

Advanced applications may use deflateSetDictionary() to prime deflate() with a set of likely data to improve the first 32K or so of compression. This is noted in the zlib header, so inflate() requests that the dictionary be provided before it can start to decompress. Without the dictionary, correct decompression is not possible. For this routine, we have no idea what the dictionary is, so the Z_NEED_DICT indication is converted to a Z_DATA_ERROR.

inflate() can also return Z_STREAM_ERROR, which should not be possible here, but could be checked for as noted above for def(). Z_BUF_ERROR does not need to be checked for here, for the same reasons noted for def(). Z_STREAM_END will be checked for later.

            ret = inflate(&strm, Z_NO_FLUSH);
            assert(ret != Z_STREAM_ERROR);  /* state not clobbered */
            switch (ret) {
            case Z_NEED_DICT:
                ret = Z_DATA_ERROR;     /* and fall through */
            case Z_DATA_ERROR:
            case Z_MEM_ERROR:
                (void)inflateEnd(&strm);
                return ret;
            }

The output of inflate() is handled identically to that of deflate().

            have = CHUNK - strm.avail_out;
            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
                (void)inflateEnd(&strm);
                return Z_ERRNO;
            }

The inner do-loop ends when inflate() has no more output as indicated by not filling the output buffer, just as for deflate(). In this case, we cannot assert that strm.avail_in will be zero, since the deflate stream may end before the file does.

        } while (strm.avail_out == 0);

The outer do-loop ends when inflate() reports that it has reached the end of the input zlib stream, has completed the decompression and integrity check, and has provided all of the output. This is indicated by the inflate() return value Z_STREAM_END. The inner loop is guaranteed to leave ret equal to Z_STREAM_END if the last chunk of the input file read contained the end of the zlib stream. So if the return value is not Z_STREAM_END, the loop continues to read more input.

        /* done when inflate() says it's done */
    } while (ret != Z_STREAM_END);

At this point, decompression successfully completed, or we broke out of the loop due to no more data being available from the input file. If the last inflate() return value is not Z_STREAM_END, then the zlib stream was incomplete and a data error is returned. Otherwise, we return with a happy return value. Of course, inflateEnd() is called first to avoid a memory leak.

    /* clean up and return */
    (void)inflateEnd(&strm);
    return ret == Z_STREAM_END ? Z_OK : Z_DATA_ERROR;
}

That ends the routines that directly use zlib. The following routines make this a command-line program by running data through the above routines from stdin to stdout, and handling any errors reported by def() or inf().

zerr() is used to interpret the possible error codes from def() and inf(), as detailed in their comments above, and print out an error message. Note that these are only a subset of the possible return values from deflate() and inflate().

/* report a zlib or i/o error */
void zerr(int ret)
{
    fputs("zpipe: ", stderr);
    switch (ret) {
    case Z_ERRNO:
        if (ferror(stdin))
            fputs("error reading stdin\n", stderr);
        if (ferror(stdout))
            fputs("error writing stdout\n", stderr);
        break;
    case Z_STREAM_ERROR:
        fputs("invalid compression level\n", stderr);
        break;
    case Z_DATA_ERROR:
        fputs("invalid or incomplete deflate data\n", stderr);
        break;
    case Z_MEM_ERROR:
        fputs("out of memory\n", stderr);
        break;
    case Z_VERSION_ERROR:
        fputs("zlib version mismatch!\n", stderr);
    }
}

Here is the main() routine used to test def() and inf(). The zpipe command is simply a compression pipe from stdin to stdout, if no arguments are given, or it is a decompression pipe if zpipe -d is used. If any other arguments are provided, no compression or decompression is performed. Instead a usage message is displayed. Examples are zpipe < foo.txt > foo.txt.z to compress, and zpipe -d < foo.txt.z > foo.txt to decompress.

/* compress or decompress from stdin to stdout */
int main(int argc, char **argv)
{
    int ret;

    /* avoid end-of-line conversions */
    SET_BINARY_MODE(stdin);
    SET_BINARY_MODE(stdout);

    /* do compression if no arguments */
    if (argc == 1) {
        ret = def(stdin, stdout, Z_DEFAULT_COMPRESSION);
        if (ret != Z_OK)
            zerr(ret);
        return ret;
    }

    /* do decompression if -d specified */
    else if (argc == 2 && strcmp(argv[1], "-d") == 0) {
        ret = inf(stdin, stdout);
        if (ret != Z_OK)
            zerr(ret);
        return ret;
    }

    /* otherwise, report usage */
    else {
        fputs("zpipe usage: zpipe [-d] < source > dest\n", stderr);
        return 1;
    }
}

Copyright (c) 2004, 2005 by Mark Adler
Last modified 11 December 2005