zlib Usage Example


We often get questions about how the deflate() and inflate() functions should be used. Users wonder when they should provide more input, when they should use more output, what to do with a Z_BUF_ERROR, how to make sure the process terminates properly, and so on. So for those who have read zlib.h (a few times), and would like further edification, below is an annotated example in C of simple routines to compress and decompress from an input file to an output file using deflate() and inflate() respectively. The annotations are interspersed between lines of the code. So please read between the lines. We hope this helps explain some of the intricacies of zlib.

Without further ado, here is the program zpipe.c:


/* zpipe.c: example of proper use of zlib's inflate() and deflate()
   Not copyrighted -- provided to the public domain
   Version 1.4  11 December 2005  Mark Adler */

/* Version history:
   1.0  30 Oct 2004  First version
   1.1   8 Nov 2004  Add void casting for unused return values
                     Use switch statement for inflate() return values
   1.2   9 Nov 2004  Add assertions to document zlib guarantees
   1.3   6 Apr 2005  Remove incorrect assertion in inf()
   1.4  11 Dec 2005  Add hack to avoid MSDOS end-of-line conversions
                     Avoid some compiler warnings for input and output buffers
 */

We now include the header files for the required definitions. From stdio.h we use fopen(), fread(), fwrite(), feof(), ferror(), and fclose() for file i/o, and fputs() for error messages. From string.h we use strcmp() for command line argument processing. From assert.h we use the assert() macro. From zlib.h we use the basic compression functions deflateInit(), deflate(), and deflateEnd(), and the basic decompression functions inflateInit(), inflate(), and inflateEnd().


#include <stdio.h>
#include <string.h>
#include <assert.h>
#include "zlib.h"

This is an ugly hack required to avoid corruption of the input and output data on Windows/MS-DOS systems. Without this, those systems would assume that the input and output files are text, and try to convert the end-of-line characters from one standard to another. That would corrupt binary data, and in particular would render the compressed data unusable. This sets the input and output to binary which suppresses the end-of-line conversions. SET_BINARY_MODE() will be used later on stdin and stdout, at the beginning of main().


#if defined(MSDOS) || defined(OS2) || defined(WIN32) || defined(__CYGWIN__)
#  include <fcntl.h>
#  include <io.h>
#  define SET_BINARY_MODE(file) setmode(fileno(file), O_BINARY)
#else
#  define SET_BINARY_MODE(file)
#endif

CHUNK is simply the buffer size for feeding data to and pulling data from the zlib routines. Larger buffer sizes would be more efficient, especially for inflate(). If the memory is available, buffer sizes on the order of 128K or 256K bytes should be used.


#define CHUNK 16384

The def() routine compresses data from an input file to an output file. The output data will be in the zlib format, which is different from the gzip or zip formats. The zlib format has a very small header of only two bytes to identify it as a zlib stream and to provide decoding information, and a four-byte trailer with a fast check value to verify the integrity of the uncompressed data after decoding.


/* Compress from file source to file dest until EOF on source.
   def() returns Z_OK on success, Z_MEM_ERROR if memory could not be
   allocated for processing, Z_STREAM_ERROR if an invalid compression
   level is supplied, Z_VERSION_ERROR if the version of zlib.h and the
   version of the library linked do not match, or Z_ERRNO if there is
   an error reading or writing the files. */
int def(FILE *source, FILE *dest, int level)
{

Here are the local variables for def(). ret will be used for zlib return codes. flush will keep track of the current flushing state for deflate(), which is either no flushing, or flush to completion after the end of the input file is reached. have is the amount of data returned from deflate(). The strm structure is used to pass information to and from the zlib routines, and to maintain the deflate() state. in and out are the input and output buffers for deflate().


    int ret, flush;
    unsigned have;
    z_stream strm;
    unsigned char in[CHUNK];
    unsigned char out[CHUNK];

The first thing we do is to initialize the zlib state for compression using deflateInit(). This must be done before the first use of deflate(). The zalloc, zfree, and opaque fields in the strm structure must be initialized before calling deflateInit(). Here they are set to the zlib constant Z_NULL to request that zlib use the default memory allocation routines. An application may also choose to provide custom memory allocation routines here. deflateInit() will allocate on the order of 256K bytes for the internal state. (See zlib Technical Details.)

deflateInit() is called with a pointer to the structure to be initialized and the compression level, which is an integer in the range of -1 to 9. Lower compression levels result in faster execution, but less compression. Higher levels result in greater compression, but slower execution. The zlib constant Z_DEFAULT_COMPRESSION, equal to -1, provides a good compromise between compression and speed and is equivalent to level 6. Level 0 actually does no compression at all, and in fact expands the data slightly to produce the zlib format (it is not a byte-for-byte copy of the input). More advanced applications of zlib may use deflateInit2() here instead. Such an application may want to reduce how much memory will be used, at some price in compression. Or it may need to request a gzip header and trailer instead of a zlib header and trailer, or raw encoding with no header or trailer at all.

We must check the return value of deflateInit() against the zlib constant Z_OK to make sure that it was able to allocate memory for the internal state, and that the provided arguments were valid. deflateInit() will also check that the version of zlib that the zlib.h file came from matches the version of zlib actually linked with the program. This is especially important for environments in which zlib is a shared library.

Note that an application can initialize multiple, independent zlib streams, which can operate in parallel. The state information maintained in the structure allows the zlib routines to be reentrant.


    /* allocate deflate state */
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    ret = deflateInit(&strm, level);
    if (ret != Z_OK)
        return ret;

With the pleasantries out of the way, now we can get down to business. The outer do-loop reads all of the input file and exits at the bottom of the loop once end-of-file is reached. This loop contains the only call of deflate(). So we must make sure that all of the input data has been processed and that all of the output data has been generated and consumed before we fall out of the loop at the bottom.


    /* compress until end of file */
    do {

We start off by reading data from the input file. The number of bytes read is put directly into avail_in, and a pointer to those bytes is put into next_in. We also check to see if end-of-file on the input has been reached. If we are at the end of file, then flush is set to the zlib constant Z_FINISH, which is later passed to deflate() to indicate that this is the last chunk of input data to compress. We need to use feof() to check for end-of-file as opposed to seeing if fewer than CHUNK bytes have been read. The reason is that if the input file length is an exact multiple of CHUNK, we will miss the fact that we got to the end-of-file, and not know to tell deflate() to finish up the compressed stream. If we are not yet at the end of the input, then the zlib constant Z_NO_FLUSH will be passed to deflate to indicate that we are still in the middle of the uncompressed data.

If there is an error in reading from the input file, the process is aborted with deflateEnd() being called to free the allocated zlib state before returning the error. We wouldn't want a memory leak, now would we? deflateEnd() can be called at any time after the state has been initialized. Once that's done, deflateInit() (or deflateInit2()) would have to be called to start a new compression process. There is no point here in checking the deflateEnd() return code. The deallocation can't fail.


        strm.avail_in = fread(in, 1, CHUNK, source);
        if (ferror(source)) {
            (void)deflateEnd(&strm);
            return Z_ERRNO;
        }
        flush = feof(source) ? Z_FINISH : Z_NO_FLUSH;
        strm.next_in = in;
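
The exact-multiple case can be seen without zlib at all. The sketch below is a stand-alone illustration, not part of zpipe.c, and both helper names are made up. It counts how many fread() calls it takes before feof() reports end-of-file. It assumes the usual behavior of C library streams, where the end-of-file indicator is set only once a read actually runs out of data.

```c
#include <assert.h>
#include <stdio.h>

/* Create a temporary file holding len zero bytes, rewound for reading.
   Returns NULL on failure. */
FILE *temp_file_of_length(size_t len)
{
    FILE *f = tmpfile();
    size_t i;

    if (f == NULL)
        return NULL;
    for (i = 0; i < len; i++) {
        if (putc(0, f) == EOF) {
            fclose(f);
            return NULL;
        }
    }
    rewind(f);
    return f;
}

/* Read stream in pieces of chunk bytes, testing feof() after each
   fread() the way def() does. Returns the number of fread() calls made
   before feof() reported end-of-file (or -1 on a read error), and
   stores the size of the last read in *last. */
int reads_until_eof(FILE *stream, size_t chunk, size_t *last)
{
    unsigned char buf[64];
    int calls = 0;

    assert(chunk > 0 && chunk <= sizeof(buf));
    do {
        *last = fread(buf, 1, chunk, stream);
        if (ferror(stream))
            return -1;
        calls++;
    } while (!feof(stream));
    return calls;
}
```

With a 16-byte chunk, a 24-byte file should take two reads (16, then 8 with end-of-file set), while a 32-byte file should take three (16, 16, then 0). In def(), that final empty read is the one that sets flush to Z_FINISH, and deflate() is then called with avail_in equal to zero to finish the stream.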

The inner do-loop passes our chunk of input data to deflate(), and then keeps calling deflate() until it is done producing output. Once there is no more new output, deflate() is guaranteed to have consumed all of the input, i.e., avail_in will be zero.


        /* run deflate() on input until output buffer not full, finish
           compression if all of source has been read in */
        do {

Output space is provided to deflate() by setting avail_out to the number of available output bytes and next_out to a pointer to that space.


            strm.avail_out = CHUNK;
            strm.next_out = out;

Now we call the compression engine itself, deflate(). It takes as many of the avail_in bytes at next_in as it can process, and writes as many as avail_out bytes to next_out. Those counters and pointers are then updated past the input data consumed and the output data written. It is the amount of output space available that may limit how much input is consumed. Hence the inner loop to make sure that all of the input is consumed by providing more output space each time. Since avail_in and next_in are updated by deflate(), we don't have to mess with those between deflate() calls until it's all used up.

The parameters to deflate() are a pointer to the strm structure containing the input and output information and the internal compression engine state, and a parameter indicating whether and how to flush data to the output. Normally deflate() will consume several K bytes of input data before producing any output (except for the header), in order to accumulate statistics on the data for optimum compression. It will then put out a burst of compressed data, and proceed to consume more input before the next burst. Eventually, deflate() must be told to terminate the stream, complete the compression with provided input data, and write out the trailer check value. deflate() will continue to compress normally as long as the flush parameter is Z_NO_FLUSH. Once the Z_FINISH parameter is provided, deflate() will begin to complete the compressed output stream. However, depending on how much output space is provided, deflate() may have to be called several times until it has provided the complete compressed stream, even after it has consumed all of the input. The flush parameter must continue to be Z_FINISH for those subsequent calls.

There are other values of the flush parameter that are used in more advanced applications. You can force deflate() to produce a burst of output that encodes all of the input data provided so far, even if it wouldn't have otherwise, for example to control data latency on a link with compressed data. You can also ask that deflate() do that as well as erase any history up to that point so that what follows can be decompressed independently, for example for random access applications. Both requests will degrade compression by an amount depending on how often such requests are made.

deflate() has a return value that can indicate errors, yet we do not check it here. Why not? Well, it turns out that deflate() can do no wrong here. Let's go through deflate()'s return values and dispense with them one by one. The possible values are Z_OK, Z_STREAM_END, Z_STREAM_ERROR, or Z_BUF_ERROR. Z_OK is, well, ok. Z_STREAM_END is also ok and will be returned for the last call of deflate(). This is already guaranteed by calling deflate() with Z_FINISH until it has no more output. Z_STREAM_ERROR is only possible if the stream is not initialized properly, but we did initialize it properly. There is no harm in checking for Z_STREAM_ERROR here, for example to check for the possibility that some other part of the application inadvertently clobbered the memory containing the zlib state. Z_BUF_ERROR will be explained further below, but suffice it to say that this is simply an indication that deflate() could not consume more input or produce more output. deflate() can be called again with more output space or more available input, which it will be in this code.


            ret = deflate(&strm, flush);    /* no bad return value */
            assert(ret != Z_STREAM_ERROR);  /* state not clobbered */

Now we compute how much output deflate() provided on the last call, which is the difference between how much space was provided before the call, and how much output space is still available after the call. Then that data, if any, is written to the output file. We can then reuse the output buffer for the next call of deflate(). Again if there is a file i/o error, we call deflateEnd() before returning to avoid a memory leak.


            have = CHUNK - strm.avail_out;
            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
                (void)deflateEnd(&strm);
                return Z_ERRNO;
            }

The inner do-loop is repeated until the last deflate() call fails to fill the provided output buffer. Then we know that deflate() has done as much as it can with the provided input, and that all of that input has been consumed. We can then fall out of this loop and reuse the input buffer.

The way we tell that deflate() has no more output is by seeing that it did not fill the output buffer, leaving avail_out greater than zero. However, suppose that deflate() has no more output, but just so happened to exactly fill the output buffer! avail_out is zero, and we can't tell that deflate() has done all it can. As far as we know, deflate() has more output for us. So we call it again. But now deflate() produces no output at all, and avail_out remains unchanged as CHUNK. That deflate() call wasn't able to do anything, either consume input or produce output, and so it returns Z_BUF_ERROR. (See, I told you I'd cover this later.) However, this is not a problem at all. Now we finally have the desired indication that deflate() is really done, and so we drop out of the inner loop to provide more input to deflate().

With flush set to Z_FINISH, this final set of deflate() calls will complete the output stream. Once that is done, subsequent calls of deflate() would return Z_STREAM_ERROR if the flush parameter is not Z_FINISH, and do no more processing until the state is reinitialized.

Some applications of zlib have two loops that call deflate() instead of the single inner loop we have here. The first loop would call without flushing and feed all of the data to deflate(). The second loop would call deflate() with no more data and the Z_FINISH parameter to complete the process. As you can see from this example, that can be avoided by simply keeping track of the current flush state.


        } while (strm.avail_out == 0);
        assert(strm.avail_in == 0);     /* all input will be used */

Now we check to see if we have already processed all of the input file. That information was saved in the flush variable, so we see if that was set to Z_FINISH. If so, then we're done and we fall out of the outer loop. We're guaranteed to get Z_STREAM_END from the last deflate() call, since we ran it until the last chunk of input was consumed and all of the output was generated.


        /* done when last data in file processed */
    } while (flush != Z_FINISH);
    assert(ret == Z_STREAM_END);        /* stream will be complete */

The process is complete, but we still need to deallocate the state to avoid a memory leak (or rather more like a memory hemorrhage if you didn't do this). Then finally we can return with a happy return value.


    /* clean up and return */
    (void)deflateEnd(&strm);
    return Z_OK;
}

Now we do the same thing for decompression in the inf() routine. inf() decompresses what is hopefully a valid zlib stream from the input file and writes the uncompressed data to the output file. Much of the discussion above for def() applies to inf() as well, so the discussion here will focus on the differences between the two.


/* Decompress from file source to file dest until stream ends or EOF.
   inf() returns Z_OK on success, Z_MEM_ERROR if memory could not be
   allocated for processing, Z_DATA_ERROR if the deflate data is
   invalid or incomplete, Z_VERSION_ERROR if the version of zlib.h and
   the version of the library linked do not match, or Z_ERRNO if there
   is an error reading or writing the files. */
int inf(FILE *source, FILE *dest)
{

The local variables have the same functionality as they do for def(). The only difference is that there is no flush variable, since inflate() can tell from the zlib stream itself when the stream is complete.


    int ret;
    unsigned have;
    z_stream strm;
    unsigned char in[CHUNK];
    unsigned char out[CHUNK];

The initialization of the state is the same, except that there is no compression level, of course, and two more elements of the structure are initialized. avail_in and next_in must be initialized before calling inflateInit(). This is because the application has the option to provide the start of the zlib stream in order for inflateInit() to have access to information about the compression method to aid in memory allocation. In the current implementation of zlib (up through versions 1.2.x), the method-dependent memory allocations are deferred to the first call of inflate() anyway. However, those fields must be initialized since later versions of zlib that provide more compression methods may take advantage of this interface. In any case, no decompression is performed by inflateInit(), so the avail_out and next_out fields do not need to be initialized before calling.

Here avail_in is set to zero and next_in is set to Z_NULL to indicate that no input data is being provided.


    /* allocate inflate state */
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    strm.avail_in = 0;
    strm.next_in = Z_NULL;
    ret = inflateInit(&strm);
    if (ret != Z_OK)
        return ret;

The outer do-loop decompresses input until inflate() indicates that it has reached the end of the compressed data and has produced all of the uncompressed output. This is in contrast to def() which processes all of the input file. If end-of-file is reached before the compressed data self-terminates, then the compressed data is incomplete and an error is returned.


    /* decompress until deflate stream ends or end of file */
    do {

We read input data and set the strm structure accordingly. If we've reached the end of the input file, then we leave the outer loop and report an error, since the compressed data is incomplete. Note that we may read more data than is eventually consumed by inflate(), if the input file continues past the zlib stream. For applications where zlib streams are embedded in other data, this routine would need to be modified to return the unused data, or at least indicate how much of the input data was not used, so the application would know where to pick up after the zlib stream.


        strm.avail_in = fread(in, 1, CHUNK, source);
        if (ferror(source)) {
            (void)inflateEnd(&strm);
            return Z_ERRNO;
        }
        if (strm.avail_in == 0)
            break;
        strm.next_in = in;

The inner do-loop has the same function it did in def(), which is to keep calling inflate() until it has generated all of the output it can with the provided input.


        /* run inflate() on input until output buffer not full */
        do {

Just like in def(), the same output space is provided for each call of inflate().


            strm.avail_out = CHUNK;
            strm.next_out = out;

Now we run the decompression engine itself. There is no need to adjust the flush parameter, since the zlib format is self-terminating. The main difference here is that there are return values that we need to pay attention to. Z_DATA_ERROR indicates that inflate() detected an error in the zlib compressed data format, which means that either the data is not a zlib stream to begin with, or that the data was corrupted somewhere along the way since it was compressed. The other error to be processed is Z_MEM_ERROR, which can occur since memory allocation is deferred until inflate() needs it, unlike deflate(), whose memory is allocated at the start by deflateInit().

Advanced applications may use deflateSetDictionary() to prime deflate() with a set of likely data to improve the first 32K or so of compression. This is noted in the zlib header, so inflate() requests that the dictionary be provided before it can start to decompress. Without the dictionary, correct decompression is not possible. For this routine, we have no idea what the dictionary is, so the Z_NEED_DICT indication is converted to a Z_DATA_ERROR.

inflate() can also return Z_STREAM_ERROR, which should not be possible here, but could be checked for as noted above for def(). Z_BUF_ERROR does not need to be checked for here, for the same reasons noted for def(). Z_STREAM_END will be checked for later.


            ret = inflate(&strm, Z_NO_FLUSH);
            assert(ret != Z_STREAM_ERROR);  /* state not clobbered */
            switch (ret) {
            case Z_NEED_DICT:
                ret = Z_DATA_ERROR;     /* and fall through */
            case Z_DATA_ERROR:
            case Z_MEM_ERROR:
                (void)inflateEnd(&strm);
                return ret;
            }

The output of inflate() is handled identically to that of deflate().


            have = CHUNK - strm.avail_out;
            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
                (void)inflateEnd(&strm);
                return Z_ERRNO;
            }

The inner do-loop ends when inflate() has no more output as indicated by not filling the output buffer, just as for deflate(). In this case, we cannot assert that strm.avail_in will be zero, since the deflate stream may end before the file does.


        } while (strm.avail_out == 0);

The outer do-loop ends when inflate() reports that it has reached the end of the input zlib stream, has completed the decompression and integrity check, and has provided all of the output. This is indicated by the inflate() return value Z_STREAM_END. The inner loop is guaranteed to leave ret equal to Z_STREAM_END if the last chunk of the input file read contained the end of the zlib stream. So if the return value is not Z_STREAM_END, the loop continues to read more input.


        /* done when inflate() says it's done */
    } while (ret != Z_STREAM_END);

At this point, either decompression completed successfully, or we broke out of the loop due to no more data being available from the input file. If the last inflate() return value is not Z_STREAM_END, then the zlib stream was incomplete and a data error is returned. Otherwise, we return with a happy return value. Of course, inflateEnd() is called first to avoid a memory leak.


    /* clean up and return */
    (void)inflateEnd(&strm);
    return ret == Z_STREAM_END ? Z_OK : Z_DATA_ERROR;
}

That ends the routines that directly use zlib. The following routines make this a command-line program by running data through the above routines from stdin to stdout, and handling any errors reported by def() or inf().

zerr() is used to interpret the possible error codes from def() and inf(), as detailed in their comments above, and print out an error message. Note that these are only a subset of the possible return values from deflate() and inflate().


/* report a zlib or i/o error */
void zerr(int ret)
{
    fputs("zpipe: ", stderr);
    switch (ret) {
    case Z_ERRNO:
        if (ferror(stdin))
            fputs("error reading stdin\n", stderr);
        if (ferror(stdout))
            fputs("error writing stdout\n", stderr);
        break;
    case Z_STREAM_ERROR:
        fputs("invalid compression level\n", stderr);
        break;
    case Z_DATA_ERROR:
        fputs("invalid or incomplete deflate data\n", stderr);
        break;
    case Z_MEM_ERROR:
        fputs("out of memory\n", stderr);
        break;
    case Z_VERSION_ERROR:
        fputs("zlib version mismatch!\n", stderr);
    }
}

Here is the main() routine used to test def() and inf(). The zpipe command is simply a compression pipe from stdin to stdout, if no arguments are given, or it is a decompression pipe if zpipe -d is used. If any other arguments are provided, no compression or decompression is performed. Instead a usage message is displayed. Examples are zpipe < foo.txt > foo.txt.z to compress, and zpipe -d < foo.txt.z > foo.txt to decompress.


/* compress or decompress from stdin to stdout */
int main(int argc, char **argv)
{
    int ret;

    /* avoid end-of-line conversions */
    SET_BINARY_MODE(stdin);
    SET_BINARY_MODE(stdout);

    /* do compression if no arguments */
    if (argc == 1) {
        ret = def(stdin, stdout, Z_DEFAULT_COMPRESSION);
        if (ret != Z_OK)
            zerr(ret);
        return ret;
    }

    /* do decompression if -d specified */
    else if (argc == 2 && strcmp(argv[1], "-d") == 0) {
        ret = inf(stdin, stdout);
        if (ret != Z_OK)
            zerr(ret);
        return ret;
    }

    /* otherwise, report usage */
    else {
        fputs("zpipe usage: zpipe [-d] < source > dest\n", stderr);
        return 1;
    }
}

Copyright (c) 2004, 2005 by Mark Adler
Last modified 11 December 2005