Many programs, games especially, need to access large numbers of data files. These data files generally take up quite a bit of disk space, not only for the storage of their data, but also as "wasted" disk space due to cluster size limitations (on FAT12/FAT16 filesystems; FAT32 has alleviated some of this wastage by using smaller cluster sizes). Traditionally, game programmers have relied on one of two main methods of data storage: keeping every data file as a separate file on disk, or packing the data files into a custom archive format.
The drawback to the first solution is the wasted disk space problem, as well as slower installation (it's much slower to create one thousand 1KB files than it is to create a single 1MB file) and slower access (DOS/Win95 use a linear search through the directory to open files, so the more files you have in a single directory, the longer it takes to open the files further down the list).
The second solution has its own pitfalls. The first is that you must write all your own image/sound/etc. loading routines, which use a custom API for accessing the archived data. A further drawback is that you have to write your own archive utility to build the archives in the first place.
The ZPP library addresses all of these concerns:
Because file names are kept in an STL <map> object, file lookups are done with an in-memory binary search instead of a linear filesystem search, allowing for quick lookup of files contained in the archive. An implementation could easily be provided using the SGI STL <hash_map> for even faster, albeit "unordered", lookup.
Files in the archive are read through classes derived from the standard iostream and streambuf classes (with a few caveats, see below).
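The lookup idea can be illustrated with a minimal sketch. The names FileEntry and ArchiveIndex are hypothetical and are not part of ZPP; the sketch only shows how a std::map from file name to table position gives a logarithmic-time lookup instead of a linear scan.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-file record kept by an archive.
struct FileEntry {
    std::string name;
    std::size_t compressedSize;
};

// Hypothetical index: file name -> position in the entry table.
class ArchiveIndex {
public:
    void add(const FileEntry& e) {
        byName_[e.name] = entries_.size();
        entries_.push_back(e);
    }

    // Returns a pointer to the entry, or nullptr if the name is unknown.
    // std::map::find needs O(log n) string comparisons.
    const FileEntry* find(const std::string& name) const {
        std::map<std::string, std::size_t>::const_iterator it = byName_.find(name);
        return it == byName_.end() ? nullptr : &entries_[it->second];
    }

private:
    std::vector<FileEntry> entries_;
    std::map<std::string, std::size_t> byName_;
};
```

Swapping std::map for a hash-based container would trade ordered iteration for faster average lookup, which is the trade-off mentioned above.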
In addition to providing "transparent" access to members of .ZIP archives, ZPP provides a versioning feature which can be used to implement "patch" zip files to fix problems found after the release of a program without having to re-issue an entire .ZIP file.
These benefits are a compelling reason to use .ZIP archives instead of "rolling your own". Indeed, a couple of popular games (Starsiege: Tribes and Quake III) have recently moved to using .ZIP files (previously, the Quake series went the custom route with its ".PAK" files). The extensions are different (".PK3" in the case of Quake III, and ".VOL" in the case of Starsiege: Tribes), but the files are standard .ZIP files and can be manipulated with off-the-shelf tools. This can have a BIG impact on time to market for a game, because no tools have to be written to manipulate archives; well-developed and supported tools already exist in the marketplace and can be had for free (Info-ZIP) or for a small fee (PKZIP).
ZPP uses the free ZLIB library to handle its decompression and compression. It is compatible with files stored in .ZIP archives using the "STORE" method and the "DEFLATE" method.
zppZipArchive
zppZipFileInfo
zppZipReader
For interface to the rest of the world, objects of the following classes are created:
zppstreambuf (derived from streambuf)
izppstream (publicly derived from istream)
The zppZipArchive class encapsulates all of the information about a .ZIP archive file (XXXX.zip). It also supports global searching of files, as well as text attributes (see below), which are stored in the .zip file comment, parsed by the library, and accessed through member functions of the zppZipArchive class.
zppZipArchive objects contain a vector of zppZipFileInfo objects, each one representing the information required to uncompress a particular member of a .ZIP archive.
The zppZipReader class encapsulates the decompression state for a single file, and is the real worker behind the zppstreambuf class.
The zppstreambuf class is an iostream wrapper around the zppZipReader class. It supports all of the standard "istream" functions for reading and can be passed to any file parsing routine which uses istreams.
IMPORTANT NOTE: because the compressed file data is decompressed on the fly, zppstreambuf::seekg() will fail if the offset is not 0. A future enhancement to the library will allow seeking on a file which is not compressed (ZIP refers to this as the "stored" compression method), or allow an entire file to be decompressed to memory when its zppZipReader object is created.
The izppstream class wraps a zppstreambuf for formatted input, and is a standard "istream" type class.
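The rewind-only behaviour can be sketched with a small, self-contained stream buffer. This is not ZPP's code: ChunkReader is a hypothetical stand-in for a decompressor that produces data in order and can only restart from the beginning, and ChunkStreambuf shows one way a std::streambuf subclass might refill itself on demand and reject every seek except a seek to offset 0.

```cpp
#include <cstddef>
#include <cstring>
#include <istream>
#include <streambuf>
#include <string>

// Hypothetical stand-in for a decompressor: hands out data in order and
// can only restart from the beginning.
class ChunkReader {
public:
    explicit ChunkReader(const std::string& data) : data_(data), pos_(0) {}
    std::size_t read(char* out, std::size_t n) {
        std::size_t left = data_.size() - pos_;
        if (n > left) n = left;
        std::memcpy(out, data_.data() + pos_, n);
        pos_ += n;
        return n;
    }
    void rewind() { pos_ = 0; }
private:
    std::string data_;
    std::size_t pos_;
};

// Read-only stream buffer that pulls small chunks from the reader.
class ChunkStreambuf : public std::streambuf {
public:
    explicit ChunkStreambuf(ChunkReader& r) : reader_(r) {
        setg(buf_, buf_, buf_);  // empty get area: first read calls underflow()
    }
protected:
    int_type underflow() override {
        if (gptr() < egptr()) return traits_type::to_int_type(*gptr());
        std::size_t n = reader_.read(buf_, sizeof buf_);
        if (n == 0) return traits_type::eof();
        setg(buf_, buf_, buf_ + n);
        return traits_type::to_int_type(*gptr());
    }
    // Only an absolute seek to offset 0 succeeds; everything else fails.
    pos_type seekpos(pos_type pos, std::ios_base::openmode) override {
        if (pos != pos_type(0)) return pos_type(off_type(-1));
        reader_.rewind();
        setg(buf_, buf_, buf_);  // discard buffered data
        return pos_type(0);
    }
    pos_type seekoff(off_type off, std::ios_base::seekdir dir,
                     std::ios_base::openmode which) override {
        if (dir == std::ios_base::beg) return seekpos(pos_type(off), which);
        return pos_type(off_type(-1));
    }
private:
    ChunkReader& reader_;
    char buf_[8];  // deliberately tiny so refills happen often
};
```

Wrapping such a buffer in a std::istream gives formatted input for free, so any parsing routine written against std::istream& can read from it; a failed seek simply sets the stream's failbit.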
The unit of parsing is a line. The line-end character can be UNIX style (a single NL, ASCII 10), MAC style (a single CR, ASCII 13) or DOS/Windows style (CR, ASCII 13, followed by NL, ASCII 10). Continuation lines are not supported.
The first line of the file MUST be "%ZPP%" (without the quotes). If the ZPP library does not find this signal string, it will not parse the attribute file.
Each line of the attribute file defines a single key/data pair. Blank lines (lines consisting of only spaces and a line-terminator) are ignored. Comment lines, beginning with '#', are also ignored.
A Key/value line is of one of the following four forms:
KEY = DATA
KEY = "DATA"
KEY = 'DATA'
KEY = (DATA)
In all four forms, the KEY is made up of non-whitespace characters and is terminated by the first whitespace character. A whitespace character is ASCII 32 (space) or ASCII 9 (tab).
After the KEY, there can be optional whitespace. An EQUALS sign "=" separates the KEY from the DATA. After the "=", any whitespace will be skipped.
In the first form, the data associated with the key is all non-whitespace characters up to the end-of-line or first whitespace encountered. Note that the data cannot have embedded whitespace in it.
The final three forms are used when there is a need to embed whitespace in the data. The data starts after the quote or parenthesis character, and continues to a matching, closing quote or parenthesis character. The data string cannot contain the character used to bracket it, and there is no way to escape an embedded quote character, thus the three acceptable forms.
Example attributes:
ZPP_PRIORITY = 2
ProgramTitle = "Mike's Wonderful program"
Quote3 = ("Hey, bob's dog bit me!", said Joe.)
NOTE: Attribute names are CASE SENSITIVE.
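The parsing rules above can be sketched as a small stand-alone parser. This is not the parser used by ZPP; the names AttrMap, splitLines, parseLine and parseAttributes are hypothetical. Two choices are assumptions rather than documented behaviour: an "=" directly after the key also ends the key (so KEY=DATA works without spaces), and malformed lines are silently skipped.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::string> AttrMap;

// Split text into lines, accepting LF, CR, or CR+LF as the line end.
static std::vector<std::string> splitLines(const std::string& text) {
    std::vector<std::string> lines;
    std::string cur;
    for (std::size_t i = 0; i < text.size(); ++i) {
        char c = text[i];
        if (c == '\r' || c == '\n') {
            if (c == '\r' && i + 1 < text.size() && text[i + 1] == '\n') ++i;
            lines.push_back(cur);
            cur.clear();
        } else {
            cur += c;
        }
    }
    if (!cur.empty()) lines.push_back(cur);
    return lines;
}

static bool isBlank(char c) { return c == ' ' || c == '\t'; }

// Parse one KEY = DATA line. Returns false if the line is malformed.
static bool parseLine(const std::string& line, std::string& key, std::string& value) {
    std::size_t i = 0, n = line.size();
    // Assumption: '=' ends the key as well as whitespace.
    while (i < n && !isBlank(line[i]) && line[i] != '=') ++i;
    key = line.substr(0, i);
    if (key.empty()) return false;
    while (i < n && isBlank(line[i])) ++i;
    if (i >= n || line[i] != '=') return false;
    ++i;
    while (i < n && isBlank(line[i])) ++i;
    if (i >= n) { value.clear(); return true; }  // assumption: empty data is allowed
    char close = 0;
    if (line[i] == '"' || line[i] == '\'') close = line[i];
    else if (line[i] == '(') close = ')';
    if (close != 0) {
        // Bracketed form: data runs to the first matching closer, no escapes.
        std::size_t end = line.find(close, i + 1);
        if (end == std::string::npos) return false;
        value = line.substr(i + 1, end - i - 1);
    } else {
        // Bare form: data ends at the first whitespace or at end of line.
        std::size_t end = i;
        while (end < n && !isBlank(line[end])) ++end;
        value = line.substr(i, end - i);
    }
    return true;
}

// Returns false if the "%ZPP%" signature line is missing.
bool parseAttributes(const std::string& comment, AttrMap& out) {
    std::vector<std::string> lines = splitLines(comment);
    if (lines.empty() || lines[0] != "%ZPP%") return false;
    for (std::size_t i = 1; i < lines.size(); ++i) {
        const std::string& line = lines[i];
        if (line.find_first_not_of(" \t") == std::string::npos) continue;  // blank
        if (line[0] == '#') continue;                                      // comment
        std::string key, value;
        if (parseLine(line, key, value)) out[key] = value;
    }
    return true;
}
```

Because std::map compares keys byte for byte, lookups in the resulting map are case sensitive, which matches the note above.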
The zppZipArchive::findAttr() member function is used to locate attributes.
    // The constructor takes a string reference, so pass a named string.
    string fn("foo.zip");
    zppZipArchive myZip(fn);
    string z;
    z = myZip.findAttr("Quote3");
    if (z == "") {
        cout << "Quote3 attribute not found" << endl;
    } else {
        cout << "Quote3 = " << z << endl;
    }
The zppZipArchive::attrExists() member function can be used to determine if an attribute exists.
NOTE: Attributes beginning with the string "ZPP_" are reserved for use by the ZPP library. There are currently two such attributes used: ZPP_PRIORITY (see "Priorities", below), and ZPP_DIR_PREFIX (see "Directory Prefix", below).
When programs are distributed, the need frequently arises to replace files which contained errors in the original distribution. With all files packed inside archives, it can be difficult to patch a multi-megabyte file in place on the user's system, so the ZPP library can assign a priority to a ZIP archive. When ZIP archives are opened by the ZPP library, all contained files are added to a global map of files with an attached priority. The priority comes from a class static variable, which can be set before the zppZipArchive object is constructed, or from a ZPP_PRIORITY attribute (see Attributes, above) contained in the .ZIP file when it's opened. Files of higher priority replace those of lower priority in the global map of files.
NOTE: files not in a .ZIP archive on the hard disk have a higher priority than any .ZIP file -- they will be found first before the open .ZIP archives are searched.
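The replacement rule for the global map can be sketched as follows. GlobalEntry, GlobalFileMap and registerFile are hypothetical names, not ZPP's own, and the handling of equal priorities (the first archive registered wins) is an assumption, since the text above only says that higher priorities replace lower ones.

```cpp
#include <map>
#include <string>

// Hypothetical record: which archive currently provides a file, and at
// what priority.
struct GlobalEntry {
    std::string archive;
    int priority;
};

typedef std::map<std::string, GlobalEntry> GlobalFileMap;

// Register a file from an archive. An existing entry is replaced only when
// the newcomer has a strictly higher priority (tie handling is an assumption).
void registerFile(GlobalFileMap& files, const std::string& name,
                  const std::string& archive, int priority) {
    GlobalFileMap::iterator it = files.find(name);
    if (it == files.end()) {
        GlobalEntry e = { archive, priority };
        files[name] = e;
    } else if (priority > it->second.priority) {
        it->second.archive = archive;
        it->second.priority = priority;
    }
}
```

With this rule, a small patch archive shipped with a higher ZPP_PRIORITY overrides individual files from the original archive regardless of the order in which the two are opened.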
When a zppZipArchive object is created, a string can optionally be prepended to the names of all files contained within. This string is taken from the attribute "ZPP_DIR_PREFIX" contained in the zip comment.
This feature can be used to move all of the files contained in the zip file "down" in the file hierarchy without having to have a separate subdirectory for them, or storing that subdirectory name when the archive is built. For example, you might have a single zip file for each level of a game ("level1.zip", "level2.zip", etc.), where the files stored in each .ZIP file are of the form "images/shot.bmp", "map.bsp", etc. If both level1.zip and level2.zip contained the same file names, it might get confusing which file you're trying to access, so the level1.zip archive could contain a ZPP_DIR_PREFIX="Level1/" attribute. The files contained within level1.zip then become "Level1/images/shot.bmp" and "Level1/map.bsp".
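The effect of the prefix amounts to a simple string concatenation, sketched below. The function applyDirPrefix is hypothetical and only illustrates the renaming; how ZPP applies the prefix internally is not shown in this document.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Prepend an archive's directory prefix (possibly empty) to each member name.
// Hypothetical helper, not part of ZPP.
std::vector<std::string> applyDirPrefix(const std::string& prefix,
                                        const std::vector<std::string>& names) {
    std::vector<std::string> out;
    out.reserve(names.size());
    for (std::size_t i = 0; i < names.size(); ++i) {
        out.push_back(prefix + names[i]);
    }
    return out;
}
```

Note that the prefix is used verbatim, so the trailing slash in "Level1/" has to be part of the attribute value.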
To use the library in your own programs, put the generated library (zpp.lib) and its header files (zpp.h, zpplib.h, zreader.h, izstream.h) where your compiler can find them. Add #include "zpp.h" to your program, and you're off and running. See the example source code included in the archive.
A zppZipArchive object is associated with each .ZIP archive opened. When the zppZipArchive object is constructed, the name of the file to open is passed in to the constructor:
zppZipArchive(string &_fn, ios::openmode _mode = ios::in, bool _makeGlobal = true);
The _fn parameter is the name of the .ZIP file to open.
The _mode argument currently MUST be ios::in, as only reading of .ZIP files is supported. ios::bin is OR'ed into the file mode on platforms where it's needed to access files in an untranslated way.
The _makeGlobal argument, if true (the default), causes all files in the archive to be added to the global list of files (see Opening Files, below).
When the zppZipArchive object is constructed, some basic integrity checks are done on the .ZIP file (placement of the central directory, making sure that the archive is complete, not a part of a multi-part archive, as these are not supported), and then the table of contents (list of files) is read in and stored in a <vector> of zppZipFileInfo objects. Each file's name is also added to a <map> for quick lookup on opening.
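To give an idea of what such integrity checks can look like, here is a sketch based on the "end of central directory" record of the ZIP file format (signature bytes "PK", 5, 6, followed by disk numbers, entry counts, and the size and offset of the central directory, all little-endian). It is not ZPP's own code; the struct and function names are hypothetical, and the exact checks ZPP performs may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Fields of the ZIP end-of-central-directory record used by the checks below.
struct EndOfCentralDir {
    std::uint16_t diskNumber;        // number of this disk
    std::uint16_t centralDirDisk;    // disk where the central directory starts
    std::uint16_t entriesOnDisk;     // directory entries on this disk
    std::uint16_t totalEntries;      // directory entries in the whole archive
    std::uint32_t centralDirSize;    // size of the central directory in bytes
    std::uint32_t centralDirOffset;  // offset of the central directory
};

static std::uint16_t le16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

static std::uint32_t le32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Scan backwards from the end of the file for the record signature.
bool findEndOfCentralDir(const std::vector<unsigned char>& file, EndOfCentralDir& out) {
    const std::size_t kMinSize = 22;  // record size with an empty comment
    if (file.size() < kMinSize) return false;
    for (std::size_t pos = file.size() - kMinSize + 1; pos-- > 0; ) {
        const unsigned char* p = &file[pos];
        if (p[0] == 'P' && p[1] == 'K' && p[2] == 5 && p[3] == 6) {
            out.diskNumber = le16(p + 4);
            out.centralDirDisk = le16(p + 6);
            out.entriesOnDisk = le16(p + 8);
            out.totalEntries = le16(p + 10);
            out.centralDirSize = le32(p + 12);
            out.centralDirOffset = le32(p + 16);
            return true;
        }
    }
    return false;
}

// A complete, single-part archive keeps everything on disk 0 and its
// central directory lies entirely inside the file.
bool looksLikeCompleteArchive(const std::vector<unsigned char>& file) {
    EndOfCentralDir e;
    if (!findEndOfCentralDir(file, e)) return false;
    if (e.diskNumber != 0 || e.centralDirDisk != 0) return false;
    if (e.entriesOnDisk != e.totalEntries) return false;
    return static_cast<std::uint64_t>(e.centralDirOffset) + e.centralDirSize <= file.size();
}
```

The backwards scan is needed because the record is followed by the variable-length archive comment, which is also where the attributes described earlier are stored.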
In the case of a "global" .zip file (one constructed with _makeGlobal set to true), references to the files in the archive are inserted into a global map.
If present, the .ZIP archive's attributes are parsed and put in a map for future access.
To open up all archives in a particular directory (or matching a particular wildcard), use the class static zppZipArchive::openAll() function. This function relies on a function in util.cpp which enumerates a directory matching a wildcard. Since directory enumeration is inherently a non-portable operation, this file will need to be modified to support different operating systems.
Two kinds of iostream objects can be used to read a file: the zppstreambuf object is the underlying stream buffer object, derived from streambuf, and is used for raw file I/O. For formatted file I/O, an izppstream object can be constructed. Derived from istream, it can be used in nearly all places that an ifstream object can be used.
When opening files, there are two ways to decide which file to open: a file can be opened from a specific .ZIP archive (zppZipArchive object), or a file can be opened by searching the global list of files. Whether or not a file is opened from the global list of files depends on whether a zppZipArchive object is passed to the zppstreambuf or izppstream constructor.
Both the zppstreambuf and the izppstream objects support an "open and close" model as well as an open-at-construction-time model.
The underlying ZLIB library is very portable, and should port cleanly to any platform.
If you are interested in porting ZPP to another platform, please use the SourceForge forum interface to contact the developers so that we can coordinate efforts.
The original license from Michael Cuddy contained the following:
"This code is based on code from the ZLIB library, by Jean-Loup Gailly and Mark Adler. They wrote the real meat-and-potatoes code here, I just wrapped it up with some semantic sugar. See the links section for the official ZLib home page.
You may NOT use this code in any mission-critical, or life-support project; I wouldn't trust my code with my life, and neither should you."