Reading PNG Images (PNG: The Definitive Guide)

13.1. A libpng-Based, PNG-Reading Demo Program

In order to provide a concrete demonstration of how to use libpng to read PNG images, I have written a complete (albeit basic) PNG viewer in Standard C. It consists of two main source files: a platform-independent one that includes all of the PNG- and libpng-specific code (readpng.c), and a platform-dependent file that contains all of the user interface and display code. The idea is that the PNG code (the ``back end'') is generic enough that it can be dropped into almost any image-reading C program, whether a viewer, editor, converter, or something else; it is the part that is of primary interest to us. The platform-dependent code (``front end'') is functional--yes, it really works!--but it is not complete or robust enough to be considered a final product.

The back-end code was written for libpng version 1.0.3, but it should work with any 1.x release of the library. Later releases of libpng may add new interfaces, but the functions used here are expected to remain available more or less indefinitely, for backward compatibility. As for the front-end code, two versions are currently available: one for the X Window System (rpng-x.c; mainly for Unix systems, but also potentially VMS and OS/2), and one for Windows 95/98 and NT (rpng-win.c). I will avoid getting into the details of these as much as possible, but where it is unavoidable, I will either use excerpts that are common to both or else point out the differences between the two versions. Complete source listings for both flavors can be found at http://www.libpng.org/pub/png/pngbook.html.

The basic PNG reader has the following features: it is file-based, it reads and displays a single image and then quits, and it is concerned only with reading and decoding that image--it has nothing better to do and can afford to wait on file input/output (I/O) and other potentially slow but non-CPU-intensive tasks. In other words, its characteristics are typical of standalone image viewers, converters, and many image editors, but not of web browsers. Browsers usually read from a network, which is often extremely slow compared to disk access (for example, due to limited modem bandwidth or just congested Internet sites), and they are usually busy formatting text and decoding several images at the same time--they do have something better to do than to wait for the rest of the file to show up. I'll address these issues in Chapter 14, "Reading PNG Images Progressively", with the second demo program.

13.2. Preliminaries

Before we get into the heart of our basic demo program, I'll touch on a couple of mundane but nevertheless important details. The first is the libpng header file, png.h, which defines all of the libpng datatypes, declares all of the public function prototypes, and includes some useful macros. It must be included in any module that makes libpng function calls; in our case, we've segregated all of those in readpng.c, so that's the only place we need to include png.h:

#include "png.h"

Because png.h includes zlib.h, we need not include it explicitly, and most programs need not even worry about it, since there is rarely a need for the user's program to call zlib routines directly. But in our case we do want to make sure zlib.h is included somewhere. The reason for this is the second mundane detail: programs tend to be updated over time, and this often involves plugging in a newer version of a support library like libpng or zlib. When following up on a bug report, particularly with regard to software for which the source code is available (like the demo programs in this book), it is generally useful to know as much as possible about the version that exhibits the bug. In the presence of shared (dynamically linked) libraries, that's even more important. So as part of our demo program's usage screen--the poor man's version of an ``about box''--we call a very small routine in readpng.c that indicates not only the versions of libpng and zlib with which it was compiled, but also the versions it is currently using:

void readpng_version_info(void)
{
    fprintf(stderr, "   Compiled with libpng %s; using libpng %s.\n",
      PNG_LIBPNG_VER_STRING, png_libpng_ver);
    fprintf(stderr, "   Compiled with zlib %s; using zlib %s.\n",
      ZLIB_VERSION, zlib_version);
}

The uppercase values here are macros defined in the png.h and zlib.h header files; they indicate the compile-time versions. The lowercase variables are globals exported by the two libraries, so they give the versions actually in use at the time the program is run. Ideally, each pair of version numbers will match, but it is not unusual for the user, and sometimes even the programmer, to be caught by an unsuspected mismatch.

13.3. readpng_init()

The ``real'' code in the basic PNG reader begins when the image file is opened (in binary mode!) and its stream pointer is passed to our libpng-initialization routine, readpng_init(). readpng_init() also takes two pointers to long integers, which receive the width and height of the image:

int readpng_init(FILE *infile, long *pWidth, long *pHeight)

We can get away with using longs instead of unsigned longs because the PNG specification requires that image dimensions not exceed 2³¹ - 1.[99] readpng_init() returns a status value; zero will be used to indicate success, and various nonzero values will indicate different errors.

[99] Of course, an image with dimensions that big is likely to exhaust the real and virtual memory on most systems, but we won't worry about that here.

The first thing we do in readpng_init() is read the first 8 bytes of the file and make sure they match the PNG signature bytes; if they don't, there is no need to waste time setting up libpng, allocating memory and so forth. Ordinarily one would read a block of 512 bytes or more, but libpng does its own buffered reads and requires that no more than 8 bytes have been read before handing off control. So 8 bytes it is:

    uch sig[8];
  
    fread(sig, 1, 8, infile);
    if (!png_check_sig(sig, 8))
        return 1;   /* bad signature */

There are two things to note here. The first is the use of the uch typedef, which stands for unsigned char; we use it for brevity and will likewise employ ush and ulg for unsigned short and unsigned long, respectively.[100] The second is that png_check_sig() and its slightly more general sibling png_sig_cmp() are unique among libpng routines in that they require no reference to any structures, nor any knowledge of the state of the PNG stream.

[100] Other typedefs, such as uchar and u_char, are more common and recognizable, but these are sometimes also defined by system header files. Unlike a macro, a C typedef cannot be tested for at compile time, and a repeated or conflicting typedef definition is treated as an error by most compilers.
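To make the signature test concrete, here is a minimal sketch in plain C of what such a check amounts to: a byte-for-byte comparison against the fixed 8-byte PNG signature. The helper names signature_matches() and stream_looks_like_png() are hypothetical and are not part of libpng or of the demo program, which relies on png_check_sig() instead. Unlike the excerpt above, the sketch also checks the return value of fread(), so a file shorter than 8 bytes is rejected cleanly.

```c
#include <stdio.h>
#include <string.h>

/* The 8-byte PNG signature: 137 'P' 'N' 'G' CR LF 26 LF */
static const unsigned char png_signature[8] = {
    0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A
};

/* Hypothetical helper: returns 1 if buf holds at least 8 bytes and the
 * first 8 match the PNG signature, 0 otherwise. */
int signature_matches(const unsigned char *buf, size_t len)
{
    return len >= 8 && memcmp(buf, png_signature, 8) == 0;
}

/* Hypothetical helper: reads exactly 8 bytes from the stream and tests
 * them.  On success the stream is left positioned just past the
 * signature, which is the state libpng expects after
 * png_set_sig_bytes(png_ptr, 8). */
int stream_looks_like_png(FILE *infile)
{
    unsigned char sig[8];
    size_t got = fread(sig, 1, sizeof sig, infile);

    return signature_matches(sig, got);
}
```

Because the comparison needs nothing but the bytes themselves, it can run before any libpng structs exist, which is exactly why the demo performs it first.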

Assuming the file checked out with a proper PNG signature, the next thing to do is set up the PNG structs that will hold all of the basic information associated with the PNG image. The use of two or three structs instead of one is historical baggage; a future, incompatible version of the library is likely to hide one or both from the user and perhaps instead employ an image ID tag to keep track of multiple images. But for now two are necessary:

    png_ptr = png_create_read_struct(PNG_LIBPNG_VER_STRING, NULL, NULL,
      NULL);
    if (!png_ptr)
        return 4;   /* out of memory */
  
    info_ptr = png_create_info_struct(png_ptr);
    if (!info_ptr) {
        png_destroy_read_struct(&png_ptr, NULL, NULL);
        return 4;   /* out of memory */
    }

The struct at which png_ptr points is used internally by libpng to keep track of the current state of the PNG image at any given moment; info_ptr is used to indicate what its state will be after all of the user-requested transformations are performed. One can also allocate a second information struct, usually referenced via an end_ptr variable; this can be used to hold all of the PNG chunk information that comes after the image data, in case it is important to keep pre- and post-IDAT information separate (as in an image editor, which should preserve as much of the existing PNG structure as possible). For this application, we don't care where the chunk information comes from, so we will forgo the end_ptr information struct and direct everything to info_ptr.

One or both of png_ptr and info_ptr are used in all remaining libpng calls, so we have simply declared them global in this case:

static png_structp png_ptr;
static png_infop info_ptr;

Global variables don't work in reentrant programs, where the same routines may get called in parallel to handle different images, but this demo program is explicitly designed to handle only one image at a time.

The Dark Side

Let's take a brief break in order to make a couple of points about programming practices, mostly bad ones. The first is that old versions of libpng (pre-1.0) required one to allocate memory for the two structs manually, via malloc() or a similar function. This is strongly discouraged now. The reason is that libpng continues to evolve, and in an environment with shared or dynamically linked libraries (DLLs), a program that was compiled with an older version of libpng may suddenly find itself using a new version with larger or smaller structs. The png_create_XXXX_struct() functions allow the version of the library that is actually being used to allocate the proper structs for itself, avoiding many problems down the road.

Similarly, old versions of libpng encouraged or even required the user to access members of the structs directly--for example, the image height might be available as info_ptr->height or png_ptr->height or even (as in this case) both! This was bad, not only because similar struct members sometimes had different values that could change at different times, but also because any program that is compiled to use such an approach effectively assumes that the same struct member is always at the same offset from the beginning of the struct. This is not a serious problem if the libpng routines are statically linked, although there is some danger that things will break if the program is later recompiled with a newer version of libpng. But even if libpng itself never changes the definition of the struct's contents, a user who compiles a new DLL version with slightly different compilation parameters--for example, with structure-packing turned on--may have suddenly shifted things around so they appear at new offsets. libpng can also be compiled with certain features disabled, which in turn eliminates the corresponding structure members from the definition of the structs and therefore alters the offsets of any later structure members. And I already mentioned that libpng is evolving: new things get added to the structs periodically, and perhaps an existing structure member is found to have been defined with an incorrect size, which is then corrected. The upshot is that direct access to struct members is very, very bad. Don't do it. Don't let your friends do it. We certainly won't be doing it here.

The pointers are now set up and pointing at allocated structs of the proper sizes--or else we've returned to the main program with an error. The next step is to set up a small amount of generic error-handling code. Instead of depending on error codes returned from each of its component functions, libpng employs a more efficient but considerably uglier approach involving the setjmp() and longjmp() functions. Declared in the standard C header file setjmp.h (which is included by pngconf.h, itself included by png.h), these routines effectively amount to a giant goto statement that can cross function boundaries. This avoids a lot of conditional testing (if (error) return error;), but it can make the program flow harder to understand in the case of errors. Nevertheless, that's what libpng uses by default, so that's what we use here:

    if (setjmp(png_ptr->jmpbuf)) {
        png_destroy_read_struct(&png_ptr, &info_ptr, NULL);
        return 2;
    }

The way to read this code fragment is as follows: the first time through, the setjmp() call saves the state of the program (registers, stack, and so on) in png_ptr->jmpbuf and returns successfully--that is, with a return value of zero--thus avoiding the contents of the if-block. But if an error later occurs and libpng invokes longjmp() on the same copy of png_ptr->jmpbuf, control suddenly returns to the if-block as if setjmp() had just returned, but this time with a nonzero return value. The if-test then evaluates to TRUE, so the PNG structs are destroyed and we return to the main program.
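The control flow just described can be tried in isolation, without libpng. The following is a minimal sketch in Standard C; the names error_jmpbuf, decode_step(), and run_decoder() are hypothetical stand-ins for png_ptr->jmpbuf, a libpng routine that hits a fatal error, and one of our readpng functions, respectively.

```c
#include <setjmp.h>

/* Stands in for png_ptr->jmpbuf in this sketch */
static jmp_buf error_jmpbuf;

/* Hypothetical stand-in for a library routine deep in the call chain.
 * On a fatal error it jumps back to the saved setjmp() point instead
 * of returning an error code. */
void decode_step(int should_fail)
{
    if (should_fail)
        longjmp(error_jmpbuf, 1);   /* does not return */
}

/* Returns 0 on success, or 2 if decode_step() bailed out via longjmp() */
int run_decoder(int should_fail)
{
    if (setjmp(error_jmpbuf)) {
        /* We arrive here only via longjmp(): setjmp() appears to
         * return a second time, now with a nonzero value.  This is
         * where cleanup would go. */
        return 2;
    }

    /* First pass: setjmp() returned 0, so the if-block was skipped */
    decode_step(should_fail);
    return 0;
}
```

Calling run_decoder(0) follows the normal path and returns 0; calling run_decoder(1) takes the longjmp() path and returns 2, without decode_step() ever returning to its caller.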

But wait! Didn't I just finish lecturing about the evils of direct access to structure members? Yet here I am, referring to the jmpbuf member of the main PNG struct. The reason is that there is simply no other way to get a pointer to the longjmp buffer in any release of libpng through version 1.0.3. And, sadly, there may not be any clean and backward-compatible way to work around this limitation in future releases, either. The unfortunate fact is that the ANSI committee responsible for defining the C language and standard C library managed to standardize jmp_buf in such a way that one cannot reliably pass pointers to it, nor can one be certain that its size is constant even on a single system. In particular, if a certain macro is defined when libpng is compiled but not for a libpng-using application, then jmp_buf may have different sizes when the application calls setjmp() and when libpng calls longjmp(). The resulting inconsistency is more likely than not to cause the application to crash.

The solution, which is already possible with current libpng releases and will probably be required as of some future version, is to install a custom error handler. This is simply a user function that libpng calls instead of its own longjmp()-based error handler whenever an error is encountered; like longjmp(), it is not expected to return. But there is no problem at all if the custom error handler itself calls longjmp(): since this happens within the application's own code space, its concept of jmp_buf is completely consistent with that of the code that calls setjmp() elsewhere in the application. Indeed, there is no longer any need to use the jmpbuf element of the main libpng struct with this approach--the application can maintain its own jmp_buf. I will demonstrate this safer approach in Chapter 14, "Reading PNG Images Progressively".
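As a rough illustration of the idea, and not the code used in the next chapter, the sketch below keeps the jmp_buf inside an application-owned struct and has the error callback jump to it. Everything here is a hypothetical stand-in: fake_library_read() plays the role of a library routine, and the callback signature is only loosely modeled on a libpng error function. With libpng itself, the handler and a pointer to the application struct would be supplied through the error-pointer and error-function arguments of png_create_read_struct(), which were passed as NULL in the earlier call.

```c
#include <setjmp.h>
#include <stddef.h>

/* Application-owned error context.  The jmp_buf is declared, saved,
 * and jumped to entirely within application code. */
typedef struct {
    jmp_buf jmpbuf;
    const char *last_error;
} app_error_ctx;

/* Hypothetical callback type; the callback must not return */
typedef void (*error_fn)(void *ctx, const char *msg);

/* Hypothetical stand-in for a library routine that reports fatal
 * errors by invoking the caller-supplied handler. */
void fake_library_read(int corrupt, error_fn on_error, void *ctx)
{
    if (corrupt)
        on_error(ctx, "corrupt chunk");
}

/* The custom handler: record the message, then jump back into the
 * application using the application's own jmp_buf. */
void app_error_handler(void *ctx, const char *msg)
{
    app_error_ctx *app = (app_error_ctx *)ctx;

    app->last_error = msg;
    longjmp(app->jmpbuf, 1);
}

/* Returns 0 on success, 2 if the library reported an error */
int app_read_image(int corrupt, app_error_ctx *app)
{
    app->last_error = NULL;
    if (setjmp(app->jmpbuf))
        return 2;               /* reached via app_error_handler() */

    fake_library_read(corrupt, app_error_handler, app);
    return 0;
}
```

Since both setjmp() and longjmp() are compiled as part of the application, they necessarily agree on the size and layout of jmp_buf, whatever macros the library was built with.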

Note the use of png_destroy_read_struct() to let libpng free any memory associated with the PNG structs. We used it earlier, too, for cases in which creating the info struct failed; then we passed png_ptr and two NULLs. Here we pass png_ptr, info_ptr and one NULL. Had we allocated the second info struct (end_ptr), the third argument would point at it, or, more precisely, at its pointer, so that end_ptr itself could be set to NULL after the struct is freed.

Having gotten all of the petty housekeeping details out of the way, we next set up libpng so it can read the PNG file, and then we begin doing so:

    png_init_io(png_ptr, infile);
    png_set_sig_bytes(png_ptr, 8);
    png_read_info(png_ptr, info_ptr);

The png_init_io() function takes our file stream pointer (infile) and stores it in the png_ptr struct for later use. png_set_sig_bytes() lets libpng know that we already checked the 8 signature bytes, so it should not expect to find them at the current file pointer location.

png_read_info() is the first libpng call we've seen that does any real work. It reads and processes not only the PNG file's IHDR chunk but also any other chunks up to the first IDAT (i.e., everything before the image data). For colormapped images this includes the PLTE chunk and possibly tRNS and bKGD chunks. It typically also includes a gAMA chunk; perhaps cHRM, sRGB, or iCCP; and often tIME and some tEXt chunks. All of this information is stored in the information struct, and some of it in the PNG struct as well, but for now all we care about is the contents of IHDR--specifically, the image width and height:

    png_get_IHDR(png_ptr, info_ptr, &width, &height, &bit_depth,
      &color_type, NULL, NULL, NULL);
    *pWidth = width;
    *pHeight = height;
  
    return 0;

Once again, since this is a single-image program, I've been lazy and used global variables not only for the image dimensions but also for the image's bit depth (bits per sample--R, G, B, A, or gray--or per palette index, not per pixel) and color type. The image dimensions are also passed back to the main program via the last two arguments of readpng_init(). The other two variables will be used later. If we were interested in whether the image is interlaced or what compression and filtering methods it uses, we would use actual values instead of NULLs for the last three arguments to png_get_IHDR(). Note that the PNG 1.0 and 1.1 specifications define only a single allowed value (0) for either the compression type or the filtering method. In this context, compression type 0 is the deflate method with a maximum window size of 32 KB, and filtering method 0 is PNG's per-row adaptive method with five possible filter types. See Chapter 9, "Compression and Filtering", for details.

That wraps up our readpng_init() function. Back in the main program, various things relating to the windowing system are initialized, but before the display window itself is created, we potentially make one more readpng call to see if the image includes its own background color. In fact, this function could have been incorporated into readpng_init(), particularly if all program parameters used by the back-end readpng functions and the front-end display routines were passed via an application-specific struct, but we didn't happen to set things up that way. Also, note that this second readpng call is unnecessary if the user has already specified a particular background color to be used. In this program, a simple command-line argument is used, but a more sophisticated application might employ a graphical color wheel, RGB sliders, or some other color-choosing representation.

13.4. readpng_get_bgcolor()

In any case, assuming the user did not specify a background color, we call readpng_get_bgcolor() to check the PNG file for one. It takes as arguments pointers to three unsigned character values:

int readpng_get_bgcolor(uch *red, uch *green, uch *blue)

As before, we start with a setjmp() block to handle libpng errors, then check whether the PNG file had a bKGD chunk:

    if (!png_get_valid(png_ptr, info_ptr, PNG_INFO_bKGD))
        return 1;

Assuming the png_get_valid() call returned a nonzero value, we next have libpng give us a pointer to a small struct containing the bKGD color information:

    png_color_16p pBackground;
 
    png_get_bKGD(png_ptr, info_ptr, &pBackground);

(pBackground was defined at the top of the function.) pBackground now points at a png_color_16 struct, which is defined as follows:

typedef struct png_color_16_struct
{
    png_byte index;
    png_uint_16 red;
    png_uint_16 green;
    png_uint_16 blue;
    png_uint_16 gray;
} png_color_16;

As suggested by the struct members' names, not all of them are valid with all PNG image types. The first member, index, is only valid with palette-based images, for example, and gray is only valid with grayscale images. But it is one of libpng's handy little features (presently undocumented) that the red, green, and blue struct members are always valid, and those happen to be precisely the values we want.

The other thing to note, however, is that the elements we need are defined as png_uint_16, i.e., as 16-bit (or larger) unsigned integers. That suggests that the color values we get back may depend on the bit depth of the image, which is indeed the case. In fact, this is true regardless of whether the calling program requested libpng to convert 16-bit values or 1-, 2-, and 4-bit values to 8-bit; this is another currently undocumented tidbit. We'll be feeding all of these little gotchas back to the libpng maintainer, however, so one can assume that the documentation will be slightly more complete by the time this book is published.

Since we'll be dealing only with 8-bit samples in this program, and, in particular, since the arguments to readpng_get_bgcolor() are pointers to unsigned (8-bit) characters, we need to shift the high-order bits down in the case of 16-bit data or expand them in the case of low-bit-depth values (only possible with grayscale images). And either way, we need to pass the values back to the main program. Thus:

    if (bit_depth == 16) {
        *red   = pBackground->red   >> 8;
        *green = pBackground->green >> 8;
        *blue  = pBackground->blue  >> 8;
    } else if (color_type == PNG_COLOR_TYPE_GRAY && bit_depth < 8) {
        if (bit_depth == 1)
            *red = *green = *blue = pBackground->gray? 255 : 0;
        else if (bit_depth == 2)   /* i.e., max value is 3 */
            *red = *green = *blue = (255/3) * pBackground->gray;
        else /* bit_depth == 4 */  /* i.e., max value is 15 */
            *red = *green = *blue = (255/15) * pBackground->gray;
    } else {
        *red   = pBackground->red;
        *green = pBackground->green;
        *blue  = pBackground->blue;
    }
  
    return 0;
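
The same scaling rules can be folded into a single helper, shown here as a sketch rather than as part of readpng.c. The function name scale_sample_to_8bit() is hypothetical. It assumes the caller passes a value that is already within range for the stated bit depth, as the code above does, and it relies on the fact that 255/3 = 85 and 255/15 = 17 divide exactly, so the maximum 2-bit and 4-bit values map to exactly 255.

```c
/* Hypothetical helper: convert one sample of the given PNG bit depth
 * (1, 2, 4, 8, or 16) to an 8-bit value.  Any other depth yields 0. */
unsigned char scale_sample_to_8bit(unsigned int value, int bit_depth)
{
    switch (bit_depth) {
    case 16:
        return (unsigned char)(value >> 8);    /* keep the high byte */
    case 8:
        return (unsigned char)value;           /* already 8 bits */
    case 4:
        return (unsigned char)(value * 17);    /* 255/15 == 17 */
    case 2:
        return (unsigned char)(value * 85);    /* 255/3 == 85 */
    case 1:
        return (unsigned char)(value ? 255 : 0);
    default:
        return 0;                              /* not a valid PNG depth */
    }
}
```

For example, the 2-bit gray levels 0, 1, 2, and 3 become 0, 85, 170, and 255, and a 16-bit sample of 0x1234 becomes 0x12.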

With that, the main program now has enough information to create an image window of the proper size and fill it with the background color, which it does. The top row of Figure C-5 in the color insert shows the two cases: the middle image is displayed with the background color specified in the PNG file itself, while the image on the right is shown with a user-specified background color.

The main program next calls the heart of the readpng code: readpng_get_image(), which sets the desired libpng transformations, allocates a PNG image buffer, decodes the image, and returns a pointer to the raw data. Before we look at that in detail, we should first discuss some of the design decisions that led to it.

13.5. Design Decisions

We decided at the outset that we didn't want to deal with a lot of PNG bit depths; we have plenty of that in the front-end code (at least for the X version...sigh). Being fond of alpha transparency and the nice effects it can produce, we did want to retain full transparency information, however. In both cases, we were willing to sacrifice a minimal memory footprint in favor of simplicity and, to some extent, speed. Thus, we chose to expand or reduce all PNG image types to 24-bit RGB, optionally with a full 8-bit alpha channel. In other words, the output would always be either three channels (RGB) or four channels (RGBA).

Handling all alpha blending on our own, in the front end, is not strictly necessary. In the case of a flat background color, which is all I've discussed so far, libpng can be instructed to blend the background color (either from the PNG file or as supplied by the user) with the foreground pixels, thereby eliminating the alpha channel; the relevant function is png_set_background(). The result would have been a single output format to deal with: three-channel, 24-bit RGB. But we had in mind from the outset the possibility of loading or generating a complete background image, not just a background color, and libpng currently has no provision for blending two images.
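For readers who have not written a blend by hand, the per-sample arithmetic involved is short. The sketch below is the standard ``over'' compositing formula for 8-bit samples with rounding to nearest; it is an illustration only, not the front-end code of the demo program, and the function name composite_sample() is hypothetical.

```c
/* Blend one 8-bit foreground sample over one 8-bit background sample.
 * alpha == 0 means fully transparent (result is the background);
 * alpha == 255 means fully opaque (result is the foreground). */
unsigned char composite_sample(unsigned char fg, unsigned char bg,
                               unsigned char alpha)
{
    /* Weighted sum; the maximum possible value is 255 * 255 = 65025 */
    unsigned int sum = (unsigned int)fg * alpha
                     + (unsigned int)bg * (255u - alpha);

    /* Divide by 255, rounding to the nearest integer */
    return (unsigned char)((sum + 127u) / 255u);
}
```

An RGBA pixel is composited by applying this to the red, green, and blue samples in turn, each with the pixel's single alpha value. Whether the background sample comes from a flat color or from the corresponding pixel of a background image makes no difference to the arithmetic.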

13.6. Gamma and Color Correction

Since this routine is also where any gamma and color correction (recall Chapter 10, "Gamma Correction and Precision Color") would take place, we should step back a moment and look at how the main program deals with that. First I have a confession: I did not attempt any color correction. (Truly, I am scum.) But this does not excuse you, the reader, from supporting it, at least in higher-end applications! The X Window System's base library, Xlib, has included the X Color Management System since X11R5; it is accessed via the Xcms functions, an extensive API supporting everything from color-space conversion to gamut compression. Apple supports the ColorSync system on the Macintosh and will be releasing a version for Windows. And Microsoft, if not already supporting the sRGB color space natively in recent releases of Windows, certainly can be assumed to do so in coming releases; they and Hewlett-Packard collaborated on the original sRGB proposal.

But where color correction can be a little tricky, gamma correction is quite straightforward. All one needs is the ``gamma'' value (exponent) of the user's display system and that of the PNG file itself. If the PNG file does not include a gAMA or sRGB chunk, there is little to be done except perhaps ask the user for a best-guess value; a PNG decoder is likely to do more harm than good if it attempts to guess on its own. In that case we simply forgo any attempt at gamma correction. But on the assumption that most PNG files will be well behaved and include gamma information, we included the following code at the beginning of the main program:

    double LUT_exponent;
    double CRT_exponent = 2.2;
    double default_display_exponent;
  
#if defined(NeXT)
    LUT_exponent = 1.0 / 2.2;
    /*
    if (some_next_function_that_returns_gamma(&next_gamma))
        LUT_exponent = 1.0 / next_gamma;
     */
#elif defined(sgi)
    LUT_exponent = 1.0 / 1.7;
    /* there doesn't seem to be any documented function to
     * get the "gamma" value, so we do it the hard way */
    infile = fopen("/etc/config/system.glGammaVal", "r");
    if (infile) {
        double sgi_gamma;
  
        fgets(fooline, 80, infile);
        fclose(infile);
        sgi_gamma = atof(fooline);
        if (sgi_gamma > 0.0)
            LUT_exponent = 1.0 / sgi_gamma;
    }
#elif defined(Macintosh)
    LUT_exponent = 1.8 / 2.61;
    /*
    if (some_mac_function_that_returns_gamma(&mac_gamma))
        LUT_exponent = mac_gamma / 2.61;
     */
#else
    LUT_exponent = 1.0;   /* assume no LUT:  most PCs */
#endif
  
    default_display_exponent = LUT_exponent * CRT_exponent;

The goal here is to make a reasonably well informed guess as to the overall display system's exponent (``gamma''), which, as you'll recall from Chapter 10, "Gamma Correction and Precision Color", is the product of the lookup table's exponent and that of the monitor. Essentially all monitors have an exponent of 2.2, so I've assumed that throughout. And almost all PCs and many workstations forego the lookup table (LUT), effectively giving them a LUT exponent of 1.0; the result is that their overall display-system exponent is 2.2. This is reflected by the last line in the ifdef block.

A few well-known systems have LUT exponents quite different from 1.0. The most extreme of these is the NeXT cube (and subsequent noncubic models), which has a lookup table with a 1/2.2 exponent, resulting in an overall exponent of 1.0 (i.e., it has a ``linear transfer function''). Although some third-party utilities can modify the lookup table (with a ``gamma'' value whose inverse is the LUT exponent, as on SGI systems), there appears to be no system facility to do so and no portable method of determining what value a third-party panel might have loaded. So we assume an overall exponent of 1.0 (that is, a LUT exponent of 1/2.2) whenever the NeXT-specific macro NeXT is defined.

Silicon Graphics workstations and Macintoshes also have nonidentity lookup tables, but in both cases the LUT exponent can be varied by system utilities. Unfortunately, in both cases the value is varied via a parameter called ``gamma'' that matches neither the LUT exponent nor the other system's usage. On SGI machines, the ``gamma'' value is the inverse of the LUT exponent (as on the NeXT) and can be obtained either via a command (gamma) or from a system configuration file (/etc/config/system.glGammaVal); there is no documented method to retrieve the value directly via a system function call. Here we have used the file-based method. If we read it successfully, the overall system exponent is calculated accordingly; if not, we assume the default value used on factory-shipped SGI systems: ``gamma'' of 1.7, which implies a display-system exponent of 2.2/1.7, or 1.3. Note, however, that what is being determined is the exponent of the console attached to the system running the program, not necessarily that of the actual display. That is, X programs can display on remote systems, and the exponent of the remote display system might be anything. One could attempt to determine whether the display is local by checking the DISPLAY environment variable, but to do so correctly could involve several system calls (uname(), gethostbyname(), etc.) and is beyond the scope of this demo program. A user-level work-around is to set the SCREEN_GAMMA variable appropriately; I'll describe that in just a moment.

The Macintosh ``gamma'' value is proportional to the LUT exponent, but it is multiplied by an additional constant factor of 2.61. The default gamma is 1.8, leading to an overall exponent of (1.8/2.61) × 2.2, or 1.5. Since neither of the two front ends (X or Windows) is designed to work on a Mac, the code inside the Macintosh ifdef block (and the Macintosh macro itself) is intended for illustration only, not as a serious example of ready-to-compile code. Indeed, a standard component of Mac OS 8.5 is Apple's ColorSync color management system (also available as an add-on for earlier systems), which is the recommended way to handle both gamma and color correction on Macs.
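The arithmetic for the platforms discussed above can be collected into three small functions. This is only a sketch of the calculations in the text; the function names are hypothetical, and the CRT exponent of 2.2 is the same assumption made in the demo program.

```c
/* Monitor (CRT) exponent assumed throughout, as in the demo program */
#define CRT_EXPONENT 2.2

/* Generic case: the overall display-system exponent is the product of
 * the lookup-table exponent and the CRT exponent.  A system with no
 * LUT has a LUT exponent of 1.0. */
double display_exponent_from_lut(double lut_exponent)
{
    return lut_exponent * CRT_EXPONENT;
}

/* SGI (and NeXT third-party panels): the system "gamma" setting is
 * the inverse of the LUT exponent. */
double display_exponent_from_sgi_gamma(double sgi_gamma)
{
    return display_exponent_from_lut(1.0 / sgi_gamma);
}

/* Macintosh: the system "gamma" setting is the LUT exponent
 * multiplied by 2.61. */
double display_exponent_from_mac_gamma(double mac_gamma)
{
    return display_exponent_from_lut(mac_gamma / 2.61);
}
```

With the factory defaults, display_exponent_from_sgi_gamma(1.7) comes out near 1.29 and display_exponent_from_mac_gamma(1.8) near 1.52, which are the values rounded to 1.3 and 1.5 in the text. A LUT exponent of 1.0 gives 2.2, and the NeXT's 1/2.2 gives 1.0.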

It is entirely possible that the user has calibrated the display system more precisely than is reflected in the preceding code, or perhaps has a system unlike any of the ones we have described. The main program also gives the user the option of specifying the display system's exponent directly, either with an environment variable (SCREEN_GAMMA is suggested by the libpng documentation) or by direct input. For the latter, we have once again resorted to the simple expedient of a command-line option, but a more elegant program might pop up a dialog box of some sort, or even provide a calibration screen. In any case, our main program first checks for the environment variable:

    if ((p = getenv("SCREEN_GAMMA")) != NULL)
        display_exponent = atof(p);
    else
        display_exponent = default_display_exponent;

If the variable is found, it is used; otherwise, the previously calculated default exponent is used. Then the program processes the command-line options and, if the -gamma option is found, its argument replaces all previously obtained values.
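The order of precedence (compile-time default, then environment variable, then command-line option) can be sketched as a single function. Note that this is a stricter variant than the demo code: where the demo uses atof() and accepts whatever it returns, the sketch uses strtod() and ignores any value that is not a complete, positive number. Both function names are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: parse text as a positive exponent.  Returns 1
 * and stores the value in *out on success; returns 0 (leaving *out
 * untouched) if text is NULL, empty, malformed, or not positive. */
int parse_exponent(const char *text, double *out)
{
    char *end;
    double value;

    if (text == NULL || *text == '\0')
        return 0;
    value = strtod(text, &end);
    if (end == text || *end != '\0' || !(value > 0.0))
        return 0;
    *out = value;
    return 1;
}

/* Later sources override earlier ones: the default is replaced by a
 * valid environment value, which is in turn replaced by a valid
 * command-line value.  Pass NULL for a source that is absent. */
double choose_display_exponent(double default_exponent,
                               const char *env_value,
                               const char *cli_value)
{
    double chosen = default_exponent;
    double parsed;

    if (parse_exponent(env_value, &parsed))
        chosen = parsed;
    if (parse_exponent(cli_value, &parsed))
        chosen = parsed;
    return chosen;
}
```

In the main program this might be called as choose_display_exponent(default_display_exponent, getenv("SCREEN_GAMMA"), gamma_arg), where gamma_arg is a hypothetical variable holding the argument of the -gamma option, or NULL if the option was not given.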

That turned out to be a moderately lengthy explanation of the demo program's approach to gamma correction (or, more specifically, to finding the correct value for the display system's exponent), mostly because of all the different ways the value can be found: system-specific educated guesses at the time of compilation, system-specific files or API calls at runtime, an environment variable, or direct user input. The actual code is only about 20 lines long.

13.7. readpng_get_image()

Once the display-system exponent is found, it is passed to the readpng code as the first argument to readpng_get_image():

uch *readpng_get_image(double display_exponent, int *pChannels,
                       ulg *pRowbytes)

As with the previous two readpng routines, readpng_get_image() first installs the libpng error-handler code (setjmp()). It then sets up all of the transformations that correspond to the design decisions described earlier, starting with these three:

    if (color_type == PNG_COLOR_TYPE_PALETTE)
        png_set_expand(png_ptr);
    if (color_type == PNG_COLOR_TYPE_GRAY && bit_depth < 8)
        png_set_expand(png_ptr);
    if (png_get_valid(png_ptr, info_ptr, PNG_INFO_tRNS))
        png_set_expand(png_ptr);

The astute reader will have noticed something odd in the first block: the same function, png_set_expand(), is called several times, in different contexts but with identical arguments. Indeed, this is perhaps the single most confusing issue in all versions of libpng up through 1.0.3. In the first case, png_set_expand() is used to set a flag that will force palette images to be expanded to 24-bit RGB. In the second case, it indicates that low-bit-depth grayscale images are to be expanded to 8 bits. And in the third case, the function is used to expand any tRNS chunk data into a full alpha channel. Note that the third case can apply to either of the first two, as well. That is, either a palette image or a grayscale image may have a transparency chunk; in each case, png_set_expand() would be called twice in succession, for different purposes (though with the same effect--the function merely sets a flag, independent of context). A less confusing approach would be to create separate functions for each purpose:

    /* These functions are FICTITIOUS!  They DO NOT EXIST in any
     * version of libpng to date (through 1.0.3). */

    if (color_type == PNG_COLOR_TYPE_PALETTE)
        png_set_palette_to_rgb(png_ptr);
    if (color_type == PNG_COLOR_TYPE_GRAY && bit_depth < 8)
        png_set_gray_1_2_4_to_8(png_ptr);
    if (png_get_valid(png_ptr, info_ptr, PNG_INFO_tRNS))
        png_set_tRNS_to_alpha(png_ptr);

With luck, these functions will be accepted for libpng version 1.0.4 (and later).

Getting back to the real code, the next pair of transformations involves calls to two new functions, one to reduce images with 16-bit samples (e.g., 48-bit RGB) to 8 bits per sample and one to expand grayscale images to RGB. Fortunately these are appropriately named:

    if (bit_depth == 16)
        png_set_strip_16(png_ptr);
    if (color_type == PNG_COLOR_TYPE_GRAY ||
        color_type == PNG_COLOR_TYPE_GRAY_ALPHA)
        png_set_gray_to_rgb(png_ptr);
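
To make the combined effect of the transformations registered so far concrete, here is a minimal sketch that predicts the pixel layout the decoder will end up with. The enum and the helpers expected_channels() and expected_rowbytes() are hypothetical names invented for this illustration; they are not part of libpng or readpng.c. The numeric color-type codes are those of the PNG specification. The real program asks libpng for these values rather than computing them by hand.

```c
/* Hypothetical names for the PNG specification's color-type codes.
 * These are illustrative only; they are NOT the libpng macros. */
enum demo_color_type {
    DEMO_GRAY       = 0,
    DEMO_RGB        = 2,
    DEMO_PALETTE    = 3,
    DEMO_GRAY_ALPHA = 4,
    DEMO_RGB_ALPHA  = 6
};

/* Predict the channels per pixel after palette->RGB, gray->RGB and
 * tRNS->alpha expansion:  always RGB, plus alpha if the image had an
 * alpha channel or a transparency chunk. */
static int expected_channels(int color_type, int has_trns)
{
    int has_alpha = (color_type == DEMO_GRAY_ALPHA ||
                     color_type == DEMO_RGB_ALPHA || has_trns);

    return has_alpha ? 4 : 3;
}

/* Every sample is 8 bits deep after the transformations, so one
 * channel occupies exactly one byte in each row. */
static unsigned long expected_rowbytes(unsigned long width,
                                       int color_type, int has_trns)
{
    return width * (unsigned long)expected_channels(color_type, has_trns);
}
```

Under these assumptions, a palette image without transparency comes out as 3-channel RGB, while a grayscale image with a tRNS chunk comes out as 4-channel RGBA.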

The final transformation sets up the gamma-correction code, but only if the file contains gamma information itself:

    double  gamma;
  
    if (png_get_gAMA(png_ptr, info_ptr, &gamma))
        png_set_gamma(png_ptr, display_exponent, gamma);

Once again, the declaration of gamma is included here for context; it actually occurs at the beginning of the function. The conditional approach toward gamma correction rests on the assumption that guessing incorrectly is more harmful than doing no correction at all; alternatively, the user could be queried for a best-guess value. This approach was chosen because a simple viewer such as the one we describe here is probably more likely to be used for images created on the local system than for images coming from other systems, for which a web browser might be the usual viewer.

An alternate approach, espoused by drafts of the sRGB specification, is to assume that all unlabeled images exist in the sRGB space, which effectively gives them gamma values of 0.45455. On a PC-like system with no lookup table, the two approaches amount to the same thing: multiply the image's gamma of 0.45455 by the display-system exponent of 2.2, and you get an overall exponent of 1.0--i.e., no correction is necessary. But on a Macintosh, SGI, or NeXT system, the sRGB recommendation would result in additional processing that would tend to darken images. This would effectively favor images created on PCs over (unlabeled) images created on the local system. The upshot is that one is making assumptions either way; which approach is more acceptable is likely to be a matter of personal taste. Note that the PNG 1.1 Specification recommends that the viewer ``choose a likely default gamma value, but allow the user to select a new one if the result proves too dark or too light.''
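
The arithmetic behind the sRGB comparison can be sketched in a few lines. The helper names overall_exponent() and needs_gamma_correction() and the tolerance of 0.05 are assumptions made for this illustration, not part of readpng.c; the display exponent of 1.8 used in the note below is likewise only an example value for a system that is not PC-like.

```c
/* The overall exponent relating file samples to displayed intensity
 * is the product of the file's gamma and the display-system exponent. */
static double overall_exponent(double file_gamma, double display_exponent)
{
    return file_gamma * display_exponent;
}

/* Return nonzero if the overall exponent differs from 1.0 by more than
 * a small tolerance.  The tolerance is an arbitrary choice for this
 * sketch. */
static int needs_gamma_correction(double file_gamma, double display_exponent)
{
    double diff = overall_exponent(file_gamma, display_exponent) - 1.0;

    if (diff < 0.0)
        diff = -diff;
    return diff > 0.05;
}
```

With a file gamma of 0.45455, a display exponent of 2.2 gives a product of about 1.0 and no correction is needed, whereas an example exponent of 1.8 gives about 0.82 and correction would be applied.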

In any case, once we've registered all of our desired transformations, we request that libpng update the information struct appropriately via the png_read_update_info() function. Then we get the values for the number of channels and the size of each row in the image, allocate memory for the main image buffer, and set up an array of pointers:

    png_uint_32  i, rowbytes;
    png_bytep  row_pointers[height];
  
    png_read_update_info(png_ptr, info_ptr);
  
    *pRowbytes = rowbytes = png_get_rowbytes(png_ptr, info_ptr);
    *pChannels = (int)png_get_channels(png_ptr, info_ptr);
  
    if ((image_data = (uch *)malloc(rowbytes*height)) == NULL) {
        png_destroy_read_struct(&png_ptr, &info_ptr, NULL);
        return NULL;
    }
  
    for (i = 0;  i < height;  ++i)
        row_pointers[i] = image_data + i*rowbytes;

The only slightly strange feature here is the row_pointers[] array, which is something libpng needs for its processing. In this program, where we have allocated one big block for the image, the array is somewhat unnecessary; libpng could just take a pointer to image_data and calculate the row offsets itself. But the row-pointers approach offers the programmer the freedom to do things like setting up the image for line doubling (by incrementing each row pointer by 2*rowbytes) or even eliminating the image_data array entirely in favor of per-row progressive processing on a single row buffer. Of course, it is also quite a convenient way to deal with reading and displaying the image.
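
As an illustration of the line-doubling idea, the following sketch spaces the row pointers 2*rowbytes apart in a buffer twice the height of the image and then copies each decoded row into the gap below it. The function names alloc_line_doubled() and fill_doubled_rows() are hypothetical, and the decoding step between the two calls is left out; this is one possible arrangement, not code from the demo program.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate room for 2*height rows and point each of the `height` row
 * pointers at every second row.  Returns the buffer, or NULL on failure. */
static unsigned char *alloc_line_doubled(unsigned char **row_pointers,
                                         unsigned long height,
                                         unsigned long rowbytes)
{
    unsigned long i;
    unsigned char *buf = (unsigned char *)malloc(2 * height * rowbytes);

    if (buf == NULL)
        return NULL;

    for (i = 0;  i < height;  ++i)
        row_pointers[i] = buf + i * 2 * rowbytes;

    return buf;
}

/* After the image has been decoded into the even-numbered rows,
 * duplicate each one into the odd-numbered row that follows it. */
static void fill_doubled_rows(unsigned char **row_pointers,
                              unsigned long height,
                              unsigned long rowbytes)
{
    unsigned long i;

    for (i = 0;  i < height;  ++i)
        memcpy(row_pointers[i] + rowbytes, row_pointers[i], rowbytes);
}
```

The decoder only ever sees the row-pointer array, so it needs no knowledge of the extra rows in between.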

In fact, that was the last of the preprocessing to be done. The next step is to go ahead and read the entire image into the array we just allocated:

    png_read_image(png_ptr, row_pointers);

The readpng routine can return at this point, but we added one final libpng call for completeness. png_read_end() checks the remainder of the image for correctness and optionally reads the contents of any chunks appearing after the IDATs (typically tEXt or tIME) into the indicated information struct. If one has no need for the post-IDAT chunk data, as in our case, the second argument can be NULL:

    png_read_end(png_ptr, NULL);
  
    return image_data;

13.8. readpng_cleanup()

With that, readpng_get_image() returns control to our main program, which closes the input file and promptly calls another readpng routine to clean up all allocated memory (except for the image data itself, of course):

void readpng_cleanup(int free_image_data)
{
    if (free_image_data && image_data) {
        free(image_data);
        image_data = NULL;
    }
  
    if (png_ptr && info_ptr) {
        png_destroy_read_struct(&png_ptr, &info_ptr, NULL);
        png_ptr = NULL;
        info_ptr = NULL;
    }
}

That is, the main program calls readpng_cleanup() with a zero (FALSE) argument here so that image_data is not freed. If it had waited to clean up until after the user requested the program to end, it would have passed a nonzero (TRUE) argument instead. Setting png_ptr and info_ptr to NULL is unnecessary here, since png_destroy_read_struct() does that for us; but we do it anyway, since it's a habit that tends to save on debugging time in the long run.

13.9. Compositing and Displaying the Image

What one does at this point is, of course, entirely application-specific. Our main program calls a display routine that simply puts the pixels on the screen, first compositing against the desired background color if the final image has four channels (i.e., if it includes an alpha channel). Then it waits for the user to quit the program, at which point it destroys the window, frees any allocated memory, and exits.

The compositing step is perhaps interesting. It employs a macro copied from the png.h header file, rewritten with equivalent typedefs and renamed to avoid conflicts should png.h ever be included in the main program file:

#define alpha_composite(composite, fg, alpha, bg) {              \
    ush temp = ((ush)(fg)*(ush)(alpha) +                         \
                (ush)(bg)*(ush)(255 - (ush)(alpha)) + (ush)128); \
    (composite) = (uch)((temp + (temp >> 8)) >> 8);              \
}

The unique thing about this macro is that it does exact alpha blending on 8-bit samples (for example, the red components of a foreground pixel and a background pixel) without performing any division. This macro and its 16-bit-per-sample sibling have been tested on a number of PC and workstation architectures and found to be anywhere from 2 to 13 times faster than the standard approach, which divides by 255 or 65,535, depending on sample size. Of course, hardware-assisted alpha compositing will always be faster than doing it in software; many 3D accelerator cards provide this function, and often they can be used even in 2D applications. Approximate methods (which divide by 256 or 65,536 by bit-shifting) are another fast alternative when absolute accuracy is not important, but note that such an approach may leave a visible border between opaque and slightly transparent regions.
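
The exactness claim for 8-bit samples can be checked by brute force. The sketch below repeats the macro and compares it against a reference that divides by 255 with rounding to the nearest integer, which is what "exact" is taken to mean here. The helper names composite_by_division(), composite_by_shift() and count_macro_mismatches() are hypothetical, and the shift-based variant is just one plausible form of the approximate method.

```c
typedef unsigned char  uch;
typedef unsigned short ush;

#define alpha_composite(composite, fg, alpha, bg) {              \
    ush temp = ((ush)(fg)*(ush)(alpha) +                         \
                (ush)(bg)*(ush)(255 - (ush)(alpha)) + (ush)128); \
    (composite) = (uch)((temp + (temp >> 8)) >> 8);              \
}

/* Reference result:  divide by 255, rounding to the nearest integer. */
static uch composite_by_division(uch fg, uch alpha, uch bg)
{
    unsigned long sum = (unsigned long)fg * alpha +
                        (unsigned long)bg * (255 - alpha);

    return (uch)((sum + 127) / 255);
}

/* Approximation:  shift right by 8 bits (divide by 256) instead. */
static uch composite_by_shift(uch fg, uch alpha, uch bg)
{
    unsigned long sum = (unsigned long)fg * alpha +
                        (unsigned long)bg * (255 - alpha);

    return (uch)(sum >> 8);
}

/* Count the (fg, alpha, bg) combinations for which the macro and the
 * rounded division disagree. */
static unsigned long count_macro_mismatches(void)
{
    unsigned long mismatches = 0;
    int fg, alpha, bg;

    for (fg = 0;  fg < 256;  ++fg)
        for (alpha = 0;  alpha < 256;  ++alpha)
            for (bg = 0;  bg < 256;  ++bg) {
                uch result;

                alpha_composite(result, fg, alpha, bg);
                if (result != composite_by_division((uch)fg, (uch)alpha,
                                                    (uch)bg))
                    ++mismatches;
            }

    return mismatches;
}
```

The shift-based variant shows where the visible border comes from: a fully opaque white pixel (fg = 255, alpha = 255) composites to 254 rather than 255, so it no longer matches neighboring pixels that were copied without blending.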

13.10. Getting the Source Code

All of the source files for the rpng demo program (rpng-x.c, rpng-win.c, readpng.c, readpng.h, and makefiles) are available both in print and electronically, under a BSD-like Open Source license. The files will be available for download from the following URL for the foreseeable future:

http://www.libpng.org/pub/png/pngbook.html

Bug fixes, new features and ports, and other contributions may be integrated into the code, time permitting.

libpng source code is available from the following URLs:

http://www.libpng.org/pub/png/libpng.html
https://libpng.sourceforge.io/

zlib source code is available from the following site:

http://www.zlib.org/

13.11. Alternative Approaches

It should go without saying that the program presented here is among the simplest of many possibilities. It would also have been possible to write it monolithically, either as a single readpng function or even as inlined code within main(), which is precisely how the sample code in the libpng documentation reads. Libpng allows user-defined I/O routines (in place of standard file I/O), custom memory allocators, and alternate error handlers to be installed, although there is currently no provision for an error-handling function that returns control to the libpng routine that called it.

There are also other options for the platform-dependent front ends, of course; reading an image from a file is often undesirable. One method in particular is worth mentioning, since it does not appear to be documented anywhere else at the time of this writing. On the 32-bit Windows platform, a ``private'' clipboard may be used to transfer PNG images between applications. The data format is simply the normal PNG stream, beginning with the signature bytes and ending with the IEND chunk. An application like rpng-win would register the private clipboard and then read PNG data from it in the usual way. The following code fragment outlines the essential steps:

    UINT clipbd_format = RegisterClipboardFormat("PNG");
  
    if (clipbd_format == 0) {
        /* call failed:  use GetLastError() for extended info */
    } else if (OpenClipboard(NULL)) {
        HANDLE handle = GetClipboardData(clipbd_format);
  
        if (handle == NULL) {
            /* call failed:  use GetLastError() for info */
        } else {
            int data_length = GlobalSize(handle);   /* upper bound */
  
            if (data_length == 0) {
                /* call failed:  use GetLastError() for info */
            } else {
                BYTE *data_ptr = GlobalLock(handle);

                if (data_ptr == NULL) {
                    /* call failed:  use GetLastError() for info */
                } else {

                    /*================================================*/
                    /* copy PNG data immediately, but don't flag an   */
                    /* error if there are some extra bytes after IEND */
                    /*================================================*/

                    if (GlobalUnlock(handle) == 0) {
                        /* zero means either "now unlocked" or failure: */
                        /* it failed only if GetLastError() != NO_ERROR */
                    }
                }
            }
        }
        if (!CloseClipboard()) {
            /* call failed:  use GetLastError() for info */
        }
    } else {
        /* another window has the clipboard open */
        /* (can use GetOpenClipboardWindow() to get handle to it) */
    }

That one can do something like this in principle isn't new or unusual; what is new is that the "PNG" clipboard has already been implemented in some Microsoft apps, including Office 2000. All any other application needs in order to interoperate via this clipboard is its name and data format, which I've just described. Thanks to John Bowler for providing this information to the PNG Development Group.
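
Because the size reported for the clipboard block is only an upper bound, the copying step has to decide where the PNG stream really ends. One way to do so, sketched here in portable C, is to walk the chunk structure until IEND is found. The function name png_stream_length() is hypothetical, and the sketch checks only the signature and the chunk framing; it does not verify CRCs, which are left to libpng when the copied data is decoded.

```c
#include <stddef.h>
#include <string.h>

/* Return the number of bytes from the PNG signature through the end of
 * the IEND chunk, or 0 if buf does not hold a complete PNG stream.
 * Any bytes after IEND are ignored rather than treated as an error. */
static size_t png_stream_length(const unsigned char *buf, size_t buf_len)
{
    static const unsigned char sig[8] = {137, 80, 78, 71, 13, 10, 26, 10};
    size_t pos = 8;

    if (buf_len < 8 || memcmp(buf, sig, 8) != 0)
        return 0;

    /* Each chunk is a 4-byte big-endian data length, a 4-byte type,
     * the data itself and a 4-byte CRC:  12 bytes of overhead. */
    while (buf_len - pos >= 12) {
        size_t data_len = ((size_t)buf[pos]     << 24) |
                          ((size_t)buf[pos + 1] << 16) |
                          ((size_t)buf[pos + 2] <<  8) |
                           (size_t)buf[pos + 3];

        if (data_len > buf_len - pos - 12)
            return 0;                   /* chunk runs past the buffer */

        if (memcmp(buf + pos + 4, "IEND", 4) == 0)
            return pos + 12 + data_len;

        pos += 12 + data_len;
    }

    return 0;                           /* ran out of data before IEND */
}
```

A caller would pass the locked clipboard pointer and the reported size, then copy only the returned number of bytes before unlocking the handle.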

In the next chapter, I'll look at a more radical alternative to the basic PNG decoder: a version that feeds libpng data at its own pace, rather than letting libpng read (and possibly wait for) as much data as it wants. Progressive viewers are at the heart of most online browsers, so we'll look at how to write one for PNG images.
