PNG Basics (PNG: The Definitive Guide)

The fundamental building block of PNG images is the chunk. With the exception of the first 8 bytes in the file (and we'll come back to those shortly), a PNG image consists of nothing but chunks.
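To make the "signature, then nothing but chunks" structure concrete, here is a minimal Python sketch that walks the chunks of a PNG held in memory. It assumes the standard chunk layout from the PNG specification (a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte CRC computed over type and data), which is not spelled out in the paragraph above. The names `make_chunk` and `iter_chunks` are hypothetical helpers invented for this illustration, and the tiny test image is built on the spot rather than read from a real file.

```python
import struct
import zlib

# The 8-byte signature that precedes the first chunk.
PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])


def make_chunk(chunk_type, payload=b""):
    """Assemble one chunk: length, type, data, CRC (hypothetical helper)."""
    crc = zlib.crc32(chunk_type + payload)
    return (struct.pack(">I", len(payload)) + chunk_type + payload
            + struct.pack(">I", crc))


def iter_chunks(data):
    """Yield (type, payload) for every chunk that follows the signature."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("missing PNG signature")
    pos = 8
    while pos < len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        payload = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[pos + 8 + length:pos + 12 + length])
        if crc != zlib.crc32(chunk_type + payload):
            raise ValueError("CRC mismatch in %s chunk" % chunk_type.decode("ascii"))
        yield chunk_type.decode("ascii"), payload
        pos += 12 + length  # 4 (length) + 4 (type) + data + 4 (CRC)


# A 1x1, 8-bit grayscale image built in memory purely for demonstration.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
idat = zlib.compress(b"\x00\x00")  # filter byte 0, then one black pixel
sample = (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
          + make_chunk(b"IDAT", idat) + make_chunk(b"IEND"))

print([name for name, _ in iter_chunks(sample)])  # ['IHDR', 'IDAT', 'IEND']
```

The sketch does no bounds checking beyond the CRC test, so a truncated file would surface as a `struct.error` rather than a friendlier message.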

8.5. PNG Image Types

I noted earlier that not all possible combinations of PNG image types and features are allowed by the specification. Let's take a closer look at the basic types and their features.

8.5.1. Palette-Based

Palette-based images, also known as colormapped or index-color images, use the PLTE chunk and are supported in four pixel depths: 1, 2, 4, and 8 bits, corresponding to a maximum of 2, 4, 16, or 256 palette entries. Unlike GIF images, however, fewer than the maximum number of entries may be present. On the other hand, GIF does support pixel depths of 3, 5, 6, and 7 bits; 6-bit (64-color) images, in particular, are common on the World Wide Web.

TIFF also supports palette images, but baseline TIFF allows only 4- and 8-bit pixel depths. Perhaps a more useful comparison is with the superset of baseline TIFF that is supported by Sam Leffler's free libtiff, which has become the software industry's unofficial standard for TIFF decoding. libtiff supports palette bit depths of 1, 2, 4, 8, and 16 bits. Unlike PNG and GIF, however, the TIFF palette always uses 16-bit integers for each red, green, and blue value, and as with GIF, all 2^{bit depth} entries must be present in the file. Nor is there any provision for compression of the palette data--so a 16-bit TIFF palette would require 384 KB all by itself.
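The 384 KB figure follows directly from the numbers in the paragraph above, and a few lines of Python can confirm the arithmetic. The function names here are hypothetical; the PNG side assumes the usual three 8-bit samples per PLTE entry.

```python
def tiff_palette_bytes(bit_depth):
    """Size of a TIFF palette: all 2**bit_depth entries, three 16-bit values each."""
    entries = 2 ** bit_depth
    return entries * 3 * 2  # red, green, blue; 2 bytes apiece


def png_palette_bytes(num_entries):
    """Size of PNG palette data: only the entries actually present, 3 bytes each."""
    return num_entries * 3


print(tiff_palette_bytes(16) // 1024)  # 384 (KB) for a 16-bit TIFF palette
print(png_palette_bytes(256))          # 768 bytes for a full 8-bit PNG palette
```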

8.5.2. Palette-Based with Transparency

The PNG spec forbids the use of a full alpha channel with palette-based images, but it does allow ``cheap alpha'' via the transparency chunk, tRNS. As its name implies--the first letter is lowercase--tRNS is an ancillary chunk, which means the image is still viewable even if the decoder somehow fails to recognize the chunk.[62] The structure of tRNS depends on the image type, but for palette-based images it is exactly analogous to the PLTE chunk. It may contain as many transparency entries as there are palette entries (more than that would not make any sense) or as few as one, and it must come after PLTE and before the first IDAT. In effect, it transforms the palette from an RGB lookup table to an RGBA table, which implies a potential factor-of-four savings in file size over a full 32-bit RGBA image. The icicle image used as a basis for Figure C-1 in the color insert is an RGBA-palette image; it is ``only'' 3.85 times smaller than the 32-bit original due to dithering (which hurts compression).

[62] Once again, the distinction between critical and ancillary chunks is largely irrelevant for chunks defined in the specification, since presumably they are known by all decoders. But even the names of standard chunks were chosen in accordance with the rules, as if they might be encountered by a particularly simple-minded PNG decoder. In fact, this was done in order to test the chunk-naming rules themselves: would a decoder that relied only on them behave sensibly? The answer was ``yes.''
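The naming rule that marks tRNS as ancillary is simple enough to test in code. The following is a small sketch of what a "simple-minded" decoder might do with only the first letter of the chunk name; `is_ancillary` is a hypothetical helper, not part of any library.

```python
def is_ancillary(chunk_name):
    """Return True if the chunk name marks an ancillary (non-critical) chunk.

    Only the case of the first letter is examined: lowercase means
    ancillary, uppercase means critical.
    """
    if len(chunk_name) != 4 or not chunk_name.isalpha():
        raise ValueError("chunk names are exactly four letters")
    return chunk_name[0].islower()


for name in ("IHDR", "PLTE", "IDAT", "IEND", "tRNS", "sPLT", "bKGD"):
    kind = "ancillary" if is_ancillary(name) else "critical"
    print(name, kind)
```

A decoder built this way can skip any unrecognized ancillary chunk and still display the image, while an unrecognized critical chunk is a reason to give up.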

By comparison, GIF supports only binary transparency, wherein a single palette color is marked as completely transparent, while all others are fully opaque. GIF has a tiny advantage in that the transparent entry can live anywhere in the palette, whereas a single PNG transparency entry should come first--all tRNS entries before the transparent one must exist and must have the value 255 (fully opaque), which would be redundant and therefore a waste of space. But the code necessary to rearrange the palette so that all non-opaque entries come before any opaque ones is simple to write, and the benefits of PNG's more flexible transparency scheme far outweigh this minor drawback.
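The palette rearrangement mentioned above might be sketched as follows. This is only an illustration under assumed data structures: `reorder_for_trns` is a hypothetical function that takes the palette as a list of RGB tuples, a parallel list of 8-bit alpha values, and the image as a flat list of palette indices.

```python
def reorder_for_trns(palette, alphas, pixels):
    """Move non-opaque palette entries to the front and shorten tRNS.

    Returns (new_palette, trns_data, new_pixels), where trns_data holds
    alpha bytes for the non-opaque entries only.
    """
    # sorted() is stable, and False sorts before True, so non-opaque
    # entries come first while relative order is otherwise preserved.
    order = sorted(range(len(palette)), key=lambda i: alphas[i] == 255)
    remap = {old: new for new, old in enumerate(order)}

    new_palette = [palette[i] for i in order]
    non_opaque = sum(1 for a in alphas if a != 255)
    trns_data = bytes(alphas[i] for i in order[:non_opaque])
    new_pixels = [remap[p] for p in pixels]
    return new_palette, trns_data, new_pixels


palette = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
alphas = [255, 255, 0]          # only the last entry is transparent
pixels = [0, 1, 2, 2]

new_palette, trns_data, new_pixels = reorder_for_trns(palette, alphas, pixels)
print(new_palette)   # [(0, 0, 255), (255, 0, 0), (0, 255, 0)]
print(trns_data)     # b'\x00'
print(new_pixels)    # [1, 2, 0, 0]
```

Without the reordering, the tRNS data for this palette would need three bytes (255, 255, 0); afterward a single byte suffices.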

The TIFF format supports at least three kinds of transparency information, two involving an interleaved alpha channel (extra samples) and the third involving a completely separate subimage (or subfile) that is used as a bilevel transparency mask. Baseline TIFF does not require support for any of them, but libtiff supports the two interleaved flavors directly, and could probably be manhandled into some level of support for the subfile approach, although the transparency mask is ``typically at a higher resolution than the main image if the main image is grayscale or color,'' according to the TIFF 6.0 specification. On the other hand, with the possible exception of user-designed TIFF tags, there is no support at all for ``cheap alpha,'' i.e., marking one or more palette entries as partially or completely transparent.

8.5.3. Grayscale

PNG grayscale images support the widest range of pixel depths of any image type. Depths of 1, 2, 4, 8, and 16 bits are supported, covering everything from simple black-and-white scans to full-depth medical and raw astronomical images.[63]

[63] Calibrated astronomical image data is usually stored as 32-bit or 64-bit floating-point values, and some raw data is represented as 32-bit integers. Neither format is directly supported by PNG, although one could, in principle, design an ancillary chunk to hold the proper conversion information. Conversion of data with more than 16 bits of dynamic range would be a lossy transformation, however--at least, barring the abuse of PNG's alpha channel or RGB capabilities.

There is no direct comparison with GIF images, although it is certainly possible to store grayscale data in a palette image for both GIF and PNG. The only place a gray palette is commonly distinguished from a regular color one, however, is in VRML97 texture maps. Baseline TIFF images, on the other hand, support 1-bit ``bilevel'' and 4- and 8-bit grayscale depths. Nonbaseline TIFF allows arbitrary bit depths, but libtiff accepts only 1-, 2-, 4-, 8-, and 16-bit images. TIFF also supports an inverted grayscale, wherein 0 represents white and the maximum pixel value represents black.

The most common form of JPEG (the one that uses ``lossy'' compression, in which some information in the image is thrown away) likewise supports grayscale images in depths of 8 and 12 bits. In addition, there are two variants that use truly lossless compression and support any depth from 2 to 16 bits: the traditional version, known simply as ``lossless JPEG,'' and an upcoming second-generation flavor called ``JPEG-LS.''[64] But the first is extremely rare, and is supported by almost no one, despite having been standardized years ago, and the second is also currently unsupported (although that is to be expected for a new format). Lossy JPEG is very well supported, thanks largely to the Independent JPEG Group's free libjpeg (which, like libtiff, has become the de facto standard for JPEG encoding and decoding)--but, of course, it's lossy. Note that libjpeg can be compiled to support either 8-bit or 12-bit JPEG, but not both at the same time. Thus, from a practical standpoint, only 8-bit, lossy grayscale is supported.

[64] Be aware that even at the highest quality settings, the common form of JPEG is never lossless, regardless of whether the setting claims 100% or something similar.

8.5.4. Grayscale with Transparency

PNG supports two kinds of transparency with grayscale and RGB images. The first is a palette-style ``cheap transparency,'' in which a single color or gray value is marked as being fully transparent. I noted earlier that the structure of tRNS depends on the image type; for grayscale images of any pixel depth, the chunk contains a 2-byte, unscaled gray value--that is, the maximum allowed value is still 2^{bit depth}-1, even though it is stored as a 16-bit integer. This approach is very similar to GIF-style transparency in palette images and incurs only 14 bytes overhead in file size. There is no corresponding TIFF image type, and standard JPEG does not support any transparency.
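The 14-byte figure can be reproduced by assembling the chunk by hand. The sketch below assumes the standard chunk framing from the PNG specification (4-byte length, 4-byte type, data, 4-byte CRC); `make_chunk` and `gray_trns_chunk` are hypothetical helper names.

```python
import struct
import zlib


def make_chunk(chunk_type, payload):
    """Frame a payload as a chunk: length, type, data, CRC."""
    crc = zlib.crc32(chunk_type + payload)
    return (struct.pack(">I", len(payload)) + chunk_type + payload
            + struct.pack(">I", crc))


def gray_trns_chunk(gray_value, bit_depth):
    """Build a tRNS chunk marking one unscaled gray value as transparent."""
    max_value = (1 << bit_depth) - 1
    if not 0 <= gray_value <= max_value:
        raise ValueError("gray value exceeds 2**bit_depth - 1")
    # The value is stored unscaled in a 16-bit big-endian integer.
    return make_chunk(b"tRNS", struct.pack(">H", gray_value))


chunk = gray_trns_chunk(3, bit_depth=2)   # white in a 2-bit image
print(len(chunk))                         # 14
```

Of those 14 bytes, 12 are chunk framing and only 2 carry the gray value itself.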

8.5.5. Grayscale with Alpha Channel

The second kind of transparency supported by grayscale images is an alpha channel. This is a more expensive approach in terms of file size--for grayscale, it doubles the number of image bytes--but it allows the user much greater freedom in setting individual pixels to particular levels of partial transparency. Only 8-bit and 16-bit grayscale images may have an alpha channel, which must match the bit depth of the gray channel.

The full TIFF specification supports two kinds of interleaved ``extra samples'' for transparency: associated and unassociated alpha (though not at the same time). Unlike PNG, TIFF's alpha channel may be of a different bit depth from the main image data--in fact, every channel in a TIFF image may have an arbitrary depth. TIFF also offers the explicit possibility of treating a ``subfile,'' or secondary image within the file, as a transparency mask, though such masks are only 1 bit deep, and therefore support only completely opaque or completely transparent pixels.

Baseline TIFF does not require support for any of this, however. Current versions of libtiff can read an interleaved alpha channel as generic ``extra samples,'' but it is up to the application to interpret the samples correctly. The library does not support images with channels of different depths, and although it could be manipulated into reading a secondary grayscale subfile (which the application could interpret as a full alpha channel), that would be a user-defined extension--i.e., specific to the application and not supported by any other software.

As I just noted, standard JPEG (by which I mean the common JPEG File Interchange Format, or JFIF files) has no provision for transparency. The JPEG standard itself does allow extra channels, one of which could be treated as an alpha channel, but this would be fairly pointless. Not only would it require one to use a non-standard, unsupported file format for storage, there would also tend to be visual artifacts, since lossy JPEG is not well suited to the types of alpha masks one typically finds (unless the mask's quality setting were boosted considerably, at a cost in file size). But see Chapter 12, "Multiple-Image Network Graphics," for details on a MNG subformat called JNG that combines a lossy JPEG image in JFIF format with a PNG-style, lossless alpha channel.

8.5.6. RGB

RGB (truecolor) PNGs, like grayscale with alpha, are supported in only two depths: 8 and 16 bits per sample, corresponding to 24 and 48 bits per pixel. This is the image type most commonly used by image-editing applications like Adobe Photoshop. Note that pixels are stored in RGB order. (BGR is the other popular format, especially on Windows-based systems.)

Truecolor PNG images may also include a palette (PLTE) chunk, though the specialized suggested-palette (sPLT) chunk described in Chapter 11, "PNG Options and Extensions," is often more appropriate. But if present, the palette encodes a suggested set of colors to which the image may be quantized if the decoder cannot display in truecolor; the suggestion is presumed to be a good one, so decoders are encouraged to use it if they can. Of course, multi-image viewers such as web browsers often resort to a fixed palette for simplicity and rendering speed.

Baseline TIFF requires support only for 24-bit RGB, but libtiff supports 1, 2, 4, 8, and 16 bits per sample. Ordinary JPEG stores only 24-bit RGB,[65] though 36-bit RGB is possible with the seldom-supported 12-bit extension. The also seldom-supported lossless flavor of JPEG can, in theory, store any sample depth from 2 to 16 bits, thus 6 to 48 bits per RGB pixel.

[65] Technically, color JPEGs are almost always encoded internally in the YCbCr color space and converted to or from RGB by the decoder or encoder software.

8.5.7. RGB with Transparency

As mentioned previously, PNG supports cheap transparency in RGB images via the tRNS chunk. The format is similar to that for grayscale images, except now the chunk contains three unscaled, 16-bit values (red, green, and blue), and the corresponding RGB pixel is treated as fully transparent. This option adds only 18 bytes to the image, and there are no corresponding TIFF or JPEG image types.
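As a rough sketch of how a decoder might use this chunk, the code below reads the three 16-bit values and expands RGB pixels to RGBA, treating only exact matches as transparent. The names `parse_rgb_trns` and `expand_to_rgba` are hypothetical, pixels are assumed to be a plain list of (R, G, B) tuples, and the 12 bytes of chunk framing (length, type, CRC) come from the PNG specification rather than from the text above.

```python
import struct

CHUNK_OVERHEAD = 12  # 4-byte length + 4-byte type + 4-byte CRC


def parse_rgb_trns(payload):
    """Return the (red, green, blue) key stored in an RGB image's tRNS data."""
    if len(payload) != 6:
        raise ValueError("RGB tRNS data must be exactly 6 bytes")
    return struct.unpack(">HHH", payload)


def expand_to_rgba(pixels, transparent_rgb, bit_depth=8):
    """Add an alpha sample: 0 for the keyed color, maximum for all others."""
    opaque = (1 << bit_depth) - 1
    return [(r, g, b, 0 if (r, g, b) == transparent_rgb else opaque)
            for (r, g, b) in pixels]


payload = struct.pack(">HHH", 255, 0, 255)   # magenta as the transparent color
key = parse_rgb_trns(payload)
print(len(payload) + CHUNK_OVERHEAD)         # 18
print(expand_to_rgba([(255, 0, 255), (10, 20, 30)], key))
# [(255, 0, 255, 0), (10, 20, 30, 255)]
```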

8.5.8. RGB with Alpha Channel

Finally, we have truecolor images with an alpha channel, also known as the RGBA image type. As with RGB and gray+alpha, PNG supports 8 and 16 bits per sample for RGBA, or 32 and 64 bits per pixel, respectively. Pixels are always stored in RGBA order, and the alpha channel is not premultiplied.
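Because the alpha channel is not premultiplied, a viewer has to scale each color sample by alpha when it composites a pixel over a background. Here is a minimal sketch of that step for a single pixel; `composite_over` is a hypothetical helper, the integer rounding is one reasonable choice among several, and gamma correction is deliberately ignored.

```python
def composite_over(fg_rgba, bg_rgb, max_value=255):
    """Composite one non-premultiplied RGBA pixel over an opaque background.

    Each output sample is fg * alpha + bg * (1 - alpha), computed in
    integers with rounding to the nearest value.
    """
    r, g, b, a = fg_rgba
    return tuple(
        (fg * a + bg * (max_value - a) + max_value // 2) // max_value
        for fg, bg in zip((r, g, b), bg_rgb)
    )


print(composite_over((255, 0, 0, 128), (0, 0, 255)))  # (128, 0, 127)
print(composite_over((255, 0, 0, 255), (0, 0, 255)))  # (255, 0, 0)
print(composite_over((255, 0, 0, 0), (0, 0, 255)))    # (0, 0, 255)
```

For 16-bit samples the same function applies with `max_value=65535`.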

The use of PLTE for a suggested quantization palette is allowed here as well, but note that since the tRNS chunk is prohibited in RGBA images, the suggested palette can only encode a recommended quantization for the RGB data or for the RGBA data composited against the image's background color (see the discussion of bKGD in Chapter 11, "PNG Options and Extensions"), not for the raw RGBA data. Disallowing tRNS is arguably an unnecessary restriction in the PNG specification; while a suggested RGBA palette would not necessarily be useful when compositing the image against a varied background (the different background pixel values would likely mix with the foreground pixels to form more than 256 colors), it would be helpful for cases where the background is a solid color. In fact, this restriction was recognized and addressed by an extension to the specification approved late in 1996: the suggested-palette chunk, sPLT, which is discussed in Chapter 11, "PNG Options and Extensions".

Although baseline TIFF does not require support for an alpha channel, libtiff supports RGBA images with 1, 2, 4, 8, or 16 bits per sample; both associated and unassociated alpha channels are supported. JPEG has no direct support for alpha transparency, but MNG offers a way around that (see Chapter 12, "Multiple-Image Network Graphics").
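The combinations of image type and sample depth described throughout this section can be gathered into one small lookup table. The sketch below is a summary under one assumption not stated in the text: the numeric color-type codes (0, 2, 3, 4, and 6) are those defined for the IHDR chunk in the PNG specification. The names `PNG_IMAGE_TYPES` and `bits_per_pixel` are hypothetical.

```python
# color-type code -> (description, samples per pixel, allowed bit depths)
PNG_IMAGE_TYPES = {
    0: ("grayscale", 1, (1, 2, 4, 8, 16)),
    2: ("RGB", 3, (8, 16)),
    3: ("palette-based", 1, (1, 2, 4, 8)),
    4: ("grayscale with alpha", 2, (8, 16)),
    6: ("RGB with alpha", 4, (8, 16)),
}


def bits_per_pixel(color_type, bit_depth):
    """Return bits per pixel, or raise ValueError for a forbidden combination."""
    if color_type not in PNG_IMAGE_TYPES:
        raise ValueError("unknown color type %r" % (color_type,))
    name, samples, depths = PNG_IMAGE_TYPES[color_type]
    if bit_depth not in depths:
        raise ValueError("%s images do not allow %d-bit samples" % (name, bit_depth))
    return samples * bit_depth


print(bits_per_pixel(2, 8), bits_per_pixel(2, 16))   # 24 48
print(bits_per_pixel(6, 8), bits_per_pixel(6, 16))   # 32 64
```

Cheap transparency via tRNS does not change any of these numbers, since it adds a chunk rather than a channel.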

PNG signature bytes (the first 8 bytes of every PNG file):

Decimal Value    ASCII Interpretation
137              A byte with its most significant bit set (``8-bit character'')
80               P
78               N
71               G
13               Carriage-return (CR) character, a.k.a. CTRL-M or ^M
10               Line-feed (LF) character, a.k.a. CTRL-J or ^J
26               CTRL-Z or ^Z
10               Line-feed (LF) character, a.k.a. CTRL-J or ^J
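The eight decimal values in the table translate directly into a signature check. This is a minimal sketch; `has_png_signature` is a hypothetical helper, and the last lines illustrate one commonly cited reason for the CR-LF pair, namely that a transfer which rewrites line endings damages the signature and so is detected immediately.

```python
PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])


def has_png_signature(data):
    """Return True if the data begins with the 8-byte PNG signature."""
    return data[:8] == PNG_SIGNATURE


print(PNG_SIGNATURE)                      # b'\x89PNG\r\n\x1a\n'
print(has_png_signature(PNG_SIGNATURE))   # True

# Simulate a text-mode transfer that converts CR-LF to LF.
mangled = PNG_SIGNATURE.replace(b"\r\n", b"\n")
print(has_png_signature(mangled))         # False
```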
