1.1 Imaging Basics


This section provides a brief overview of some of the concepts we use when referring to image processing. If you are familiar with digital images and their properties, you can skip this section and proceed to Chapter 2.

An image processing application is any program that takes an image as input, performs some manipulation of that image, and produces a resulting image that may or may not be different from the original image.

The term image has a specific meaning in image processing. The final image framework will handle many types of images, such as 8-bit and 32-bit grayscale and color images. However, as an introduction to image properties, we'll start with 8-bit grayscale images.

You can think of a grayscale image as a picture taken with black and white film. An 8-bit grayscale image is made up of pixels (picture elements, also referred to as pels) in varying levels of gray with no color content. A pixel contains the image information for a single point in the image, and has a discrete value between 0 (black) and 255 (white). Values between 0 and 255 are varying levels of gray.

We specify the image size by its width (x-axis) and height (y-axis) in pixels, with the image origin (0,0) in the top left corner. By convention, the origin is the top left corner because this makes it easy to map the coordinates of a pixel to the physical memory location where that pixel is stored. Typically, pixel locations increase from left to right in the same direction that memory addresses increase. These properties are illustrated in Figure 1.2.
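To make the memory mapping concrete, here is a small sketch (not the book's framework code) of how the pixel at coordinates (x,y) might be fetched from an image whose rows are stored one after another in memory; the names pixels and width are illustrative assumptions.

#include <vector>

// Illustrative only: fetch the pixel at (x,y) from an image stored row by row,
// starting at the top left corner.
unsigned char getPixel(const std::vector<unsigned char>& pixels, int width, int x, int y)
{
    // Row y begins at offset y*width, so pixel (x,y) lives at y*width + x.
    // Moving left to right (increasing x) walks memory sequentially.
    return pixels[y * width + x];
}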

Figure 1.2. Grayscale Image Properties


While most of us choose to take color pictures with our digital cameras, grayscale images are important for a number of applications. Industrial applications, such as machine vision, x-rays, and medical imaging, rely on grayscale imaging to provide information. In the case of machine vision, inspections for dimensional measurements like width, height, and circumference are often taken from grayscale images. This is because grayscale images can be processed three times faster than color images, which means greater throughput on production lines. Grayscale images are also the standard for mass detection and analysis in x-ray technology.

In our initial prototype, we represent a grayscale image as a two-dimensional array of pixels. We represent each pixel as an 8-bit unsigned quantity (unsigned char).
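As a rough sketch of that representation (the names and the fixed size below are illustrative assumptions, not the framework we develop later), a prototype grayscale image might look like this:

const int kWidth  = 640;   // image width in pixels (illustrative)
const int kHeight = 480;   // image height in pixels (illustrative)

struct GrayscaleImage
{
    // Each pixel is an 8-bit unsigned quantity: 0 is black, 255 is white.
    unsigned char pixels[kHeight][kWidth];
};

// Example use: set every pixel to black.
void clearToBlack(GrayscaleImage& image)
{
    for (int y = 0; y < kHeight; ++y)
        for (int x = 0; x < kWidth; ++x)
            image.pixels[y][x] = 0;
}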

More sophisticated image processing applications require color images. Color may be required for accurate analysis, for example in applications that sort objects based on color (such as pharmaceutical applications that use color to distinguish different types of pills). In addition, color pictures are the images most frequently captured with digital cameras.

1.1.1 RGB Images

While color can be represented in many ways, in our prototype a color image is a 24-bit image where each pixel consists of 8 bits of Red, 8 bits of Green, and 8 bits of Blue, each with a value between 0 and 255 (where 0 means no contribution from that primary and 255 means full intensity). This is called the RGB (Red-Green-Blue) color space. Combining the light of these three primary colors can produce all the colors detectable by humans. It is intuitive to use the RGB color space to process color images, because algorithms that work on grayscale images extend easily to color images simply by tripling the processing.
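The sketch below (illustrative names, not the book's pixel class) shows a 24-bit RGB pixel and how a simple grayscale operation extends to color by repeating it on each channel.

struct RGBPixel
{
    unsigned char red;    // 0 = no red,   255 = full red
    unsigned char green;  // 0 = no green, 255 = full green
    unsigned char blue;   // 0 = no blue,  255 = full blue
};

// A typical grayscale operation, such as inverting a pixel...
unsigned char invert(unsigned char value) { return 255 - value; }

// ...extends to RGB by simply applying it to all three channels.
RGBPixel invert(const RGBPixel& p)
{
    RGBPixel result;
    result.red   = invert(p.red);
    result.green = invert(p.green);
    result.blue  = invert(p.blue);
    return result;
}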

As we mentioned, there are many other ways to represent an RGB value, and some are based on human perception. In these color models, more bits are reserved for the green channel, with fewer bits for the red and blue channels because our eyes have greater sensitivity to green than to either red or blue. If your application produces images that are viewed by people, this is an important issue. For applications employing machine analysis of images, this is not as important.

Usually, the representation is a compromise between storage requirements and required resolution. For example, a 16-bit RGB image requires only two-thirds the storage of a 24-bit RGB image. Since you cannot allocate the same number of bits for red, green, and blue in a 16-bit color space, it is typical to allocate the additional bit to the green channel. This is referred to as the 5:6:5 color model, because 5 bits are allocated to the red channel, 6 bits to green, and 5 bits to blue.
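For example, a 5:6:5 pixel could be packed into an unsigned short with bit shifts, as in the sketch below (the layout with red in the high bits is a common convention, assumed here for illustration).

// Pack 8-bit R, G, B channels into a 16-bit 5:6:5 value.
unsigned short packRGB565(unsigned char r, unsigned char g, unsigned char b)
{
    return static_cast<unsigned short>(
        ((r >> 3) << 11) |   // keep the top 5 bits of red
        ((g >> 2) << 5)  |   // keep the top 6 bits of green
         (b >> 3));          // keep the top 5 bits of blue
}

// Unpack a 5:6:5 value back into approximate 8-bit channels.
void unpackRGB565(unsigned short rgb, unsigned char& r, unsigned char& g, unsigned char& b)
{
    r = static_cast<unsigned char>(((rgb >> 11) & 0x1F) << 3);
    g = static_cast<unsigned char>(((rgb >> 5)  & 0x3F) << 2);
    b = static_cast<unsigned char>( (rgb        & 0x1F) << 3);
}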

1.1.2 HSI Images

An alternative color space is HSI (short for Hue-Saturation-Intensity). This color space mimics how the human eye perceives color. The hue of a color is its location within the visible portion of the electromagnetic spectrum. For example, the color may be aqua, teal, or green. You can think of hue as an angle on a color wheel, where all the colors are mapped around the wheel. Figure 1.3 shows a color wheel where red is at zero degrees, which is a common representation. Note that the arrows are provided to show angle; their length is not significant.

Figure 1.3. Hue on a Color Wheel


The saturation of the color is determined by how much the color is mixed with gray and white. A fully saturated color contains no gray or white, only the color. As gray and white are mixed into the color, it becomes less saturated. For example, the color of a lemon is a fully saturated yellow, and the color of a banana is a less saturated yellow.

The intensity of a color (also called luminance in some graphics programs) is the brightness of the color.
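One common formulation of the conversion from RGB to HSI (a sketch, not the code we develop in this book) shows how hue, saturation, and intensity are computed from the three channels:

#include <algorithm>
#include <cmath>

// Convert an RGB triple (each channel normalized to 0..1) to HSI.
// Hue is returned in degrees (red at 0); saturation and intensity are 0..1.
void rgbToHSI(double r, double g, double b, double& h, double& s, double& i)
{
    i = (r + g + b) / 3.0;                           // intensity: average brightness

    double minChannel = std::min(r, std::min(g, b));
    s = (i > 0.0) ? 1.0 - minChannel / i : 0.0;      // saturation: 0 for grays, 1 when fully saturated

    // Hue: the angle on the color wheel.
    double num   = 0.5 * ((r - g) + (r - b));
    double denom = std::sqrt((r - g) * (r - g) + (r - b) * (g - b));
    double cosH  = (denom > 0.0) ? std::max(-1.0, std::min(1.0, num / denom)) : 1.0;
    double theta = std::acos(cosH) * 180.0 / 3.14159265358979;
    h = (b <= g) ? theta : 360.0 - theta;
}

The square-root and inverse-cosine calls in the hue computation suggest why this conversion is more expensive than staying in RGB.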

While it may be easier to describe colors using the HSI color space, it is slower to process for our purposes. As with grayscale images, the size of a color image is specified by its width and height in pixels. Similarly, its x,y origin (0,0) is the top left corner. The properties of color images that we use in our prototype are shown in Figure 1.4.

Figure 1.4. Color Image Properties Using RGB Color Space



