Chapter 41: Objective Video Quality Assessment


Zhou Wang, Hamid R. Sheikh, and Alan C. Bovik
Department of Electrical and Computer Engineering
The University of Texas at Austin

Austin, Texas, USA
<zhouwang@ieee.org>, <hamid.sheikh@ieee.org>, <bovik@ece.utexas.edu>

1. Introduction

Digital video data, stored in video databases and distributed through communication networks, is subject to various kinds of distortions during acquisition, compression, processing, transmission and reproduction. For example, lossy video compression techniques, which are almost always used to reduce the bandwidth needed to store or transmit video data, may degrade the quality during the quantization process. As another example, digital video bitstreams delivered over error-prone channels, such as wireless channels, may be received imperfectly because of impairments that occur during transmission. Packet-switched communication networks, such as the Internet, can cause loss or severe delay of received data packets, depending on the network conditions and the quality of service. All these transmission errors may result in distortions in the received video data. It is therefore imperative for a video service system to be able to recognize and quantify the video quality degradations that occur in the system, so that it can maintain, control and possibly enhance the quality of the video data. An effective image and video quality metric is crucial for this purpose.

The most reliable way of assessing the quality of an image or video is subjective evaluation, because human beings are the ultimate receivers in most applications. The mean opinion score (MOS), which is a subjective quality measurement obtained from a number of human observers, has been regarded for many years as the most reliable form of quality measurement. However, the MOS method is too inconvenient, slow and expensive for most applications.

The goal of objective image and video quality assessment research is to design quality metrics that can predict perceived image and video quality automatically.

Generally speaking, an objective image and video quality metric can be employed in three ways:

  1. It can be used to monitor image quality for quality control systems. For example, an image and video acquisition system can use the quality metric to monitor and automatically adjust itself to obtain the best quality image and video data. A network video server can examine the quality of the digital video transmitted on the network and control video streaming.

  2. It can be employed to benchmark image and video processing systems and algorithms. If multiple video processing systems are available for a specific task, then a quality metric can help in determining which one of them provides the best quality results.

  3. It can be embedded into an image and video processing system to optimize the algorithms and the parameter settings. For instance, in a visual communication system, a quality metric can assist in the optimal design of the prefiltering and bit assignment algorithms at the encoder and of the reconstruction, error concealment and postfiltering algorithms at the decoder.

Objective image and video quality metrics can be classified according to the availability of the original image or video signal, which is considered to be distortion-free or of perfect quality and may be used as a reference against which a distorted image or video signal is compared. Most of the objective quality metrics proposed in the literature assume that the undistorted reference signal is fully available. Although "image and video quality" is frequently used for historical reasons, the more precise term for this type of metric would be image and video similarity or fidelity measurement, or full-reference (FR) image and video quality assessment. It is worth noting that in many practical video service applications, the reference images or video sequences are often not accessible. Therefore, it is highly desirable to develop measurement approaches that can evaluate image and video quality blindly. Blind or no-reference (NR) image and video quality assessment turns out to be a very difficult task, although human observers can usually assess the quality of a distorted image or video effectively and reliably without using any reference. There exists a third type of image quality assessment method, in which the original image or video signal is not fully available. Instead, certain features are extracted from the original signal and transmitted to the quality assessment system as side information to help evaluate the quality of the distorted image or video. This is referred to as reduced-reference (RR) image and video quality assessment.

Currently, the most widely used FR objective image and video distortion/quality metrics are mean squared error (MSE) and peak signal-to-noise ratio (PSNR), which are defined as:

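$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (x_i - y_i)^2 \qquad (41.1)$$

$$\mathrm{PSNR} = 10 \log_{10} \frac{L^2}{\mathrm{MSE}} \qquad (41.2)$$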

where N is the number of pixels in the image or video signal, and x_i and y_i are the i-th pixels in the original and the distorted signals, respectively. L is the dynamic range of the pixel values. For an 8 bits/pixel monochrome signal, L is equal to 255. MSE and PSNR are widely used because they are simple to calculate, have clear physical meanings, and are mathematically easy to deal with for optimization purposes (MSE is differentiable, for example). However, they have also been widely criticized for correlating poorly with perceived quality [1–8]. In the last three to four decades, a great deal of effort has been made to develop objective image and video quality assessment methods (mostly for FR quality assessment) that incorporate perceptual quality measures by considering human visual system (HVS) characteristics. Some of the developed models are commercially available. However, image and video quality assessment is still far from being a mature research topic. In fact, only limited success has been reported from evaluations of sophisticated HVS-based FR quality assessment models under strict testing conditions and a broad range of distortion and image types [3,9–11].
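As a rough illustration of the two definitions above, the following Python sketch computes MSE and PSNR for signals given as flat lists of pixel values. The function names, the `dynamic_range` parameter and the toy pixel data are assumptions made for this example only; they do not come from the chapter. The two distorted signals are built to have different error patterns (a constant brightness shift and an alternating-sign error) but the same MSE, which hints at why a purely pixel-wise error measure cannot tell apart distortions that may look quite different.

```python
import math


def mse(reference, distorted):
    """Mean squared error between two equal-length pixel sequences (Eq. 41.1)."""
    if len(reference) != len(distorted):
        raise ValueError("signals must have the same number of pixels")
    if not reference:
        raise ValueError("signals must be non-empty")
    n = len(reference)
    return sum((x - y) ** 2 for x, y in zip(reference, distorted)) / n


def psnr(reference, distorted, dynamic_range=255):
    """Peak signal-to-noise ratio in dB (Eq. 41.2).

    dynamic_range is L in the text (255 for 8 bits/pixel).
    Returns infinity when the two signals are identical (MSE = 0).
    """
    err = mse(reference, distorted)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(dynamic_range ** 2 / err)


# Hypothetical 8-bit pixel values, used only to exercise the functions.
reference = [52, 55, 61, 66, 70, 61, 64, 73]

# Distortion A: every pixel is brightened by the same amount.
shifted = [x + 4 for x in reference]

# Distortion B: the error has the same magnitude but alternates in sign.
alternating = [x + (4 if i % 2 == 0 else -4) for i, x in enumerate(reference)]

print(mse(reference, shifted), mse(reference, alternating))  # both 16.0
print(round(psnr(reference, shifted), 2))                    # about 36.09 dB
```

Because both distortions change every pixel by 4 gray levels, MSE and PSNR assign them identical scores regardless of how the error is arranged spatially.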

This chapter focuses mainly on the basic concepts, ideas and approaches for FR image and video quality assessment. It is worth noting that the large majority of proposed FR quality assessment models share a common error-sensitivity-based philosophy, which is motivated by psychophysical vision science research. Section 2 reviews the background and various implementations of this philosophy and also attempts to point out the limitations of this approach. In Section 3, we introduce a new way to think about the problem of image and video quality assessment and provide some preliminary results of a novel structural-distortion-based FR quality assessment method. Section 4 introduces the current status of NR/RR quality assessment research. In Section 5, we discuss the issues that are related to the validation of image and video quality metrics, including the recent effort by the Video Quality Experts Group (VQEG) in developing, validating and standardizing FR/RR/NR video quality metrics for television and multimedia applications. Finally, Section 6 makes some concluding remarks and provides a vision for future directions of image and video quality assessment.




Handbook of Video Databases. Design and Applications
ISBN: 084937006X
Year: 2003
Pages: 393
