A mobile video telephony system is a real-time system of the conversational type. It is real-time because the playback of continuous media, such as audio and video, must occur in an isochronous fashion. A video telephony application is different from a streaming application because the former has the following properties:
Bidirectional data transfer: Media always flows in both directions, from each mobile videophone to the other. In this sense, the flow of data is symmetric between the two endpoints.
Real-time media encoding: Each videophone must have both encoding and decoding capabilities. Speech and video signals must be encoded and transmitted to the peer in real time. This requirement means that mobile videophones need more processing power than devices for mobile streaming, which require only decoding capability. Real-time encoding must be performed efficiently and with the shortest possible delay.
Delay sensitivity: Mobile video telephony systems are real-time systems with conversational features. A high level of interactivity between the two endpoints is therefore required for the system to be usable for speech and video conversations. A conversation can be held only if the end-to-end delay is low and preferably constant. For instance, the conversational, interactive character of a dialog between two parties would be lost if end-to-end delays exceeded a few hundred milliseconds. Low end-to-end delay is the most critical success factor for a mobile video telephony service. To guarantee it, both the network and the mobile stations must be optimized for processing conversational traffic. A very important factor in mobile videophone systems is error resilience: any mechanism for error detection and correction or concealment must run within the maximum delay budget allowed. For this reason, retransmission algorithms at the network or application level cannot normally be used, and forward error correction (FEC) or error concealment algorithms are the only possible choice for providing error resilience against the bit errors (or packet losses) produced by the air interface.
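To illustrate why FEC fits a tight delay budget where retransmission does not, the following minimal sketch protects a group of media packets with a single XOR parity packet. It is an illustration only, not the scheme of any particular standard: the function names are hypothetical, the packets are assumed to have equal length, and the sketch repairs at most one loss per group. The receiver rebuilds the missing packet locally, without waiting a round trip for the sender.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Sender side: build one parity packet for a group of equal-length packets."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Receiver side: rebuild at most one lost packet (marked None)."""
    missing = [i for i, p in enumerate(received) if p is None]
    if not missing:
        return list(received)
    if len(missing) > 1:
        # A single parity packet carries enough redundancy for one loss only.
        raise ValueError("single parity packet can repair only one loss")
    survivors = [p for p in received if p is not None]
    # XOR of the parity with all surviving packets yields the lost packet.
    rebuilt = reduce(xor_bytes, survivors, parity)
    repaired = list(received)
    repaired[missing[0]] = rebuilt
    return repaired


# Usage: the second packet of the group is lost on the air interface.
group = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(group)
repaired = recover([b"AAAA", None, b"CCCC"], parity)
```

The cost of this approach is extra bandwidth (one parity packet per group) and a small wait for the rest of the group to arrive, both of which are bounded and known in advance, unlike the delay of a retransmission.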
A mobile video telephony system consists mainly of two mobile videophones, used by the end users, and the mobile network. Figure 21.1 describes the high-level architecture of a typical mobile video telephony system over an IP-based mobile network. We will follow an end-to-end approach, analyzing the system in its different parts.
Figure 21.1: A typical mobile video telephony system.
Mobile videophone A is connected to the mobile network through a logical connection, called a Packet Data Protocol (PDP) context, established between the network and the mobile station addresses. The PDP context uses physical transport channels in the downlink and uplink directions to carry data both ways. The mobile device has the capability to roam (i.e., to change network operator as it moves, without affecting the received service), provided there is always radio coverage to guarantee the service. The mobile videophone is equipped with ordinary telephony hardware (microphone and speaker) and video hardware (camera and display).
The speech and video content is captured live from the microphone and camera input. This content is encoded in real time by the mobile device and transmitted in the uplink direction toward the network and the other end user. Speech and video data in the opposite direction (downlink) is conveyed from the network to mobile videophone A, which decodes the data and plays back the speech and displays the video. In addition, the videophone sends and receives information for session establishment, QoS control, and media synchronization. The videophone may react promptly upon reception of QoS reports, taking appropriate actions to guarantee the best possible media quality at any instant.
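The text does not specify which actions a videophone takes when a QoS report arrives. As one plausible example, the sketch below adjusts the video encoder's target bitrate from the packet loss reported by the peer. Everything in it is an assumption made for illustration: the `QosReport` type, its `loss_fraction` field, the thresholds, the step sizes, and the bitrate limits are hypothetical and do not come from any specification.

```python
from dataclasses import dataclass


@dataclass
class QosReport:
    """Hypothetical receiver feedback; the field name is illustrative."""
    loss_fraction: float  # share of packets lost since the last report, 0.0 to 1.0


def next_bitrate(current_kbps: int, report: QosReport,
                 min_kbps: int = 32, max_kbps: int = 384) -> int:
    """Pick the encoder's next target bitrate from the latest QoS report.

    Assumed policy: back off quickly under heavy loss, probe upward
    slowly when the channel looks clean, otherwise hold steady.
    """
    if report.loss_fraction > 0.05:
        target = int(current_kbps * 0.75)  # heavy loss: cut the rate by 25%
    elif report.loss_fraction < 0.01:
        target = current_kbps + 8          # clean channel: small additive increase
    else:
        target = current_kbps              # moderate loss: keep the current rate
    # Keep the result inside the range the encoder is assumed to support.
    return max(min_kbps, min(max_kbps, target))


# Usage: a report showing 10% loss lowers the target from 128 kbps.
lowered = next_bitrate(128, QosReport(loss_fraction=0.10))
```

The asymmetric policy (fast decrease, slow increase) is a common design choice for rate control over lossy links, because sending too fast harms media quality at once while sending slightly too slowly only costs some fidelity.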
The mobile network carries conversational multimedia and control traffic in the uplink and downlink directions, allowing real-time communication between the two mobile videophone users.
Mobile videophone B sits at the other end of the architecture shown in Figure 21.1. Its functionality is identical to that of mobile videophone A.