24.3 A Wireless Internet Application for Music Distribution

Figure 24.1 shows the architecture of the wireless Internet application we have developed. The client part of the application, running on the UMTS device, lets users search for, download, and play back the desired musical resources.

Figure 24.1: Wireless application architecture.

Musical resources may be of different types, depending on the musical service selected by the user. If a consumer wants to use the song-on-demand delivery service, then musical resources are simple musical files (typically encoded in the MP3 format). Those musical files are stored on different Web servers, geographically distributed over the Internet, which act as music repositories. In general, different Web servers can be managed and administered by music service providers, and may also offer different sets of replicated songs. Simply stated, this replication scenario can be thought of as a loosely coupled replication system where, potentially, different servers support different sets of musical resources. Each single song may be replicated on a number of different Web servers.

If the selected service is mobile karaoke, the corresponding musical resource is a more complex set of data constituting a karaoke clip. In practice, a karaoke clip is represented by a textual SMIL-based file with pointers to all the multimedia resources that compose that clip. All the multimedia resources (specified in the SMIL description file) are stored on different replicated Web servers, geographically distributed over the Internet. As in the case of the song-on-demand delivery service, those Web servers act as redundant repositories for the multimedia resources that compose the karaoke clip of interest. As seen from Figure 24.1, an intermediate software system (IS) has been interposed between the mobile clients and the Internet-based multimedia repositories. The responsibilities of the IS are:

Providing each UMTS device with a wireless access point to the Internet-based music distribution service
Discovering and downloading all those multimedia resources (songs or karaoke clips) that are requested by a user

In essence, the IS is constructed of three main subsystems: (1) an application gateway subsystem, (2) a discovery subsystem, and (3) a download manager. Those subsystems cooperate to support the download of musical resources to the mobile consumer, as detailed in the following discussion.

24.3.1 Search and Download of Musical Resources

The client part of our application represents the interface between consumers and services, and provides a set of search-and-download functions. Taking into account that a typical mobile device (such as a UMTS telephone or a PDA) possesses a very limited memory capacity and disk size, we have made the decision to move all the search-and-discovery functions to the IS side. This means that the client part of our application needs the collaboration of all the IS software subsystems to determine which musical resources are available, and where they are located. This implies also that users must perform the search-and-download activity by stepping through several different phases.

As an example, a complete search-and-download session for karaoke clips steps through three different phases. In the first phase, a user issues a request for a given karaoke clip from his UMTS device to the application gateway subsystem. The request may refer either to a specific song title or author. The gateway subsystem passes this request down to the download manager. The download manager asks the discovery subsystem for the complete list of all the available karaoke clips matching the request issued by the user. The discovery subsystem performs the search of the clips requested by the client, and proceeds as follows.

First, the discovery subsystem tries to establish a relationship between the titles of the songs requested by the user and the SMIL description files that represent the requested songs. (Note that different clips that match the request issued by the user may be stored in different Web servers.)

Once this activity is completed, the discovery subsystem passes to the user (via the application gateway) the list of all the clips (and corresponding SMIL files) that match the initial request. Upon receiving this list, the user chooses one of the clips. This choice starts the second phase, an automatic process that downloads the corresponding SMIL file. It is the download manager, now equipped with all the information needed to locate the SMIL file, that downloads it and finally delivers it to the software application running on the UMTS device. Upon receiving the SMIL file, the software application running on the UMTS device examines that file and, following the specified schedule, again calls on the discovery subsystem and the download manager to locate and download all the multimedia resources specified in the SMIL file.

This represents the beginning of the third phase of the discovery/download activity, at the end of which all the multimedia resources are delivered to the UMTS device, which can then perform playback according to the time schedule defined in the SMIL file. During this final phase, it is the responsibility of the discovery subsystem to identify the Web locations of all the required multimedia resources, while the download manager carries out the download activity by engaging all the different replica servers that maintain a copy of the requested multimedia resources.
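The three phases described above can be summarized in a short Python sketch. It is only an illustration under stated assumptions: the class `Discovery`, the function `karaoke_session`, and the callables `fetch`, `parse_smil`, and `choose` are hypothetical names introduced here, and the real subsystems communicate over the network rather than through in-memory calls.

```python
from dataclasses import dataclass, field


@dataclass
class Discovery:
    """Hypothetical in-memory stand-in for the discovery subsystem."""
    clips: dict = field(default_factory=dict)      # clip title -> SMIL file name
    locations: dict = field(default_factory=dict)  # resource name -> replica URLs

    def search(self, query):
        # Match a (partial) title against the clips announced so far.
        q = query.lower()
        return sorted(title for title in self.clips if q in title.lower())

    def locate(self, resource):
        # Return every replica that holds a copy of the resource.
        return self.locations.get(resource, [])


def karaoke_session(discovery, fetch, parse_smil, query, choose):
    """Run the three phases; return the downloaded media keyed by name.

    fetch(urls) stands for the download manager, parse_smil(smil) for the
    client-side SMIL inspection, choose(matches) for the user's selection.
    """
    matches = discovery.search(query)              # phase 1: search
    if not matches:
        return {}
    title = choose(matches)                        # the user picks one clip
    smil_name = discovery.clips[title]
    smil = fetch(discovery.locate(smil_name))      # phase 2: SMIL file
    return {name: fetch(discovery.locate(name))    # phase 3: media resources
            for name in parse_smil(smil)}
```

For the song-on-demand service, the same sketch would skip the second phase and fetch the chosen song directly.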

Needless to say, in order for the system to work correctly, a preliminary phase must be carried out, where each karaoke server announces the list of the clips it can make available for sharing. Each karaoke server that wants to add its own repository to our IS system may do that by running a software application called the data collector, which is in charge of communicating to the discovery subsystem the list of the clips along with the associated multimedia resources.

When the song-on-demand service is used, the system carries out a similar activity, the only difference being that the intermediate phase in which a SMIL file is searched for, downloaded, and interpreted is not needed. Simply put, the mobile user may activate the automatic download of a given song after choosing it from the list submitted to him at the end of the first phase. Figure 24.2 illustrates the progression of the above-mentioned process (the interface is in Italian), which has been developed in our prototype implementation in the Visual C++ programming language on a Microsoft Windows Pocket PC platform.

Figure 24.2: Search and download over the wireless Internet.

24.3.2 Design Principles

In this section, we highlight the technical attributes that were significant for the development of all the software subsystems we have previously introduced.

We have already mentioned that our wireless Internet application exploits a standard TCP/IP stack for carrying out the communications between the application gateway subsystem and the client part of our application. According to the adopted approach, the client part of our application works as any other Internet-connected device, and the end-to-end connection is guaranteed by using the standard TCP protocol. However, to circumvent problems due to the time-varying characteristics of the wireless link, our wireless application incorporates a session layer developed on top of the TCP stack. This additional protocol layer provides stability to the download session, which may suffer from link outages.

With this in view, Figure 24.3 shows the protocol stack we have designed and developed to support all gateway-related communications. In particular (as shown on the left-most side of the figure), the gateway subsystem communicates with the client part of the application over a UMTS link. As seen in the figure, an IP layer based on the Mobile IP (Version 4) protocol is implemented on the UMTS protocol stack. On top of this Mobile IP level, a standard TCP layer has been built. Finally, to circumvent the network problems due to the radio link layer, the application layer built on top of TCP has been split into two sublayers:

A session layer devoted to organizing and managing a download session, which may consist of several subsequent communication patterns, in the face of possible link outages
An application layer in charge of supporting the different connections needed to search and download songs

Figure 24.3: Wireless Internet application protocol stacks.

It is worth noting that our session layer lets users resume a session that was previously interrupted by a temporary link outage. This session management mechanism is of great importance for the success of the download of musical resources onto a UMTS device: very large files (e.g., songs of about 5 MB) must be delivered to the UMTS terminal over a wireless cellular access, which typically exhibits scarce connection stability and unpredictable availability. Stated simply, our session layer works as follows: when the UMTS client application opens a connection to the application gateway, the gateway assigns a unique identifier to this new session. If the gateway detects a network failure (i.e., the TCP connection is closed), the download status is saved on the gateway side. In particular, a pointer (to the last received byte of the musical resource) is saved, along with the session identifier. At the same time, the identifier of the suspended session is saved at the client side of the application. As soon as the mobile client application is able to open a new TCP connection to the gateway, the client application tries to resume the interrupted session by exploiting the session identifier that was previously saved.

As a final note, it is important to remember that the session management mechanism we have developed is suitable for recovering sessions that are interrupted due to temporary link outages, but it is not adequate to recover from system failures occurring at the UMTS terminal or at the application gateway subsystem.
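The resume mechanism can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `Gateway`, the function `download_with_outages`, and the use of `uuid` for session identifiers are assumptions made for illustration, and each loop iteration merely simulates a TCP connection that is cut after a fixed number of bytes.

```python
import uuid


class Gateway:
    """Hypothetical gateway-side session table."""

    def __init__(self):
        # session id -> index of the next byte to send, i.e. one past the
        # last byte the client received before the connection closed.
        self._offsets = {}

    def open_session(self):
        session_id = uuid.uuid4().hex
        self._offsets[session_id] = 0
        return session_id

    def send(self, session_id, resource, max_bytes):
        """Send up to max_bytes, then save where the transfer stopped."""
        start = self._offsets[session_id]
        chunk = resource[start:start + max_bytes]
        self._offsets[session_id] = start + len(chunk)
        return chunk


def download_with_outages(gateway, resource, bytes_per_connection):
    """Download a resource over connections that each fail after a while."""
    if bytes_per_connection <= 0:
        raise ValueError("bytes_per_connection must be positive")
    session_id = gateway.open_session()  # also saved on the client side
    received = bytearray()
    connections = 0
    while len(received) < len(resource):
        # One iteration models one TCP connection; after the simulated
        # outage the client reconnects and presents the saved identifier.
        received += gateway.send(session_id, resource, bytes_per_connection)
        connections += 1
    return bytes(received), connections
```

Because the offset lives only in the gateway's memory and the identifier only in the client's, a crash on either side loses the session, which matches the limitation noted above.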

The download manager is the agent actually responsible for the download process. It is incorporated in our application and built on top of the HTTP protocol. With the aim of maximizing service availability and responsiveness, it makes use of the Web server replica technology ^[16] along with the client-centered load distribution (C²LD) mechanism. ^[17] The C²LD mechanism implements an effective and reliable download strategy that splits the client's requests into several subrequests for fragments of the needed resource. Each of these subrequests is issued concurrently to a different available replica server. The C²LD mechanism is designed to adapt dynamically to state changes both in the network and in the Web servers; in essence, it is able to monitor and select at run-time those replicas with the best download performance and response times. Figure 24.3 shows the download manager protocol stack. As seen from the figure, the download manager has to communicate with each different Web server replica, and therefore forks a separate process for each requested resource. Each process uses a C²LD application protocol to carry out download activities from different Web server replicas. It is worth noting that the use of the C²LD mechanism does not force music providers to organize musical repositories that are all perfect replicas of the same list of musical resources. A musical resource, in fact, may be replicated on only some of the available server replicas of our system.
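The idea of splitting a request into fragments and favoring the faster replicas can be sketched as follows. This is a simplified greedy scheduler written for illustration, not the C²LD algorithm of reference 17: the function names, the fixed fragment size, and the simulated throughput figures are all assumptions. In a real deployment each fragment might, for instance, map to an HTTP byte-range request, but the text does not specify this.

```python
import heapq


def split_into_fragments(size, fragment_size):
    """Return (start, end) byte ranges, end exclusive, covering [0, size)."""
    return [(start, min(start + fragment_size, size))
            for start in range(0, size, fragment_size)]


def schedule(fragments, rates):
    """Give each fragment to the replica that becomes free first.

    rates maps a replica name to a simulated throughput in bytes per second.
    Returns (assignment, finish_time); assignment maps each replica to the
    list of fragments it serves.
    """
    free_at = [(0.0, name) for name in sorted(rates)]
    heapq.heapify(free_at)
    assignment = {name: [] for name in rates}
    finish = 0.0
    for start, end in fragments:
        ready, name = heapq.heappop(free_at)      # earliest available replica
        done = ready + (end - start) / rates[name]
        assignment[name].append((start, end))
        finish = max(finish, done)
        heapq.heappush(free_at, (done, name))
    return assignment, finish


if __name__ == "__main__":
    fragments = split_into_fragments(1000, 100)
    assignment, finish = schedule(fragments, {"replica-a": 100.0, "replica-b": 400.0})
    for name, parts in assignment.items():
        print(name, len(parts), "fragments")
    print("simulated completion time:", finish)
```

Since a slow replica is busy for longer, it receives fewer fragments without any explicit ranking step. A replica that holds no copy of the resource is simply left out of `rates`.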

The software component of our system that stores and indexes relevant information about musical resources is the discovery subsystem. The main responsibility of the discovery subsystem is to perform a sort of name resolution for the musical resources that are requested by clients. In particular, it carries out the following activities:

Accepts users' requests to establish a formal relationship between them and the corresponding musical resources stored in the system
Locates the exact Internet location where a given musical resource is stored throughout the entire system composed of replicated Web servers

To carry out this activity, the discovery subsystem associates with each of the multimedia resources in the system a 32-bit-long identifier (called the checksum). This value is computed on the basis of the file content and other information (such as file name, file creation time, file length, song title, and author). Two different indexes are maintained by the discovery subsystem: one is needed to resolve users' requests and the other is used to locate the corresponding musical resources. ^[18] We have devised a decentralized method for performing the calculation of the checksum. According to this scheme, each host server computes the checksum of its musical files and communicates the results to the discovery subsystem. To minimize both the computational and traffic overheads, each server runs the data collector locally, which lets the provider add or delete the musical resources referenced by the discovery subsystem. In essence, the data collector locally performs the checksum computation and, after having computed the checksum of all the files that a given music provider wants to distribute, opens a TCP connection toward the discovery subsystem. As a final task, the data collector application uploads the computed checksums to the discovery subsystem. It is worth noting that the data collector is implemented as a Java application to enhance software portability, and that it meets standard security constraints, as it can only read from the local file system and cannot execute local write operations.
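A minimal Python sketch of the checksum and of the two indexes is given below. The text does not name the checksum algorithm, so CRC-32 (via `zlib.crc32`) is an assumption, as are the names `checksum` and `DiscoveryIndex`. The sketch also feeds only the content, title, and author into the checksum, so that identical copies on different servers share one identifier; that too is a simplifying assumption.

```python
import zlib


def checksum(content, *metadata):
    """32-bit identifier over file content plus metadata strings.

    CRC-32 is an assumed choice; the text only states the identifier width.
    """
    value = zlib.crc32(content)
    for item in metadata:
        value = zlib.crc32(item.encode("utf-8"), value)
    return value  # an unsigned value below 2**32 in Python 3


class DiscoveryIndex:
    """Hypothetical pair of indexes kept by the discovery subsystem."""

    def __init__(self):
        self.by_query = {}     # checksum -> (title, author): resolves requests
        self.by_location = {}  # checksum -> servers holding the resource

    def register(self, server, title, author, content):
        # In the described system the data collector computes the checksum
        # locally and uploads it; here both steps are collapsed into one call.
        key = checksum(content, title, author)
        self.by_query[key] = (title, author)
        self.by_location.setdefault(key, set()).add(server)
        return key

    def search(self, text):
        # Partial, case-insensitive match on title or author.
        needle = text.lower()
        return [key for key, (title, author) in self.by_query.items()
                if needle in title.lower() or needle in author.lower()]

    def locate(self, key):
        return sorted(self.by_location.get(key, ()))
```

Keeping the two mappings separate means a search never touches server addresses, and a replica can be added or removed without changing what users see in the result list.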

Figure 24.4 shows a screenshot of the data collector application. As seen from the figure, two different kinds of information must be specified: the address of the discovery subsystem (in the form of an IP address and port number) and the complete address of the server where the musical resources are stored (including the data path within the local file system).

Figure 24.4: Screenshot of the data collector application.

24.3.3 Structuring Karaoke Clips

The SMIL mark-up language is an XML-derived technology designed to integrate continuous media into synchronized multimedia presentations. ^[19] SMIL allows one to (1) manage the timing behavior of the presentation, (2) manage the layout of the presentation on the device screen, and (3) associate hyperlinks with media objects. A SMIL-based presentation is designed in two phases: first, the author creates spatial regions that will contain the associated multimedia objects; then those multimedia objects are specified along with the timing schedule of their presentation. A SMIL file contains two main elements: a header (between <head> and </head>) and a body (between <body> and </body>). A SMIL header may specify spatial areas by using the <region> tag. (In a SMIL header, it is also possible to define meta tags that allow one to insert meta-information.) In a SMIL body, it is possible to define which multimedia objects are to be loaded in specific regions. To this end, tags such as <video> for video files, <audio> for audio files, and <text> for text strings are used. The SMIL body is also used to schedule the synchronization of different multimedia objects. Two basic synchronization methods are provided:

Parallel (<par>,</par>): All multimedia objects are executed concurrently in their own regions
Sequential (<seq>,</seq>): With the sequential method, each multimedia object is executed in its own region according to a predefined sequential time schedule

By using the SMIL technology, it is easy to specify a karaoke clip that includes audio, video, and the text that periodically flows following the song melody. An example of a karaoke clip, specified by using SMIL, is presented in Figures 24.5 and 24.6 (the title of the song is "A little respect," by Wheatus). In particular, Figure 24.5 reports a code fragment with the SMIL header. As shown in the figure, two different regions are defined, region1_1 and region1_2. Three meta tags are used to specify title, author of the song, and title of the album that contains the song. Figure 24.6 shows a code fragment representing the body of the SMIL file, in which the following three multimedia objects are executed in parallel:

An audio file (respect.wma)
A video file (respect.wmv), loaded in region1_2
A sequence of textual information flowing on the screen in region1_1 for a limited duration of time, which is specified in seconds by using the dur attribute

    <smil>
      <meta  name="Titolo" content="A little respect" />
      <meta  name="Autore" content="Wheatus" />
      <meta  name="Key1" content="Wheatus"/>
      <head>
        <layout>
          <root-layout/>
          <region  top="76%" left="2
                   height="24%" width="100%"/>
          <region  top="1" left="15"
                   height="75%" width="100%"/>
        </layout>
      </head>

Figure 24.5: Header of a karaoke SMIL file.

    <body>
      <par>
        <video src="/books/2/494/1/html/2/respect.wmv"
               region="region1_2" fit = "slice" repeatCount="6">
        </video>
        <audio src="/books/2/494/1/html/2/respect.wma"></audio>
        <seq>
          <text begin="6s" dur= "7s" region="region1_1">
            I tried to discover a little something to make me
          </text>
          <text dur = "10s" region="region1_1">
            sweeter Oh baby refrain from breaking my heart
          </text>
          …
          <text dur = "4s" region="region1_1">I hear you calling</text>
          <text dur = "17s" region="region1_1">
            Oh baby please give a little respect to me.
          </text>
        </seq>
      </par>
    </body>
    </smil>

Figure 24.6: Body of a karaoke SMIL file.
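To show how a client might inspect such a file, the following Python sketch extracts the media resources to be downloaded and derives the display times of the text lines from a <seq> block. It is an illustration only: `SAMPLE` is a shortened, hypothetical clip modeled on Figure 24.6, the function names are invented here, and only clock values of the form "7s" are handled. The prototype client described in this chapter is written in Visual C++, not Python.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical clip modeled on Figure 24.6.
SAMPLE = """<smil>
  <body>
    <par>
      <video src="respect.wmv" region="region1_2"/>
      <audio src="respect.wma"/>
      <seq>
        <text begin="6s" dur="7s" region="region1_1">first line</text>
        <text dur="10s" region="region1_1">second line</text>
      </seq>
    </par>
  </body>
</smil>"""


def seconds(value):
    """Parse values such as '7s'; a missing attribute counts as zero."""
    return float(value.rstrip("s")) if value else 0.0


def media_sources(smil_text):
    """List the src attribute of every media object, in document order."""
    root = ET.fromstring(smil_text)
    return [el.get("src") for el in root.iter() if el.get("src")]


def text_schedule(smil_text):
    """Return (start, end, text) for each child of the first <seq> block.

    Inside <seq>, a child starts when the previous one ends, delayed by its
    own begin offset if it has one.
    """
    root = ET.fromstring(smil_text)
    seq = root.find(".//seq")
    schedule, clock = [], 0.0
    for el in (seq if seq is not None else []):
        start = clock + seconds(el.get("begin"))
        end = start + seconds(el.get("dur"))
        schedule.append((start, end, (el.text or "").strip()))
        clock = end
    return schedule


if __name__ == "__main__":
    print(media_sources(SAMPLE))
    for start, end, line in text_schedule(SAMPLE):
        print(f"{start:5.1f}s to {end:5.1f}s  {line}")
```

The list returned by `media_sources` corresponds to the resources that the client asks the discovery subsystem and the download manager to locate and fetch in the third phase.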

Summing up, the wireless karaoke service we have provided exploits the SMIL technology, thus allowing users to enjoy:

A search session where karaoke clips may be searched by simply indicating a part of the song title or a part of the author name
A download session during which the SMIL file and, subsequently, the associated multimedia resources are downloaded
A playout session when the SMIL player plays back the multimedia objects according to the time schedule specified in the SMIL file

^[16]Ingham, D., Shrivastava, S.K., and Panzieri, F., Constructing dependable Web services, IEEE Internet Computing, 4 (1), 25–33, 2000.

^[17]Ghini, V., Panzieri, F., and Roccetti, M., Client-centered load distribution: a mechanism for constructing responsive Web services, in Proc. 34th IEEE Hawaii International Conference on System Sciences, Maui, 2001.

^[18]Roccetti, M. et al., The structuring of a wireless internet application for a music-on-demand service on UMTS device, in Proc. ACM Symposium on Applied Computing, ACM Press, Madrid, 2002, 1066–1073.

^[19]W3C Recommendation, Synchronized Multimedia Integration Language (SMIL) 2.0 Specification, http://www.w3.org/TR/smil20/, 2001.