Data Paths in Synchronization | SyncML: Synchronizing and Managing Your Mobile Data



	SyncML®: Synchronizing and Managing Your Mobile Data By Uwe Hansmann, Riku Mettälä, Apratim Purakayastha, Peter Thompson, Phillipe Kahn

	Chapter 12. The SyncML Server

In designing a SyncML Server, it is important to understand the paths along which data flows from Clients to the actual back-end datastores. In some cases, all updates to data pass through the synchronization Server. From the perspective of the Server, the data then has a single path from Clients to datastores, running through the Server itself. This is called single-path synchronization. In other cases, some updates to data do not pass through the synchronization Server. This is called multiple-path synchronization. Both kinds of synchronization occur commonly in real usage scenarios. The synchronization data paths have profound effects on the design of a SyncML Server, especially on the Data Management part of the Server function.

Single-Path Synchronization

Figure 12-4 shows a schematic for single-path synchronization. The individual Clients could be such diverse entities as mobile phones, PDAs, or desktop PCs. The data items being synchronized could be calendar events, email, or music files. Every update, however, must pass through the synchronization Server. This is a common scenario in many Service Provider Servers. Consumers carry mobile devices or use desktop PCs and synchronize through the Server. External applications (e.g., an email Server) also update the email datastore through the synchronization Server. The underlying philosophy of single-path synchronization is that the synchronization Server is the focal point of all operations.

Figure 12-4. Single-path synchronization, where all data updates pass through the synchronization Server.


Since all data updates pass through the synchronization Server in single-path synchronization, the Sync Engine in the Server can store information about observed updates and readily detect all conflicts. In most cases where user intervention is not required, the Sync Engine can also resolve the conflicts readily. The Data Management function in single-path synchronization can simply consist of methods to read and write particular physical datastores, such as files or databases. The generation and management of back-end unique identifiers can also be a function of the Server (perhaps as part of the Sync Engine), as the Server initiates all back-end operations. In this type of synchronization, the datastore usually has no special role except to offer persistence.

Single-path synchronization lends itself well to building scalable Servers. All conflicts are readily detectable by the Server. In addition, since all updates pass through the Server, conflicting updates from concurrent synchronization sessions can readily be detected, and some updates can be discarded by the Server without incurring the cost of datastore operations. In most cases, it is also not necessary for the email and mp3 Servers to send actual data via the synchronization Server. It is often enough to send only update information to the Server, so that conflicts can be detected and new Client updates can be generated if required. As indicated earlier, this type of synchronization is commonly observed in Service Provider scenarios. Service Provider Servers can take advantage of the performance benefits of single-path synchronization to build highly scalable Servers.

Multiple-Path Synchronization

The philosophy behind multiple-path synchronization is that the synchronization Server is not the focal point of all operations. Instead, the data itself is the focal point. Figure 12-5 shows a schematic for multiple-path synchronization. This kind of synchronization is common when existing applications are extended for mobile or disconnected users.

For example, it is commonplace to update business inventory data from a traditional desktop PC. In the mobile age, users carrying mobile devices would like to update the same data. The updates generated by the mobile users pass through the synchronization Server. The legacy interface to the inventory data still remains, as there is no compelling reason to force the legacy applications to pass through the synchronization Server just to fit the single-path model.

Business email falls into the same category. People already access business email (e.g., Lotus Notes) from connected desktops, using email directly from their desktop applications. The mobile users pass through the synchronization Server, but the traditional users have no good reason to do so. It is also not advisable to compel the traditional users to pass through the synchronization Server, as that would sacrifice the traditional user experience in favor of the mobile user experience. Changing large numbers of useful and popular traditional applications also may not be economically feasible.

As these examples show, such scenarios are common in the enterprise domain. To gain acceptance in an enterprise, it is important that the SyncML Server support multiple-path synchronization.

Data Management Issues

The primary difference between single-path and multiple-path synchronization lies in the complexity of the Data Management function. The Data Management function in single-path synchronization is straightforward. Figure 12-6(a) shows the general structure of such Data Management. Since all updates are guaranteed to pass through the Sync Engine (inside the synchronization Server), the Sync Engine itself can maintain an update history. The update history shown in the figure can also be kept in persistent storage, such as a Relational Database (like the data itself). When the Sync Engine receives Client updates, it can consult the update history to detect conflicts. In the simple case, it can resolve all conflicts (possibly by invoking application code on the Server), send reconciled updates to the Client, and make the necessary changes to the physical datastore using the datastore API. In some cases, the Sync Engine may only mark the conflicts and ask the Client (user) to resolve them. The results of the resolution can then be written back to the physical datastore in a later phase of the synchronization session. The datastore API is usually simple. It normally only translates the canonical internal object representation of data items into the representation used by the physical datastore.
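The single-path case described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the design of any particular SyncML Server: the class name `SyncEngine`, the `apply` method, and the use of a per-item version counter as the update history are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SyncEngine:
    """Toy single-path Sync Engine: every update passes through apply()."""

    datastore: dict = field(default_factory=dict)  # stands in for the physical datastore
    history: dict = field(default_factory=dict)    # item id -> version of last accepted update
    version: int = 0                               # Server-wide update counter (assumption)

    def apply(self, item_id, value, base_version):
        """Accept a Client update made against base_version, or report a conflict."""
        last_seen = self.history.get(item_id, 0)
        if last_seen > base_version:
            # The item changed after this Client last synchronized: conflict.
            # Nothing is written, so no datastore operation is incurred.
            return ("conflict", last_seen)
        self.version += 1
        self.history[item_id] = self.version
        self.datastore[item_id] = value            # the "datastore API" write
        return ("ok", self.version)


engine = SyncEngine()
first = engine.apply("event-1", "Lunch 12:00", base_version=0)
second = engine.apply("event-1", "Lunch 13:00", base_version=0)  # stale base version
```

Because the engine sees every update, the history alone is enough to flag the second update as conflicting. How the conflict is then resolved (Server policy, application code, or asking the user) is left out of the sketch.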

Figure 12-6. The different kinds of Data Management required to support single-path and multiple-path synchronization.


Data Management for multiple-path synchronization is more complex. A general structure for such Data Management is shown in Figure 12-6(b). The complexity arises from the fact that the Sync Engine is not aware of all the updates made to the physical datastore. Normally, the Server needs to implement Datastore Adapters specific to the different kinds of back-end datastores, such as Relational Databases, Lotus Notes, or Microsoft Exchange. The Datastore Adapters enable the Sync Engine to request the updates that have been made to the datastore since some previous time. The Sync Engine then detects and resolves conflicts with the Client updates (in the simple case) and sends reconciled updates back to the Client, as well as to the Datastore Adapter. The Datastore Adapter sends the reconciled updates to the back-end store using the datastore API.
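The division of labor between the Sync Engine and a Datastore Adapter might look roughly as follows. The interface (`changes_since`, `write`), the in-memory adapter, the logical clock, and the "back-end wins" conflict policy are all assumptions made for this sketch; a real adapter would wrap a specific back-end datastore.

```python
from abc import ABC, abstractmethod


class DatastoreAdapter(ABC):
    """Hypothetical interface the Sync Engine uses for any back-end."""

    @abstractmethod
    def changes_since(self, timestamp):
        """Return {item_id: value} for items changed after timestamp."""

    @abstractmethod
    def write(self, item_id, value):
        """Send a reconciled update to the back-end store."""


class InMemoryAdapter(DatastoreAdapter):
    """Stand-in back-end with a logical clock instead of wall-clock time."""

    def __init__(self):
        self.items = {}
        self.log = []   # (timestamp, item_id, value)
        self.clock = 0

    def backend_update(self, item_id, value):
        """Simulates an update that bypasses the Sync Engine."""
        self.clock += 1
        self.items[item_id] = value
        self.log.append((self.clock, item_id, value))

    def changes_since(self, timestamp):
        latest = {}
        for ts, item_id, value in self.log:
            if ts > timestamp:
                latest[item_id] = value
        return latest

    def write(self, item_id, value):
        self.backend_update(item_id, value)


def synchronize(adapter, client_updates, last_sync):
    """Detect conflicts; the back-end wins (an assumed, simplistic policy)."""
    backend = adapter.changes_since(last_sync)
    conflicts = sorted(set(backend) & set(client_updates))
    for item_id, value in client_updates.items():
        if item_id not in backend:
            adapter.write(item_id, value)
    return dict(backend), conflicts   # updates for the Client, conflicting ids


adapter = InMemoryAdapter()
adapter.backend_update("sku-1", 10)
last_sync = adapter.clock
adapter.backend_update("sku-2", 5)    # legacy application, after the last sync
to_client, conflicts = synchronize(adapter, {"sku-2": 7, "sku-3": 1}, last_sync)
```

In this simplified version the adapter also logs the writes made by the Sync Engine itself, so a later call to `changes_since` would report them again; a fuller adapter would need to tell its own writes apart from external ones.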

The Datastore Adapters must be able to track changes to the back-end datastores and maintain an update history (possibly in persistent storage), even though certain updates to the physical datastore may flow not through the Datastore Adapter, but directly through the datastore API. Different physical datastores support different mechanisms by which a program (such as the adapter) can track updates made to the datastore. For example, Relational Databases support triggers or database log-analysis software to track updates. Lotus Notes Servers also support similar functions. Clearly, the Datastore Adapter is quite specific to the back-end datastore, and a synchronization Server may have multiple such adapters for various supported back-end stores.
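As a concrete illustration of trigger-based tracking, the sketch below uses SQLite through Python's standard `sqlite3` module. The table names (`inventory`, `update_history`), the trigger names, and the helper `changes_since` are assumptions invented for the example; other Relational Databases offer comparable trigger facilities with their own syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (sku TEXT PRIMARY KEY, quantity INTEGER NOT NULL);

    -- Update history maintained by the database on the adapter's behalf.
    CREATE TABLE update_history (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,
        sku TEXT NOT NULL,
        op  TEXT NOT NULL
    );

    CREATE TRIGGER inventory_insert AFTER INSERT ON inventory
    BEGIN
        INSERT INTO update_history (sku, op) VALUES (NEW.sku, 'insert');
    END;

    CREATE TRIGGER inventory_update AFTER UPDATE ON inventory
    BEGIN
        INSERT INTO update_history (sku, op) VALUES (NEW.sku, 'update');
    END;

    CREATE TRIGGER inventory_delete AFTER DELETE ON inventory
    BEGIN
        INSERT INTO update_history (sku, op) VALUES (OLD.sku, 'delete');
    END;
""")

# A legacy application writes directly, bypassing the synchronization Server.
conn.execute("INSERT INTO inventory VALUES ('widget', 10)")
conn.execute("UPDATE inventory SET quantity = 8 WHERE sku = 'widget'")
conn.commit()


def changes_since(connection, last_seq):
    """Return history rows recorded after sequence number last_seq."""
    rows = connection.execute(
        "SELECT seq, sku, op FROM update_history WHERE seq > ? ORDER BY seq",
        (last_seq,),
    )
    return rows.fetchall()


changes = changes_since(conn, 0)
```

The adapter never saw the two direct writes, yet it can recover them from the history table by remembering the last sequence number it processed.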

Changes to back-end stores may be made concurrently while a synchronization session is in progress. The Datastore Adapter must be able to reconcile the concurrent updates with the updates pertaining to the synchronization session. The Sync Engine typically needs to mark the start and end of a synchronization session with the Datastore Adapter, so that the adapter can track updates made to the back-end store concurrently during synchronization. Some back-ends may allow the physical datastore to be locked against concurrent updates while a synchronization session is in progress. Such restrictions, however, cannot be imposed in general. The Datastore Adapter must often support mechanisms to queue concurrent back-end updates during synchronization and reconcile them later, perhaps with the next round of synchronization.
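One possible shape for the session marking and queueing described above is sketched below. The class `SessionAwareAdapter` and its methods are hypothetical; the sketch assumes that some change-tracking hook calls `on_backend_change` whenever the back-end store is updated directly.

```python
class SessionAwareAdapter:
    """Defers back-end changes that arrive while a session is in progress."""

    def __init__(self):
        self.visible = []    # changes the next session will see
        self.deferred = []   # changes that arrived during the current session
        self.in_session = False

    def on_backend_change(self, item_id, value):
        """Invoked by the (assumed) change-tracking hook of the back-end."""
        target = self.deferred if self.in_session else self.visible
        target.append((item_id, value))

    def begin_session(self):
        """Sync Engine marks the start; returns a stable snapshot of changes."""
        self.in_session = True
        return list(self.visible)

    def end_session(self):
        """Sync Engine marks the end; queued changes wait for the next round."""
        self.in_session = False
        self.visible = self.deferred
        self.deferred = []


adapter = SessionAwareAdapter()
adapter.on_backend_change("a", 1)
snapshot = adapter.begin_session()
adapter.on_backend_change("b", 2)        # concurrent update during the session
adapter.end_session()
next_snapshot = adapter.begin_session()  # next round of synchronization
```

The session works against a fixed snapshot, and the concurrent update to `"b"` is reconciled in the following round rather than being lost or blocking the back-end.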

Implementing sophisticated Datastore Adapters for multiple-path synchronization is complex. The adapters could in fact account for the bulk of the work involved in implementing the synchronization Server. For high performance, the adapters often cache updates, and the cache must be kept synchronized with the back-end datastore. Certain adapters are even more complicated because of the semantics of the data. For example, certain write operations to Relational Databases may fail because they violate the integrity constraints of the data (see constraint violation in Chapter 1). The adapter and the associated Sync Engine must have the means to handle such errors even after the Sync Engine has processed conflicts.
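The constraint-violation case can be illustrated with SQLite and Python's standard `sqlite3` module. The `inventory` table, its `CHECK` constraint, and the helper `write_reconciled` are assumptions made for this sketch; it only shows that a write may still be rejected after conflicts have been resolved, and that the adapter needs to report this back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory ("
    " sku TEXT PRIMARY KEY,"
    " quantity INTEGER NOT NULL CHECK (quantity >= 0))"
)


def write_reconciled(connection, updates):
    """Write reconciled updates one by one; collect those the back-end rejects."""
    applied, rejected = [], []
    for sku, quantity in updates:
        try:
            # The connection context manager commits on success
            # and rolls back if the statement raises.
            with connection:
                connection.execute(
                    "INSERT INTO inventory (sku, quantity) VALUES (?, ?)",
                    (sku, quantity),
                )
            applied.append(sku)
        except sqlite3.IntegrityError as exc:
            # The update passed conflict resolution but violates a constraint.
            rejected.append((sku, str(exc)))
    return applied, rejected


applied, rejected = write_reconciled(conn, [("widget", 5), ("gadget", -1)])
stored = conn.execute("SELECT sku, quantity FROM inventory").fetchall()
```

The rejected list would then have to flow back to the Sync Engine, which must decide what to tell the Client about an update it had already accepted.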

