The Design Goals of SyncML



	SyncML®: Synchronizing and Managing Your Mobile Data By Uwe Hansmann, Riku Mettälä, Apratim Purakayastha, Peter Thompson, Phillipe Kahn

	Chapter 4. SyncML Fundamentals

The design space of SyncML is large but not intractable. Figure 4-1 outlines this design space. In reality, it has more dimensions than the four shown here, but these four are deemed the most important because they have considerable influence on the design decisions. They are device, network, data, and synchronization topology. The device dimension is ordered by increasing resource richness and capability. On the lower end, there are devices such as cellular phones and PDAs; on the higher end, there are personal computers and server-class devices. The network dimension is ordered by increasing bandwidth and decreasing latency. On the lower end we have wide-area wireless networks, such as cellular networks, and local low-power wireless networks, such as Bluetooth™ [MB01].^[1] On the higher end we have wireless local-area networks, such as IEEE 802.11 [WLAN02], and regular wireline networks, such as Ethernet. The data dimension includes widely adopted PIM data on the lower end, relational data in the middle, and application-specific data on the higher end. As outlined in Chapter 1, data synchronization may occur among various entities conforming to certain synchronization topologies. The topologies supported have fundamental implications for the design of a synchronization protocol. The topology dimension includes one-to-one synchronization on the lower end and many-to-many synchronization on the higher end.

^[1] Some 3G networks can have higher bandwidth than Bluetooth.

Figure 4-1. The primary design dimensions of SyncML and its overall design space. The space enclosed in the pyramid is the key design thrust of SyncML.


The pyramid in Figure 4-1 identifies the design thrust of SyncML. In summary, SyncML is optimized for mobile devices such as mobile phones and PDAs; SyncML accommodates the characteristics of low-bandwidth, low-reliability, and high-latency wireless networks; SyncML attempts to efficiently support common data types, including PIM data and relational data; and SyncML efficiently supports one-to-one and many-to-one synchronization topologies. It is important to note that the design thrust only indicates what SyncML is intended to support best. SyncML neither rules out nor handicaps operations outside its design thrust. For example, one can gainfully use SyncML to synchronize two servers, but the specification does not take any special steps to make such synchronization efficient. Similarly, one can use SyncML to perform synchronization over a local area network (LAN) or perform synchronization of non-standard or application-specific data. Below we discuss a few specific design goals of SyncML.

Effectiveness over Wireless Networks

Numerous target SyncML devices, such as mobile phones and PDAs, are usually connected wirelessly. Mobile phones are connected via various 2G and 3G wide-area wireless networks. Some PDAs also connect using these networks. In addition, some PDAs and mobile phones will begin to connect using local area wireless technology such as Bluetooth™ and IEEE 802.11 as these technologies become more prevalent. It is therefore immensely important that SyncML be designed to be effective over wireless networks. Although there is a large set of wireless networks used by mobile devices, the following characteristics of wireless networks are generally true when compared with regular wireline networks, such as Ethernet or T1-based wide area networks:

Limited network bandwidth
High network latency
Low reliability
High communication costs

SyncML attempts to address all the above issues in the design of its components. The following outlines some pertinent design approaches.

Judicious use of bandwidth

XML [HM01] can be verbose. The Wireless Application Protocol (WAP) [ACS+00] forum has defined a standard for binary representation of XML documents called WBXML [WBXML01]. SyncML allows the use of WBXML to transmit messages. Binary encoding reduces overall transmission requirements.

At the cost of complexities in mobile Clients, SyncML encourages Clients to maintain some kind of change log to account for changes made to local datastores. Thus in general, synchronization between a Client and a Server is incremental, involving changes since the last synchronization. This reduces bandwidth requirements tremendously compared to complete synchronization of all datastore entries in the Client. Such complete synchronization is used in SyncML only to handle exceptions such as first-time synchronization or resynchronization after a failure.^[2] Although common SyncML applications may exchange complete data items, SyncML allows the communication of changes to data items instead of complete data items, thereby preserving bandwidth. One can envision the emergence of standard data formats, such as for Relational Data, allowing specification of only the changed columns in a database row. As such formats emerge, SyncML can leverage them to reduce the overall bandwidth requirements of synchronization.

^[2] This is called "slow synchronization." See Chapter 5.

Combating network latency

Wireless networks usually have high network latency. In an environment with high latency, one must avoid a chatty protocol at all costs. In a chatty protocol, individual data items and/or operations are typically communicated separately. In contrast, SyncML allows the batching of data items and operations in one message. Batch transmission of data and operations masks network latency to a large extent, as the processing of a batch of items can continue while the next batch is in transmission.
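As a rough illustration of why batching matters, the following sketch estimates total transfer time under a deliberately simple model in which every message costs one network round trip. The function name, the 1.5-second round trip, and the per-item cost are illustrative assumptions, not measurements or values taken from the SyncML specification. The model also ignores the overlap of processing and transmission mentioned above, so it understates the benefit of batching.

```python
def transfer_time(items, batch_size, round_trip_s, per_item_s):
    """Estimate seconds needed to send `items` data items.

    Assumed model: each message pays one round trip of latency,
    and each item adds a fixed transmission cost.
    """
    messages = -(-items // batch_size)  # ceiling division
    return messages * round_trip_s + items * per_item_s


# Chatty protocol: one item per message.
chatty = transfer_time(items=200, batch_size=1, round_trip_s=1.5, per_item_s=0.01)

# Batched protocol: fifty items per message.
batched = transfer_time(items=200, batch_size=50, round_trip_s=1.5, per_item_s=0.01)

print(f"chatty: {chatty:.1f} s, batched: {batched:.1f} s")
```

Under these assumed numbers the chatty exchange pays 200 round trips and the batched exchange pays only 4, so latency dominates the first and nearly disappears from the second.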

Addressing low reliability

While batching is beneficial in combating latency, a large batch or Message may not make it across a wireless network in its entirety. This is because wireless networks are relatively unreliable, and it is quite likely that errors or failures may occur during the transmission of a large package. Although this problem may apparently be deemed a low-level network layer problem that can be addressed by packet protocols, TCP/IP (and hence HTTP), one of the common "reliable" protocols that applications use, is unfortunately not especially suited for wireless networks, as it does not allow for incremental progress of transmission. TCP/IP may transfer 699 bytes of a 700-byte message 100 times and still declare overall failure of message delivery. It lacks the ability to "pick up the transmission where it left off" (incremental progress) and just send the remaining one byte in the second round after sending 699 bytes successfully in the first round.

Another common transport, Wireless Session Protocol (WSP) [WSP01], uses WAP gateways that can limit the maximum size of messages. A WAP gateway will not allow a large SyncML Message to pass through. In light of these factors, SyncML allows partitioning of a logical package into smaller physical messages. In such a situation, SyncML implementations become more complex, as they now need to support package assembly and disassembly. Overall, it is a good tradeoff, however, as multiple messages allow SyncML to combat lower reliability and other configuration limitations (e.g., WAP) of wireless networks.
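The partitioning of a logical package into smaller physical messages can be sketched as follows. This is a minimal illustration, assuming that a package is a list of already-serialized commands and that the size limit applies to the sum of their lengths; the function names are hypothetical, and real SyncML messages also carry headers and status information that this sketch omits.

```python
def split_package(commands, max_msg_size):
    """Greedily pack serialized commands (bytes) into messages.

    No message exceeds max_msg_size bytes in total. A single command
    larger than the limit cannot be sent in this simplified model.
    """
    messages, current, size = [], [], 0
    for cmd in commands:
        if len(cmd) > max_msg_size:
            raise ValueError("single command exceeds the maximum message size")
        if current and size + len(cmd) > max_msg_size:
            messages.append(current)  # current message is full; start a new one
            current, size = [], 0
        current.append(cmd)
        size += len(cmd)
    if current:
        messages.append(current)
    return messages


def assemble_package(messages):
    """Reassemble the logical package from its messages, in order."""
    return [cmd for msg in messages for cmd in msg]


package = [b"a" * 300, b"b" * 300, b"c" * 500, b"d" * 100]
parts = split_package(package, max_msg_size=700)
assert assemble_package(parts) == package
```

If one message is lost, only that message has to be resent rather than the whole package, which is the tradeoff described above.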

Reducing communication costs

The cost of communication is commonly assessed by the amount of time and/or the amount of data communicated. The combination of WBXML compression, to reduce the number of bytes communicated, and batching, to reduce the overall latency of communication, is aimed at substantially reducing the overall communication costs.

Support Transport Heterogeneity

Not only are there various kinds of physical networks, such as wired and wireless, that differ in characteristics, there are also various kinds of transport protocols that are used over these networks: HTTP, WSP, and Simple Mail Transfer Protocol (SMTP) to name just a few. The SyncML Synchronization Protocol is at a higher semantic level than any of these transport protocols. It is therefore imperative that SyncML be realizable over many transport protocols. From the SyncML perspective, which is that of an "application" on top of the various transport protocols, the protocols offer the following different classes of behaviors:

Persistent connection
Synchronous request-response
Asynchronous request-response

Examples of protocols that offer a persistent connection are TCP/IP and OBEX [OBEX99]. TCP/IP is one of the most commonly used protocols. The OBEX protocol is used for data exchange by the infrared (IrDA®) and Bluetooth communication technologies. In these protocols, the two communicating parties actually maintain a long-lived connection over which all communication occurs during synchronization. Therefore, the communication has state. This implies that the communicating parties can exchange multiple messages pertaining to a synchronization session efficiently over the same connection. All the associated context of the synchronization, such as authentication, is implicitly attached to the existing connection. In addition, these protocols are symmetric such that any communicating party can initiate synchronization.

HTTP and WSP are popular synchronous request-response protocols. Although HTTP is implemented on TCP/IP, it offers a request-response abstraction to applications. In these protocols, a "client" sends a "request" to a "server" and the server returns a "reply" synchronously. This request-response paradigm works well for a model where clients and servers are substantially asymmetric and clients generally consume information that servers generate. In this model, usually the client initiates synchronization. Specific protocols like WSP, however, support "push," whereby a server can send an alert to a client, indicating that the client should begin synchronization.

In addition to the above, there are asynchronous request-response protocols. Although these protocols are also fundamentally request-response in nature, the response is not sent as an in-band reply to a request. The response is sent out-of-band after an undetermined time period. Popular email protocols such as SMTP, POP3 (Post Office Protocol), and IMAP (Internet Message Access Protocol) are of this nature.

SyncML is intended to run over all the above diverse protocols. The Synchronization Protocol is designed such that it is simply a specification for a series of packages (realized as physical messages) exchanged between a Client and a Server in some order. As discussed above, SyncML allows a package to be communicated as several transport-layer messages. SyncML, however, does not depend on the underlying transport protocol to support any ordering of these messages but enforces ordering itself by sequencing messages. The sending party must receive implicit or explicit acknowledgment for one message of a multiple-message package before it is allowed to send the next message for the same package. The message exchange sequence of SyncML can be easily and efficiently mapped to common synchronous request-response protocols such as HTTP or WSP.
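The acknowledge-before-sending-the-next rule described above can be sketched as a simple stop-and-wait exchange. The class and function names, the integer `msg_id`, and the retry limit are assumptions made for illustration; they are not the identifiers or the exact mechanism defined by the SyncML specification.

```python
class Receiver:
    """Accepts messages strictly in sequence, whatever the transport does."""

    def __init__(self):
        self.next_expected = 1
        self.payloads = []

    def receive(self, msg_id, payload):
        """Store the payload if it is next in sequence.

        Returns the highest in-order message number received so far,
        which serves as the acknowledgment.
        """
        if msg_id == self.next_expected:
            self.payloads.append(payload)
            self.next_expected += 1
        return self.next_expected - 1


def send_package(payloads, receiver, max_attempts=3):
    """Send each message only after the previous one is acknowledged."""
    for msg_id, payload in enumerate(payloads, start=1):
        for _ in range(max_attempts):
            if receiver.receive(msg_id, payload) == msg_id:
                break  # acknowledged; move on to the next message
        else:
            raise RuntimeError(f"message {msg_id} was never acknowledged")


receiver = Receiver()
send_package(["part-1", "part-2", "part-3"], receiver)
assert receiver.payloads == ["part-1", "part-2", "part-3"]
```

Over a synchronous request-response transport, the response to each request can carry the acknowledgment, which is why this pattern maps naturally onto HTTP or WSP.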

The Synchronization Protocol can also be mapped to asynchronous request-response protocols such as SMTP; however, an implementation will require the client and the server to do frequent polling (analogous to repeatedly checking email) and may not be so efficient.^[3] Therefore, although SyncML supports SMTP-like protocols, that is not within the recommended "operating range" of the design.

^[3] To avoid polling, notification of message arrival can be achieved by using orthogonal means, such as ISDN D-channel. This, however, requires additional capabilities and introduces more complexity.

Protocols such as TCP/IP and OBEX can support SyncML message sequencing in a straightforward manner. Actually, over a persistent connection, message sequencing is not necessary, as the protocol itself will preserve ordering over the same connection. Requiring SyncML message sequencing over these protocols is a slightly suboptimal design. SyncML designers knowingly made this tradeoff, as most emerging popular Internet protocols are of the synchronous request-response type and message sequencing is a necessity if SyncML is to operate over such protocols.

The mechanics of using various transport protocols in SyncML are straightforward, as discussed in Chapter 7. For each transport protocol, SyncML requires the specification of a binding that determines how a SyncML package is realized as one or more physical messages over that transport. SyncML defines bindings for popular protocols, such as HTTP, WSP, and OBEX. Other bindings can be defined and used as deemed appropriate.

Support a Rich Set of Networked Data

Since SyncML is aimed at enabling data synchronization for a diverse set of mobile applications, it is natural that it be able to support a diverse set of networked data. In particular, SyncML should support synchronization of the following kinds of data related to popular applications:

Data related to personal information applications, such as contact, calendar, and to-do list information
Data related to collaborative applications, such as email and network news
Data related to Relational Database applications
XML and HTML documents
Any other data represented as a MIME (Multipurpose Internet Mail Extensions) [RFC2045] type

Consider a scenario where an application vendor provides the Server-side application, the Client-side application, and synchronization between these applications. An example of such a scenario is the set of Palm® personal information management applications. In such situations, the format of the Client-side data, the Server-side data, and that of the data exchanged during synchronization can be entirely private. The application vendor can completely control the data format. The data format can be optimized for storage on the Client and the Server, as well as for transport over the communication link. Now consider a scenario where Client and Server applications are synchronized using software provided by an independent synchronization vendor. The Client and the Server application vendors (sometimes the same entity) have to provide their data format information to the synchronization vendor. The synchronization vendor still reserves the right to transmit the data in a proprietary format during synchronization, as it controls both ends of the synchronization. This approach does not work in SyncML. For truly interoperable synchronization, it is imperative that the data exchange format between the Server and the Client be open and standards-based. The entity that writes the Client synchronization agent may be different from the entity that writes the Server synchronization agent. The only way they can interoperably exchange data is by using standardized formats.

Unfortunately, open standards for data formats, or at least widely adopted formats, do not exist for all the data that may be potentially synchronized using SyncML. Standards for common data formats, such as contact, calendar, and to-do list information, exist and SyncML simply adopts those standard formats. An interoperable SyncML implementation must use the prescribed data formats. For example, the vCalendar [VCAL] or iCalendar [RFC2445] data format should be used for the calendar applications. Standards for Relational Data do not exist but are emerging. Standards for other XML data in general do not exist but are also emerging. The growing interest in XML Schemas will likely act as a catalyst in the standardization of various kinds of data. SyncML adopts an approach whereby data standards are incorporated in SyncML as they emerge.

While SyncML recommends that standard data formats be used for certain applications and enforces this requirement for SyncML Servers in the interoperability testing process (Chapter 13), it does not preclude the use of proprietary data formats by applications. It is important for applications to have the choice of using proprietary data formats. Some applications, for example, may need to synchronize arbitrary binary data. Such applications should not be prevented from using SyncML. Implementations that use nonstandard data formats, however, are not certified as interoperable.

It is important to note that the requirement of using standard data formats for synchronization does not mean that the Client and Server applications are forced to use the standard data formats internally within the applications. The calendar application in a mobile phone likely stores calendar data in an efficient binary format. A Server calendar application can also choose to store data in its own efficient format. During synchronization, however, the respective synchronization agents must transform these formats into the standard format and vice versa.

Neutrality to Programming Environments

A de facto standard for mobile data synchronization has to operate over a variety of programming environments. A programming environment consists of the programming language, implied system resources, such as files or datastores, and processing capabilities, such as single or multi-threaded. The programming environment also consists of the networking environment, which is covered above and hence ignored here. In a mobile phone, the available programming language could be the C language, there may be no file system or datastore abstractions but proprietary native ways of storing data, and there may be no facilities for multithreading. In a PDA, the available language could be C++, there may be a file system available, and multithreading may be supported (e.g., PocketPC-based PDAs). In a network server, languages such as C, C++, and Java™ may all be available, both file system and datastore abstractions may be available for storage, and there may be rich support for multithreading.

SyncML does not make any assumptions about the programming language supported by a particular platform. It is based on exchanging well-specified, structured XML messages and not on any particular programming environment. The specification only determines the format of the information that is exchanged and the sequence of information exchange. The information exchanged (SyncML packages) can be generated in any way deemed appropriate by a programmer. The reference implementation that accompanies the SyncML specifications is based upon a published C API, but the API per se is not part of the standard.

The issue of neutrality is deeper than just the issue of programming language. By virtue of simply being XML-based, a certain amount of neutrality is achieved. For example, in a network Server, a SyncML message can be processed using a Document Object Model (DOM) [DOM02] tree in parallel with multiple threads traversing or constructing different parts of the tree. In a mobile client where no multithreading support is available, the same message can be serially processed using the Simple API for XML (SAX) [SAX02], which is event-driven and does not require building an in-memory parse tree.
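The contrast between the two parsing styles can be sketched with Python's standard library, which provides both a SAX and a DOM implementation. The message below is a simplified, SyncML-like fragment invented for illustration; it is not a complete or valid SyncML document, and the helper names are hypothetical.

```python
import xml.sax
from xml.dom import minidom

# Simplified, SyncML-like fragment (an assumption, not a valid SyncML message).
MESSAGE = (
    b"<SyncML><SyncBody>"
    b"<Add><Data>entry-1</Data></Add>"
    b"<Replace><Data>entry-2</Data></Replace>"
    b"</SyncBody></SyncML>"
)


class DataCollector(xml.sax.ContentHandler):
    """Collects the text of each Data element as parse events arrive."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._buffer = None  # None while outside a Data element

    def startElement(self, name, attrs):
        if name == "Data":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "Data":
            self.items.append("".join(self._buffer))
            self._buffer = None


def data_via_sax(document):
    """Serial, event-driven processing: no tree is kept in memory."""
    handler = DataCollector()
    xml.sax.parseString(document, handler)
    return handler.items


def data_via_dom(document):
    """Tree-based processing: the whole document is held in memory."""
    tree = minidom.parseString(document)
    return [node.firstChild.data for node in tree.getElementsByTagName("Data")]


assert data_via_sax(MESSAGE) == data_via_dom(MESSAGE)
```

Both functions extract the same data; the SAX version only ever holds the element currently being read, which is the property that makes it attractive on a memory-constrained client.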

SyncML is also neutral to available platform storage abstractions. Object-based storage abstractions such as Object Databases are actually well suited to store, retrieve, and directly manipulate SyncML (XML) documents. Such storage abstractions allow the programmer to directly store the in-memory representation of an XML document. Neither common Relational Databases nor file systems offer object-based abstractions for storing XML documents. In most cases, however, the SyncML document is transiently generated and processed in memory during synchronization in such a way that the document itself is not stored persistently. The data within the document, such as a calendar entry, is of course processed and then stored using the optimized internal representation suited to a particular platform.

Support Multiple Synchronization Topologies

There are three kinds of synchronization topologies:^[4] one-to-one, many-to-one, and many-to-many. In the one-to-one data synchronization model, a particular client only synchronizes with a particular server. In the many-to-one data synchronization model, two or more clients synchronize with a single server. In the many-to-many data synchronization model, a group of computers freely synchronize with each other directly. In many-to-many synchronization, there is no notion of a primary server or datastore.

^[4] See Chapter 1 for a more complete discussion of synchronization topologies with usage examples.

The one-to-one interaction model and the many-to-one interaction model are abundantly more common in the context of day-to-day commercial applications. Moreover, the many-to-many model can be indirectly (but inefficiently) achieved using the many-to-one model. This is done by designating one device as a Server and stipulating that the other devices synchronize with the Server and hence indirectly synchronize with each other via the Server. Implementing the many-to-one interaction model (which includes the one-to-one model) is conceptually simpler, and the resulting implementations are orders of magnitude simpler than ones that support the many-to-many interaction model. In many-to-many synchronization, complex data structures such as "version vectors" need to be associated with data items to correctly synchronize data. Maintaining consistency of data identifiers in many-to-many synchronization is also complex without forcing all parties to store identifiers in the same format. The many-to-many model is also especially intractable for the purposes of accounting and failure recovery.
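To give a sense of the extra machinery that many-to-many synchronization needs, here is a minimal sketch of version vector comparison. It assumes a version vector is a mapping from a replica name to a counter of updates made at that replica; the function names and the replica names are hypothetical, and SyncML itself does not define this data structure.

```python
def compare_versions(a, b):
    """Compare two version vectors (dicts of replica name -> counter).

    Returns "equal", "a_newer", "b_newer", or "conflict" when each side
    has seen an update that the other has not.
    """
    replicas = set(a) | set(b)
    a_ahead = any(a.get(r, 0) > b.get(r, 0) for r in replicas)
    b_ahead = any(b.get(r, 0) > a.get(r, 0) for r in replicas)
    if a_ahead and b_ahead:
        return "conflict"
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


def merge_versions(a, b):
    """Element-wise maximum, used after a conflict has been resolved."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in set(a) | set(b)}


phone = {"phone": 2, "laptop": 1}
laptop = {"phone": 1, "laptop": 2}
assert compare_versions(phone, laptop) == "conflict"
```

Every data item carries one counter per participating replica, which hints at why this approach costs more storage and bookkeeping than the single anchor used in the many-to-one case.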

For the above reasons, SyncML is optimized for the many-to-one synchronization topology. It allows the exchange of datastore sync anchors^[5] in the beginning of a synchronization, which indicates the last "timestamp" at which the two computers synchronized. The timestamp could be an actual time value or a logical counter. Based on the exchanged sync anchor values, the associated sync engines could use simple data structures such as change logs (see section below) to determine which data items have changed since the last instance of synchronization.

^[5] See Chapter 5 for a more comprehensive discussion of sync anchors and their usage.
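One way to picture the use of sync anchors is the sketch below. It assumes that the anchor is a logical counter, that the Server remembers the anchor from the previous session, and that a mismatch forces a complete ("slow") synchronization; the function names and the tuple layout of the change log are hypothetical. Chapter 5 describes the actual anchor exchange.

```python
def choose_sync_type(stored_anchor, client_last_anchor):
    """Decide between incremental and slow synchronization.

    stored_anchor: the anchor this side saved after the previous session,
                   or None if the two parties have never synchronized.
    client_last_anchor: the anchor the other side reports for that session.
    """
    if stored_anchor is not None and stored_anchor == client_last_anchor:
        return "incremental"
    return "slow"  # first-time sync or the two sides disagree


def changes_since(change_log, anchor):
    """Return item ids changed after the anchor.

    change_log is assumed to be a list of (counter, item_id) pairs.
    """
    return [item_id for counter, item_id in change_log if counter > anchor]


log = [(1, "contact-7"), (2, "contact-9"), (3, "contact-7")]
assert choose_sync_type(stored_anchor=2, client_last_anchor=2) == "incremental"
assert changes_since(log, anchor=2) == ["contact-7"]
```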

SyncML, however, allows many-to-many synchronization. It allows each data item to have an associated version, which could actually be a version vector required for many-to-many synchronization. It also does not specify the format of the sync anchor explicitly, so the sync anchor could also be a version vector. Furthermore, it allows a SyncML device to play the dual roles of a Server and a Client. The above allowances imply that SyncML can be used for most advanced forms of data synchronization, such as many-to-many synchronization, but is optimized for the more common case of many-to-one synchronization.

Address the Resource Limitations of a Mobile Device

A typical mobile device, such as a cellular phone or a PDA, is resource-limited in many ways. The key limitations include the following constraints:

Limited memory
Limited processing capabilities
Limited battery power
Limited communication capabilities

The constraints on communication arise from the fact that these devices often use wireless networks. The design considerations associated with wireless networks are discussed above and thus are ignored in this section.

Addressing memory limitations

The SyncML specification is extremely sensitive to the implied memory requirements. The static footprint, or overall code size, of the SyncML implementation on a mobile Client device should be low, on the order of tens of kilobytes. SyncML does not require devices to validate received packages against the SyncML DTD. Checking syntactic correctness is deemed enough. This allows devices to use simple parsers instead of the more complex parsers used to validate XML, which are usually of much larger code size. Although SyncML allows a device to play the dual roles of Client and Server, it normally expects that devices will play only single roles. SyncML assumes that mobile devices will most often play the Client role. In the Client role, many features are made optional that would have to be supported in a Server role. By consciously allowing devices to announce their roles and by having asymmetric requirements for those roles, SyncML allows for focused, lean implementations on mobile Clients.

The dynamic footprint is the overall memory required by a program when the program executes. SyncML is designed to reduce the overall dynamic footprint required during the process of synchronization. It allows Client data identifiers to be smaller than those in a Server. It consciously burdens Server implementations with the task of identifier mapping (see Chapter 5) between various Client identifier formats and the Server identifier format. This enables a Client to use very compact, optimized data identifiers, thereby reducing dynamic memory requirements. Clients can also use SAX parsers for parsing SyncML documents instead of DOM parsers, as SAX parsers require considerably less execution-time memory than DOM parsers. The amount of communication buffer space required during synchronization is a key aspect of dynamic footprint requirements. By allowing packages to be broken into multiple smaller messages, SyncML reduces the communication buffer space required at any one time. SyncML also allows a Client to specify the largest message size that it can process and thereby allows the Client to adapt to its current available memory.
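The Server-side identifier mapping mentioned above can be sketched as a pair of lookup tables. The class name, method names, and identifier formats are assumptions for illustration; Chapter 5 covers how SyncML actually conveys mappings.

```python
class ServerIdMap:
    """Maps each Client's compact identifiers to Server identifiers.

    The table lives on the Server, so the Client never has to store
    or understand the (typically longer) Server identifier.
    """

    def __init__(self):
        self._to_server = {}  # (client, client_id) -> server_id
        self._to_client = {}  # (client, server_id) -> client_id

    def bind(self, client, client_id, server_id):
        self._to_server[(client, client_id)] = server_id
        self._to_client[(client, server_id)] = client_id

    def server_id(self, client, client_id):
        return self._to_server[(client, client_id)]

    def client_id(self, client, server_id):
        return self._to_client[(client, server_id)]


ids = ServerIdMap()
# Two Clients use different short ids for the same Server-side item.
ids.bind("phone", "12", "contacts/000000000000457a")
ids.bind("pda", "3", "contacts/000000000000457a")
assert ids.server_id("phone", "12") == ids.server_id("pda", "3")
```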

The memory overhead associated with data synchronization includes maintaining a change log. The change log is a logical name for information that a synchronization engine or an application must maintain corresponding to each datastore. The role of the change log is to record which items have changed in a datastore between successive synchronizations. During synchronization, it is expected that the change log be consulted to determine what pertinent changes must be communicated. The actual implementation of a change log may take various forms. It can actually be a physical log of operations that have been made to items in a datastore. Each log entry may contain the type of operation and a timestamp. The timestamp could be an actual time or some logical counter maintained by the synchronization framework. In cases where a datastore in a client device is associated only with one datastore on one server, the change log can also take the form of a change bit associated with every data item. The change bit approach for mobile devices may conserve device storage in some cases where the device is synchronized relatively infrequently and the log-based change log can potentially grow substantially between synchronizations. However, for a client datastore that is synchronized with multiple datastores, a simple change bit does not suffice. Change bits have to be maintained for every synchronization partner for every data item. The storage requirements explode and become infeasible for many mobile devices. For many common synchronization usages, it often makes sense to use an actual log-based change log. In a log-based approach, however, the change log grows continuously and must be pruned from time to time by deleting entries corresponding to changes that have been communicated to all synchronizing partners.
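A log-based change log with pruning, as described above, might look like the following sketch. It assumes a logical counter as the timestamp and a fixed set of synchronization partners known in advance; the class and method names are hypothetical.

```python
class ChangeLog:
    """Log-based change log shared by several synchronization partners."""

    def __init__(self, partners):
        self._counter = 0  # logical timestamp
        self._entries = []  # (counter, item_id, operation)
        self._acked = {p: 0 for p in partners}  # last counter sent to each partner

    def record(self, item_id, operation):
        """Append one local change to the log."""
        self._counter += 1
        self._entries.append((self._counter, item_id, operation))

    def pending_for(self, partner):
        """Entries this partner has not yet received."""
        acked = self._acked[partner]
        return [e for e in self._entries if e[0] > acked]

    def mark_synced(self, partner):
        """Note that the partner has received everything logged so far."""
        self._acked[partner] = self._counter

    def prune(self):
        """Delete entries already communicated to all partners.

        Returns the number of entries that remain.
        """
        floor = min(self._acked.values(), default=self._counter)
        self._entries = [e for e in self._entries if e[0] > floor]
        return len(self._entries)


log = ChangeLog(["server_a", "server_b"])
log.record("c1", "add")
log.record("c2", "replace")
log.mark_synced("server_a")
log.record("c1", "delete")
```

The log needs only one counter per partner, whereas the change bit approach needs one bit per partner for every data item, which is the storage growth the text warns about.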

Addressing limitations of processing capabilities

SyncML is sensitive to the limitations of the processing capabilities of typical mobile devices. SyncML encourages the Client to be simple and encourages that detection and resolution of conflicts and interpretation of application data be done on the Server. Also, the ability to opt for a simpler, nonvalidating parser is of substantial assistance to such devices. Activities related to security also tend to be processor-intensive. SyncML allows Clients to use simple password authentication schemes, which do not require much Client-side processing. Higher-end Clients can still use encrypted message digests for authentication. In its current version, SyncML does not mandate any data encryption requirements,^[6] as data encryption and decryption tend to consume substantial processing power.

^[6] Data encryption, although not mandated, is allowed. For example, with the HTTP binding, it is possible to use SSL (Secure Sockets Layer) or TLS (Transport Layer Security).

Addressing battery power limitations

Battery power is a precious resource for many mobile devices. By addressing processing capability restrictions, SyncML partially addresses battery power restrictions as well. Another battery-saving feature of SyncML is server-alerted synchronization. Common usage examples of synchronization often involve mobile clients obtaining data updates from a server. For example, field insurance agents may want to obtain the latest rate quotes, or mobile health-care professionals may want to obtain the latest test results on a patient that they are about to visit. If the client always initiates synchronization, it has to poll the server at certain intervals to get the latest server updates, not knowing if there are any pertinent updates. This wastes battery power (along with the user's time and money). Therefore, for certain clients, such as mobile phones, SyncML allows a Server to alert a Client to begin synchronization. Such an alert function may be implemented on the underlying "push" functionality of a transport protocol, when available (e.g., WSP). Allowing server-alerted synchronization potentially conserves the battery power of a mobile device, in contrast to repeated polling. SyncML also does not require that the device or a particular datastore be "locked-out" during synchronization. It allows reads and updates to continue as synchronization progresses. By allowing the user to perform productive work while synchronization continues, it optimizes overall usage of battery power, as well as preserving the continuity of user experience.

Allow Building of Scalable Servers

Typically, SyncML Servers are expected to serve a large number of Clients. For an Internet service provider, the SyncML Server is expected to serve hundreds of thousands of Clients, of which tens of thousands may be simultaneous users. For an enterprise Server, the SyncML Server must serve thousands of Clients. Therefore, it is of critical importance that SyncML Servers be scalable. The SyncML specification takes a few explicit steps to allow for the scaling of SyncML Servers.

Batch processing of data

As indicated above in the discussion regarding latency of wireless communication, SyncML encourages "batch" processing as much as possible. In batch synchronization, the Client typically sends changes made to one or more datastores in a single SyncML package. The processing of data in this manner also allows the building of scalable Servers. First, during the process of synchronization, the Server typically accesses some back-end datastore, such as a Relational Database or a PIM datastore (e.g., Lotus Notes®, Microsoft Exchange®). Accessing these datastores and performing multiple operations simultaneously usually provides much better performance than individual unit operations. Second, a SyncML Server may not only batch updates from one synchronization session with one Client, but may do so over multiple sessions with multiple Clients at the same time. This kind of collectively batched operation could dramatically increase throughput with the back-end datastore. Third, batching amortizes constant processing costs associated with SyncML Messages. For example, authentication could be performed just once for one SyncML Message that contains operations for one user across multiple related datastores (assuming the user uses the same credentials for those related datastores).

No implicit ordering constraints

SyncML does not require that commands within a SyncML Message be processed in any particular order.^[7] Commands pertaining to the same or different datastores can be processed in any order. This enables a Server to process a SyncML Message using multiple concurrent threads without appreciable timing coordination among the different threads processing a SyncML Message. A network Server can therefore have hundreds of threads processing SyncML Messages concurrently, thereby increasing overall performance and scalability.

^[7] One exception is commands that are enclosed within the Sequence command. See Chapter 6.
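
A minimal sketch of what the absence of ordering constraints permits is shown below. The tuple-based command representation and the function names are assumptions made for this example; only the command names Add, Replace, Delete, and Sequence come from SyncML. Top-level commands are handed to a thread pool in no particular order, while the children of a Sequence are applied in order by a single worker.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def apply_command(store, lock, command):
    """Apply one command to a dict-based datastore (hypothetical model)."""
    name = command[0]
    if name == "Sequence":
        # Children of a Sequence are processed strictly in the order given.
        for child in command[1]:
            apply_command(store, lock, child)
        return
    key = command[1]
    with lock:  # the lock protects the dict, it imposes no command order
        if name in ("Add", "Replace"):
            store[key] = command[2]
        elif name == "Delete":
            store.pop(key, None)
        else:
            raise ValueError(f"unknown command: {name}")


def process_message(commands, workers=4):
    """Process the top-level commands of one message concurrently."""
    store = {}
    lock = threading.Lock()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(apply_command, store, lock, c) for c in commands]
        for future in futures:
            future.result()  # re-raise any worker exception
    return store
```

The result is deterministic here only because the top-level commands touch different items. Commands that depend on one another are the case the Sequence command exists for.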

No transactional guarantees

SyncML does not guarantee any transactional semantics^[8] with data synchronization. For example, consider an Add operation from a Client. When the Server synchronization agent processes the Add operation, it can wait until the operation is confirmed by the back-end datastore or it can simply "queue" the operation to another entity that manages back-end operations and move on to the next operation. In the first case, the synchronization agent is guaranteed that the operation is actually complete at the back end (for back ends that provide transactional guarantees, e.g., a Relational Database). In the second case, the synchronization agent does not actually know if the operation has been completed. The first case is clearly more time consuming and could delay synchronization, adding to overall latency in processing Messages. The second case enables quicker processing of SyncML Messages but incurs the risk of back-end rejections or failures after the synchronization is deemed complete. Clearly, back-end failures can occur for multiple reasons. Back-end datastores can also reject updates for numerous reasons, including specific constraint violations such as adding an employee record that indicates a salary of one billion dollars per month. Since SyncML does not provide transactional guarantees, it enables Server implementations to adopt the second approach. In such an approach, after-the-fact back-end failures could simply be treated as new back-end updates during the next round of synchronization. SyncML expects such after-the-fact failures to be relatively few and makes a conscious tradeoff in favor of concurrency, performance, and scalability.

^[8] One exception is the Atomic command. See Chapter 6.
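
The second approach, queueing operations and surfacing back-end rejections in a later synchronization, might look like the sketch below. The class names, the `SALARY_LIMIT` constraint, and the record layout are hypothetical and stand in for whatever a real back end enforces.

```python
from collections import deque

SALARY_LIMIT = 1_000_000  # hypothetical back-end constraint, per month


class Backend:
    """Stand-in for a back-end datastore that can reject updates."""

    def __init__(self):
        self.rows = {}

    def add(self, key, record):
        if record.get("salary", 0) > SALARY_LIMIT:
            raise ValueError("salary exceeds limit")
        self.rows[key] = record


class QueueingAgent:
    """Synchronization agent that queues operations instead of waiting."""

    def __init__(self, backend):
        self.backend = backend
        self.pending = deque()
        self.rejected = []

    def handle_add(self, key, record):
        # Return immediately; the back end has not confirmed anything yet.
        self.pending.append((key, record))
        return "queued"

    def drain(self):
        # Runs later, possibly after the session is deemed complete.
        while self.pending:
            key, record = self.pending.popleft()
            try:
                self.backend.add(key, record)
            except ValueError as exc:
                self.rejected.append((key, str(exc)))

    def changes_for_next_sync(self):
        # After-the-fact failures are reported as changes in the next round.
        changes, self.rejected = self.rejected, []
        return changes
```

The agent answers every Add at once, and a rejected record only becomes visible to the Client in the next round of synchronization, which is the tradeoff described above.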

Enable load balancing

Load balancing is a common technique employed in building scalable systems. In this technique, a group of physical server machines act as one logical server. Typically, when a request from a client comes into the system, it is routed to an appropriate, lightly loaded physical server. The "router" balances load across multiple physical servers, thereby increasing the scalability of the overall system. For load balancing to work best, it is important that there not be much history in between two requests from a client. If two requests are strongly correlated and access common intermediate "state" on the server, those requests should be routed to the same physical server. In SyncML, the coordination between two synchronization sessions with one Client is encapsulated in the sync anchors pertaining to each datastore. Sync anchors are concise and hence can be shared between physical Servers using a common datastore. Thus, distinct synchronization requests from the same Client can be routed to any physical Server in a load-balancing cluster if the Server can access the sync anchor corresponding to the Client's datastores. Different packages pertaining to the same synchronization session, however, are best routed to the physical Server to which the first package was routed.
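
The role of shared sync anchors in load balancing can be sketched as follows. This is a simplified model under stated assumptions: the shared datastore is a plain dictionary, the router is round-robin, and the anchor check (compare the Client's last anchor with the stored one, fall back to a slow sync on mismatch) is reduced to a single comparison. The class and function names are hypothetical.

```python
from itertools import cycle


class PhysicalServer:
    """One machine in the cluster; holds no per-Client state of its own."""

    def __init__(self, name, anchor_store):
        self.name = name
        self.anchors = anchor_store  # shared by every server in the cluster

    def start_session(self, client_id, datastore, last, next_):
        key = (client_id, datastore)
        stored = self.anchors.get(key)
        # A matching anchor means both sides agree on the previous session.
        mode = "normal" if stored is not None and stored == last else "slow"
        # Simplification: a real server would store the new anchor only
        # after the session completes successfully.
        self.anchors[key] = next_
        return self.name, mode


def make_cluster(size):
    """Return a round-robin router over servers sharing one anchor store."""
    shared_anchor_store = {}
    servers = [PhysicalServer(f"server-{i}", shared_anchor_store) for i in range(size)]
    return cycle(servers)
```

With a two-server cluster, a Client's first session lands on one server and its second session on the other, yet the second server can still run a normal synchronization because it reads the anchor from the shared store.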

Build a Secure Synchronization Platform

SyncML adopts a practical approach to security. It realizes the importance of security, but more importantly it realizes the tradeoffs between security, usability, and performance. It is clear that a single "one-size-fits-all" security solution is not likely to work, as SyncML application requirements are diverse. Day-to-day personal information management applications will use SyncML. For such applications, security requirements are often not very stringent. Enterprise applications will also use SyncML. For such applications, security requirements could be stringent.

To address these diverse requirements, SyncML allows the use of the Secure Sockets Layer (SSL) protocol for data security, but does not make the use of SSL mandatory. SyncML also allows multiple types of authentication. Clients can authenticate using a simple password mechanism. Clients can also authenticate using encrypted message digests, which are more secure.
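
The difference between the two authentication styles can be illustrated with a short sketch. The exact credential formats are defined by the SyncML specification and are not reproduced here; the formulas below, the use of MD5, and the function names are assumptions chosen only to show the general idea of a password sent in encoded form versus a nonce-based digest.

```python
import base64
import hashlib
import hmac


def basic_credential(user, password):
    """Simple password scheme: encoded but reversible by anyone who sees it."""
    return base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")


def digest_credential(user, password, nonce):
    """Digest scheme (assumed formula): hash of user, password, and a nonce."""
    material = f"{user}:{password}:".encode("utf-8") + nonce
    return base64.b64encode(hashlib.md5(material).digest()).decode("ascii")


def verify_digest(credential, user, password, nonce):
    """Recompute the digest on the receiving side and compare safely."""
    expected = digest_credential(user, password, nonce)
    return hmac.compare_digest(credential, expected)
```

The basic credential can be decoded back to the password, whereas the digest changes with every nonce and does not reveal the password directly.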

SyncML also allows different granularities of authentication. Authentication could range from per Client, to per datastore, to per datastore object. Object-level authentication could be important in applications where different users can have different levels of rights to shared data. For example, a manager can share her calendar with her employees, so all can view and update public entries, but only the manager can view and update private entries.

SyncML does not require data encryption, as encryption can be prohibitive for certain mobile devices. Encryption is allowed, however, if Clients and Servers choose to encrypt the data they exchange across the network.

Build Upon Existing Web Technologies

Leveraging existing technologies and riding on established momentum is key to the success of any effort, especially standards efforts. SyncML chooses to use XML for its Representation Protocol not only for purely technical reasons but also to leverage the momentum around XML. By choosing XML, SyncML is readily able to use a wide array of tools including various parsers. SyncML leverages the MIME standard for data formats. The SyncML packages themselves conform to a registered MIME type.
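
As a small illustration of the tooling benefit, an off-the-shelf XML parser is enough to walk a SyncML-style message. The message below is heavily abbreviated, omits the namespace and most required elements, and has not been validated against the SyncML DTD; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated, illustrative message; not a complete or validated SyncML document.
SAMPLE_MESSAGE = """<SyncML>
  <SyncHdr><SessionID>1</SessionID><MsgID>2</MsgID></SyncHdr>
  <SyncBody>
    <Add><CmdID>1</CmdID><Item><Data>Alice</Data></Item></Add>
    <Delete><CmdID>2</CmdID><Item><Source><LocURI>17</LocURI></Source></Item></Delete>
  </SyncBody>
</SyncML>"""


def list_commands(xml_text):
    """Return (command name, CmdID) for each command in the message body."""
    root = ET.fromstring(xml_text)
    body = root.find("SyncBody")
    return [(command.tag, command.findtext("CmdID")) for command in body]
```

Calling `list_commands(SAMPLE_MESSAGE)` yields the Add and Delete commands with their command identifiers, using nothing beyond a standard-library parser.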

The initial transport bindings chosen in SyncML are all established wide-area wireline, long-range wireless, or short-range wireless transports. HTTP is the underlying protocol for most of the World Wide Web. WSP is the emerging protocol for a wide class of WAP phones. OBEX is the data exchange protocol for emerging Bluetooth and Universal Serial Bus (USB) devices.

Build a Working Specification

From the beginning, one of SyncML's most important goals was not only to design and draft a specification but also to build a working reference implementation for part of the specification. The reference implementation was intended for concurrent release with the specification, as a testament to the technical soundness and completeness of the specification.

The SyncML Initiative put a tremendous amount of additional effort into the reference implementation. A C-language programming API was designed on which applications could be written. An actual framework was designed that could be implemented. Two transport bindings, HTTP and OBEX, were implemented. End-to-end demonstration scenarios were designed such that the implementation and the specification could be validated.

The reference implementation has catalyzed many development efforts around SyncML. It provides SyncML supporters a concrete starting point and enables quick application development. Although neither the API nor the reference implementation is actually part of the specification, both have played key roles in the growing acceptance of SyncML.

Promote Interoperability

The overriding goal of SyncML is interoperability. To that effect, SyncML has designed a detailed conformance test suite aimed at testing SyncML implementations for conformity to certain key aspects of SyncML. In addition to conformance tests, SyncML also defines a process by which interoperability is tested between conformant Clients and Servers. The SyncML Initiative regularly hosts "SyncFests," during which a product must show interoperability with two or more other products (from different companies) to be deemed an interoperable implementation. Chapter 13 covers SyncML interoperability testing in detail.

