A Practical Example

Spread provides a unified programming API available in several programming languages: C, C++, Java, Perl, Python, PHP, Ruby, and others. Detailed documentation for the C API is distributed with Spread as man pages, and documentation for the other language bindings is available online. To keep the code short, we will use the Perl interface for our example.

The application that we will use as an example of simple yet efficient usage of Spread is a distributed file cache purging daemon. Consider a file cache that is distributed across several servers. It is a common scenario that we want to either completely remove a file from the cache or replace it with a different version. In both scenarios, we first need to remove the existing version from all caches to force the caching system to refresh its local copy next time the data is needed.

To support this feature we will deploy a cache purging daemon on every cache. The daemon just waits for messages from clients requesting the purge of files that are no longer needed. Of course, we could implement this by having a client that connects to each of the cache servers and requests the purging. However, this method is cumbersome because it requires the client to know the identity of every cache server and to connect to each one of them, one by one.

Our proposed solution has all the cache purging daemons connected through Spread and joining the same group. A client that wants to request the removal of a file needs only to connect to Spread and send a message to the group that the daemons are listening to. Spread takes care of reliably passing the request along to all daemons connected to the group.

To illustrate, we present a sample implementation of the cache purging system written in Perl. First, let's look at the daemon outlined in Listing A.3, which we call sppurgecached.

Listing A.3. `sppurgecached`: A Spread-Based Cache Purging Daemon

    01: #!/usr/bin/perl
    02:
    03: use strict;
    04: use Spread;
    05: use Getopt::Long;
    06: use POSIX qw/setsid/;
    07: use File::Find qw/finddepth/;
    08: use IO::File;
    09:
    10: use vars qw /$daemon @group $cachedir $logfile/;
    11:
    12: GetOptions("d=s" => \$daemon,
    13:            "g=s" => \@group,
    14:            "l=s" => \$logfile,
    15:            "c=s" => \$cachedir);
    16: $daemon ||= '4803@127.0.0.1';
    17: push(@group, 'cachepurge') unless(@group);
    18:
    19: close(STDIN);
    20: if($logfile) {
    21:   open(LOGFILE, ">>$logfile") || die "Cannot open $logfile";
    22: }
    23: sub __log { syswrite(LOGFILE, shift) if($logfile); }
    24:
    25: die "You must be root, as I need to chroot" if($>);
    26: die "Could not chroot" unless(chroot($cachedir) && chdir('/'));
    27: # daemonize
    28: close(STDOUT); close(STDERR);
    29: fork && exit; setsid; fork && exit;
    30:
    31: sub removenode {
    32:   return if /^\.{1,2}$/;
    33:   -d $_ ? rmdir($_) : unlink($_);
    34: }
    35:
    36: while(1) {
    37:   my ($m, $g);
    38:   eval {  # We eval so we can catch errors and reconnect.
    39:     ($m, $g) = Spread::connect( { spread_name => "$daemon",
    40:                                   private_name => "scpd_$$" } );
    41:     die "Could not connect to Spread at $daemon" unless $m;
    42:     die "Could not join" unless(grep {Spread::join($m, $_)} @group);
    43:     __log("Connected to spread: $daemon\n");
    44:     while(my @p = Spread::receive($m)) {
    45:       if($p[0] & Spread::REGULAR_MESS()){
    46:         chomp(my $victim = $p[5]);
    47:         __log("[$p[1]] purges $victim\n");
    48:         if(-d $victim) {
    49:           # For directories, we recursively delete
    50:           finddepth( { postprocess => \&removenode,
    51:                        wanted => \&removenode,
    52:                        no_chdir => 1 }, $victim);
    53:         } else {
    54:           unlink($victim);
    55:         }
    56:       }
    57:     }
    58:   };
    59:   __log($@) if($@);
    60:   Spread::disconnect($m) if($m);
    61:   sleep(1);
    62: }

The cache purging daemon connects to a Spread daemon and then joins the designated group (lines 39-43). The Spread::connect call specifies the address of the Spread daemon that the application connects to, as well as a private name by which the application will be identified. The private name must be unique among the clients of that Spread daemon. By default, our application assumes that the Spread daemon runs on the standard Spread port on the local host. The group that the sppurgecached daemons listen to is cachepurge, but another group name can be specified using the -g command-line option (lines 16-17). Because our program removes files on request, we also make sure that it can do so only within the designated cache directory (lines 25-26).
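The listing builds its private name from a fixed prefix and the process ID. As a minimal sketch, the helper below shows one way to keep such a name within a length limit. It assumes that Spread private names are limited to 10 characters (MAX_PRIVATE_NAME in the sp.h versions we are aware of; verify this against your installation). The helper name private_name is hypothetical and is not part of the Spread API.

```perl
use strict;
use warnings;

# Assumed limit on Spread private names; check MAX_PRIVATE_NAME in your sp.h.
my $MAX_PRIVATE_NAME = 10;

# Hypothetical helper: build "<prefix><pid>", keeping only the low-order
# digits of the PID when the full string would exceed the assumed limit.
sub private_name {
    my ($prefix, $pid) = @_;
    my $room = $MAX_PRIVATE_NAME - length($prefix);
    die "prefix '$prefix' leaves no room for a PID\n" if $room < 1;
    my $suffix = length($pid) > $room ? substr($pid, -$room) : $pid;
    return $prefix . $suffix;
}

print private_name('scpd_', $$), "\n";
```

Truncating the PID makes a collision between two processes on the same machine possible in principle, so a production daemon might retry with a different suffix if the connection is rejected.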

The daemon then starts listening for messages by calling the blocking Spread::receive function. The receive call returns both regular messages and membership change notifications, but for the current application we are interested only in REGULAR (data) messages (lines 44-45). After a message is received, we read its payload and attempt to remove the file or directory whose name was sent (lines 46-55). The daemon then goes back to listening for another request.
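The filtering step can be sketched in isolation, without a running Spread daemon. The example below assumes that the receive call returns the list (service type, sender, groups, message type, endian flag, message), which is how the listing indexes it, and uses stand-in bitmask values that we believe match sp.h. Real code should call Spread::REGULAR_MESS() rather than hard-code values. The helper message_kind and the sample tuples are hypothetical.

```perl
use strict;
use warnings;

# Stand-in values for the service-type masks, assumed to match sp.h.
# In a real program use Spread::REGULAR_MESS() and friends instead.
use constant {
    REGULAR_MESS    => 0x003f,
    MEMBERSHIP_MESS => 0x3f00,
};

# Hypothetical helper: classify the first element returned by Spread::receive.
sub message_kind {
    my ($service_type) = @_;
    return 'regular'    if $service_type & REGULAR_MESS;
    return 'membership' if $service_type & MEMBERSHIP_MESS;
    return 'other';
}

# Simulated receive tuples:
# (service_type, sender, groups, mess_type, endian, message)
my @samples = (
    [0x0002, '#scp_1234#cache1', ['cachepurge'], 0, 0, "/img/logo.png\n"],
    [0x1100, 'cachepurge',       [],             0, 0, ''],
);

for my $p (@samples) {
    my $kind = message_kind($p->[0]);
    if ($kind eq 'regular') {
        chomp(my $victim = $p->[5]);
        print "[$p->[1]] purges $victim\n";
    } else {
        print "ignoring $kind message\n";
    }
}
```

A daemon that wanted to log which caches are currently reachable could handle the membership branch instead of ignoring it.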

Now let's look at the sample client in Listing A.4, which connects to a Spread daemon and requests the purge of a file from the distributed cache.

Listing A.4. `spcachepurge`: A Simple Local Area Spread Configuration

    01: #!/usr/bin/perl
    02:
    03: use strict;
    04: use Spread;
    05: use Getopt::Long;
    06: use vars qw /$daemon $group/;
    07:
    08: GetOptions("d=s" => \$daemon,
    09:            "g=s" => \$group);
    10: $daemon ||= 4803;
    11: $group  ||= 'cachepurge';
    12:
    13: my ($m, $g) = Spread::connect( { spread_name => "$daemon",
    14:                                  private_name => "scp_$$" } );
    15: die "Could not connect to Spread at $daemon\n" unless($m);
    16:
    17: if(!@ARGV) {
    18:   print STDERR "$0 [-d spread] [-g group] file1 ...\n";
    19:   exit;
    20: }
    21: while(my $file = shift) {
    22:   Spread::multicast($m, Spread::RELIABLE_MESS(), $group, 0, $file);
    23: }
    24: Spread::disconnect($m);

The client's sole purpose is to inform the cache daemons about the files that it needs deleted. Because we want to avoid cache inconsistencies, the request needs to be reliably delivered to all cache daemons. The client needs to be able to communicate with the sppurgecached daemons that are listening on a dedicated Spread group; therefore, we give the client options for specifying the Spread daemon to connect to and the name of the group that the daemons are listening on. In a standard configuration, the spcachepurge client runs on one of the machines in the cluster, alongside a local Spread daemon. Therefore, by default, we set the $daemon variable to connect to the standard Spread port 4803 (line 10). The syntax used for specifying the Spread daemon is the same as the one used to start a Spread daemon. If the client is running on a machine without a Spread daemon, we can specify the proper daemon address: ./spcachepurge -d port@ip. The default communication group can also be overridden by using the -g parameter.
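To make the two address forms concrete, here is a small sketch that splits a daemon name into its parts. It assumes only the forms used in this appendix, a bare port or port@host. The helper parse_spread_name is hypothetical, and reporting a bare port as 'localhost' is a labeling choice of this sketch rather than Spread's behavior.

```perl
use strict;
use warnings;

# Hypothetical helper: split a Spread daemon name of the form "port" or
# "port@host" into (port, host). A bare port refers to a daemon on the
# local machine, which this sketch labels 'localhost'.
sub parse_spread_name {
    my ($name) = @_;
    my ($port, $host) = $name =~ /^(\d+)(?:\@(.+))?$/
        or die "Bad Spread daemon name '$name' (expected port or port\@host)\n";
    return ($port, defined $host ? $host : 'localhost');
}

for my $name ('4803', '4803@127.0.0.1') {
    my ($port, $host) = parse_spread_name($name);
    print "$name => port $port on $host\n";
}
```

A check like this lets the client reject a mistyped -d argument with a clear message before it attempts to connect.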

First, the client connects to the Spread daemon (lines 13-15); then, for each file passed as an argument, it multicasts a reliable message to the cache group. Because the message is sent using the RELIABLE delivery guarantee, all daemons listening to the group will receive the request and remove the requested file.

This example shows the convenience and efficiency of using the right tool. However, the solution is not as perfect as it appears. Even though we send the purge request as a RELIABLE message, it is possible that one of the cache purging daemons, or its corresponding Spread daemon, had crashed at the time the purge request was made or was disconnected from the rest of the servers due to a temporary network partition. In both cases, when the cache server becomes operational again, it will still have the file that was removed on the other servers. Dealing with this scenario in our application is a much more difficult problem and would require both the use of the more expensive SAFE messages and additional logic in the cache purging daemons. However, this level of precaution is not necessary for an application such as the one we are describing. Instead, because the data is only a cache, we can clear the entire cache upon a restart, thereby making sure that we will serve the correct documents, at the small price of repopulating the cache on demand.
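Clearing the cache at startup can be sketched with the core File::Path module. The helper clear_cache below is hypothetical and is not part of Listing A.3. It assumes that the cache directory contains nothing except cached files, so that emptying it is always safe.

```perl
use strict;
use warnings;
use File::Path qw(remove_tree);

# Hypothetical helper: delete everything inside the cache directory while
# keeping the directory itself. Returns true if nothing failed.
sub clear_cache {
    my ($cachedir) = @_;
    die "Cache directory '$cachedir' does not exist\n" unless -d $cachedir;

    # keep_root leaves $cachedir in place; error collects failures
    # instead of letting remove_tree warn on its own.
    remove_tree($cachedir, { keep_root => 1, error => \my $errors });

    for my $e (@{ $errors || [] }) {
        my ($path, $msg) = %$e;
        warn "Could not remove $path: $msg\n";
    }
    return !@{ $errors || [] };
}
```

In the daemon, a call such as clear_cache($cachedir) would most naturally go before the chroot on line 26, so that a restarted daemon begins with an empty cache.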

Understanding the requirements and trade-offs of the distributed problem you are trying to solve and choosing the appropriate tool and approach for the solution is, as mentioned at the beginning of this appendix, essential for developing smart distributed applications.
