Hack 73. Mix File and Database Storage | Locking and Performance

Just because you can store files in a database does not mean you should.

On sites that generate dynamic web content, storing everything (even things such as files) in the database can sound attractive. It would give a single metaphor for accessing data. It also offers a simple interface that is free from locking and reliability worries. Databases give you useful features, such as error recovery and transactions. If you use files to store data you have to write code to protect the files yourself. So, having files in the database as well should save you from writing lots of code.

There is a problem, however. Although database systems do support large pieces of data, such as pictures and binary application files, they tend to slow down the database queries. These data items are often big, so transferring them from the database is time consuming. They are implemented using special type definitions, such as BLOB and CLOB, and these store the data on disk using a different algorithm from that used for normal data. This can make retrieval and searching slower. Database backups are much slower due to the huge size of the data set.

Consider the problem of having pictures on your web site. If you had them in the database, you would receive a request for each picture present on a page. Each request would open a database connection, and return the picture and the right MIME type to the requesting browser. It would be much easier just to have a directory of pictures and let the web server handle the pictures itself. You can see this approach in "Store Images in a Database" [Hack #46].

One approach to mixed data requirements is to use the database to store only things which SQL can actually query. Currently images cannot be queried with questions such as WHERE image LIKE 'a red boat' (but people are working on this). Images in a database do give good version control capabilities, but you don't actually need to store images in a database to allow the database to look after your image versioning. You can, after all, store an image filename in a database and have a similar effect.

You should still use a database for large-file storage if the access and modification pattern has many serialization issues, and it is therefore too risky to do your updates with simple files.

9.10.1. Adding and Removing Files

Suppose you need to store and retrieve files as part of a software repository application. You have decided not to use a 100 percent database solution, and you want to store binaries as files in the filesystem. You could use a structure such as that shown in Table 9-5.

Table 9-5. Hybrid file storage
User	App name	Lastmod	Filename	Version
Gordon	Firebox	2007-03-01	50	1
Gordon	Firebox	2007-03-02	51	2
Andrew	Firebox	2007-03-01	52	3

User Gordon wants to upload a new version of Firebox to the repository. You must reserve a filename for the new file, even though the upload has not yet completed. You can do this in a single SQL statement:

INSERT INTO filelist 
 SELECT 'Gordon','Firebox',CURRENT_DATE,MAX(filename)+1,0
 FROM filelist;

You can find the new filename with this query:

SELECT filename FROM filelist WHERE user='Gordon' AND version=0

Once the upload has completed, you run this:

UPDATE filelist
 SET version=(SELECT MAX(version)+1 FROM filelist WHERE user='gordon'
 AND appname='Firebox')
WHERE user='gordon' AND version=0;

You have to do a little housekeeping yourself with this approach. Sometimes uploads will fail, so every day, run this:

SELECT filename FROM filelist WHERE lastmod<(sysdate( )-1) AND version = 0;

Any filenames returned can be assumed to have been involved with partial uploads which have failed, and you could write a script to run over that list and delete the files involved. Once all the file deletions have completed, delete the row from the database.

By using this approach you can store files in a directory, but never have orphaned files in the filesystem. Because you have used single SQL statements there should be no problems with race conditions, unless an individual attempts to upload two files at the same time. If that is a risk you can extend the filelist table with a session ID.

9.10.2. Too Many Files

As the number of files grows, the directory listing for the directory that stores the files grows too. Soon commands will start to complain that the command line is too long, commands will take a long time to find the file in question, and library routines may start to die. Not every operating system likes having 100,000 files in a directory.

The solution is to hash the filename and use that value for a subdirectory name. A simple hashing algorithm example would take the first character of the filename. Keeping with the numeric filenames from the previous example, Figure 9-5 shows a possible layout for these.

Figure 9-5. Hash directory structure

In this structure example, there are only three subdirectories and no sub-subdirectories. All the files and directories can still be listed with normal commands. In Linux, Unix, and Mac OS X (and on Windows using Cygwin), if you really want all the files for one operation, you can use find.

A function for this kind of storage scheme would look like this:

sub hashname {
 my ($filename) = @_;
 my @level;
 if ($filename =~ /^(.)/) {
 return "$1/$filename";
 }
}

In this case, hashname('12650') would return the string 1/12650, which can be used as the relative pathname and filename for the file 12650.

This hashing algorithm is not complex enough for a real system, and is here only to show you a simple example. It will produce unbalanced branches and has only one level of branching. If you take this route, write your own algorithm with multiple directory levels and a more balanced hashing algorithm.

9.10.3. Hacking the Hack

Creating and deleting files using this mechanism is not difficult. But changing a file that already exists will present you with a load of potential problems. You need to think carefully before continuing with a file-based approach. Database systems provide locking, transactions, and reliability, all for free. It is not wise to try to write even a basic transaction system so that you can use simple files. You will end up implementing your own database systems, and there are already enough database systems in the world.

However, if you can guarantee that you will be appending only to the end of the files, you are in luck because you can do this easily and without significant data risks, all without touching the database.

Suppose you have a program which monitors network activity, and when it detects a packet it uses the database to work out which user that packet belongs to. It then appends the packet information to the end of the packet log for that user. A packet log for a single user can quickly grow to 100 MB in size, and you have 2,000 users, so things can get big fast.

You had implemented this using a database-only solution, but now the database is huge and hard to back up. This is doubly a pity because if you lose your database it will be a disaster, but if you lose the packet logs it will be only an annoyance. You can use files to help. In this case, you use the techniques shown earlier in this hack to use a number for the filename, and a hash structure for the directories. All you need is a routine to open the file in append mode, shown here in Perl:

#!/usr/bin/perl
use Fcntl qw(:DEFAULT :flock);

sub openlog {
 my ($filename) = @_;
 my $handle;
 my $name = hashname($filename);
 if ($name =~ m/^([^/])//) {
 mkdir $1 if (! -d $1);
 }
 sysopen($handle,$name,O_WRONLY|O_APPEND|O_CREAT) || return;
 flock($handle,LOCK_EX);
 return $handle;
}

my $filename = '25640';
my $file = openlog($filename);
if ($file) {
 print $file "
". "Packet 56 arrived from 10.0.0.10";
 close $file;
}

There is a slight possibility that a problem or error could cause a partial write to occur in another program running this append routine, so you should always put the newline () at the start of your output rather than at the end. In this way, partial writes damage only the lines they were writing, not your file.

Remember that if you want to read from this file, you should do a flock with LOCK_SH before reading. If you forget, it probably won't matter, but there is the possibility of reading a write which has not yet completed, which could lead to reading partial data.

SQL Fundamentals

Joins, Unions, and Views

Text Handling

Date Handling

Number Crunching

Online Applications

Organizing Data

Storing Small Amounts of Data

Locking and Performance

Reporting

Users and Administration

Wider Access

Index