7.6 Removing Cached Objects


At some point you may find it necessary to manually remove one or more objects from Squid's cache. This might happen if:

  • One of your users complains about always receiving stale data.

  • Your cache becomes "poisoned" with a forged response.

  • Squid's cache index becomes corrupted after experiencing disk I/O errors or frequent crashes and restarts.

  • You want to remove some large objects to free up room for new data.

  • Squid was caching responses from local servers, and now you don't want it to.

Some of these problems can be solved by forcing a reload in a web browser. However, this doesn't always work. For example, some browsers display certain content types externally by launching another program; that program probably doesn't have a reload button or even know about caches.

You can always use the squidclient program to reload a cached object if necessary. Simply insert the -r option before the URI:

 % squidclient -r http://www.lrrr.org/junk >/tmp/foo 

If you happen to have a refresh_pattern directive with the ignore-reload option set, you and your users may be unable to force a validation of the cached response. In that case, you'll be better off purging the offending object or objects.

7.6.1 Removing Individual Objects

Squid accepts a custom request method for removing cached objects. The PURGE method isn't one of the official HTTP request methods. It is different from DELETE, which Squid forwards to an origin server. A PURGE request asks Squid to remove the object given in the URI. Squid returns either 200 (OK) or 404 (Not Found).

The PURGE method is somewhat dangerous because it removes cached objects. Squid disables the PURGE method unless you define an ACL for it. Normally you should allow PURGE requests only from localhost and perhaps a small number of trusted hosts. The configuration may look like this:

 acl AdminBoxes src 127.0.0.1 172.16.0.1 192.168.0.1
 acl Purge method PURGE
 http_access allow AdminBoxes Purge
 http_access deny Purge

The squidclient program provides an easy way to generate PURGE requests. For example:

 % squidclient -m PURGE http://www.lrrr.org/junk 

Alternatively, you could use something else (such as a Perl script) to generate your own HTTP request. It can be very simple:

 PURGE http://www.lrrr.org/junk HTTP/1.0
 Accept: */*
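As a rough illustration of generating the request yourself, here is a minimal Python sketch (Python is used here in place of Perl). The function names are hypothetical, and the proxy address 127.0.0.1:3128 is an assumption; substitute the host and port your Squid actually listens on.

```python
import socket


def build_purge_request(uri):
    """Return the raw bytes of an HTTP/1.0 PURGE request for `uri`."""
    lines = [f"PURGE {uri} HTTP/1.0", "Accept: */*"]
    # A blank line terminates the request headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_status(response):
    """Extract the numeric status code from the first line of a raw response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii", "replace")
    return int(status_line.split()[1])


def purge(uri, proxy_host="127.0.0.1", proxy_port=3128):
    """Send a PURGE request to the proxy and return the status code.

    proxy_host and proxy_port are assumptions, not values from the text.
    """
    with socket.create_connection((proxy_host, proxy_port), timeout=10) as sock:
        sock.sendall(build_purge_request(uri))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the proxy closed the connection
                break
            chunks.append(data)
    return parse_status(b"".join(chunks))
```

Called from a host that the Purge ACL allows, `purge("http://www.lrrr.org/junk")` should return 200 if the object was cached and 404 if it was not.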

Note that a URI alone doesn't uniquely identify a cached response. Squid also uses the original request method in the cache key, and it may use other request headers if the response contains a Vary header. When you issue a PURGE request, Squid looks for cached objects originally requested with the GET and HEAD methods. Squid also removes all variants of a response, unless you target a specific variant by including the appropriate headers in the PURGE request. Only variants for GET and HEAD requests are removed.

7.6.2 Removing a Group of Objects

Unfortunately, Squid doesn't provide a good mechanism for removing a bunch of objects at once. This often comes up when someone wants to remove all objects belonging to a certain origin server.

Squid lacks this feature for a couple of reasons. First, Squid would have to perform a linear search through all cached objects. This is CPU-intensive and takes a long time. While Squid is searching, your users can experience a performance degradation. Second, Squid keeps MD5s, rather than URIs, in memory. MD5s are one-way hashes, which means, for example, that you can't tell if a given MD5 hash was generated from a URI that contains the string "www.example.com". The only way to know is to recalculate the MD5 from the original URI and see if they match. Because Squid doesn't have the URI, it can't perform the calculation.
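The second point can be illustrated with a small Python sketch. The `cache_key` function is a simplified stand-in: it hashes the method and URI together, which is an assumption made for illustration and not the exact byte layout Squid feeds to MD5. The URIs are made up.

```python
import hashlib


def cache_key(method, uri):
    """Simplified stand-in for a cache key: MD5 over method and URI.

    Assumption: the real key layout in Squid is not reproduced here.
    """
    return hashlib.md5(f"{method} {uri}".encode("utf-8")).hexdigest()


def find_by_uri(index, method, uri):
    """Exact lookup works, because the key can be recomputed from a known URI."""
    return index.get(cache_key(method, uri))


# An in-memory index that, like Squid's, holds only hashes and no URIs.
uris = [
    "http://www.example.com/index.html",
    "http://www.example.com/logo.png",
    "http://www.lrrr.org/junk",
]
index = {cache_key("GET", u): f"object-{i}" for i, u in enumerate(uris)}

# Nothing in the keys reveals which origin server an entry came from,
# so "all objects from www.example.com" cannot be selected from the index.
hosts_visible = [key for key in index if "example" in key]
```

A lookup such as `find_by_uri(index, "GET", "http://www.lrrr.org/junk")` succeeds because the full URI is supplied, while `hosts_visible` stays empty: the hex digests carry no trace of the hostname.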

So what can you do?

You can use the data in access.log to get a list of URIs that might be in the cache. Then, feed them to squidclient or another utility to generate PURGE requests. For example:

 % awk '{print $7}' /usr/local/squid/var/logs/access.log \
         | grep www.example.com \
         | xargs -n 1 squidclient -m PURGE
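The selection step can also be sketched in Python. This version assumes Squid's native access.log format, in which the request method is the sixth whitespace-separated field and the URI is the seventh. It deliberately differs from a plain substring match: it compares the hostname exactly, keeps only GET and HEAD requests (the methods PURGE acts on), and drops duplicate URIs. The function name is hypothetical.

```python
from urllib.parse import urlsplit


def uris_for_host(log_lines, host, methods=("GET", "HEAD")):
    """Return unique URIs for `host` from native-format access.log lines."""
    seen = set()
    result = []
    for line in log_lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip truncated or malformed lines
        method, uri = fields[5], fields[6]
        if method not in methods:
            continue
        try:
            hostname = urlsplit(uri).hostname
        except ValueError:
            continue  # unparsable URI
        if hostname != host:
            continue
        if uri not in seen:
            seen.add(uri)
            result.append(uri)
    return result
```

Each URI in the returned list can then be passed to `squidclient -m PURGE`. Reading the log with `open(path)` and passing the file object as `log_lines` works, since the function only iterates over lines.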

7.6.3 Removing All Objects

In extreme circumstances you may need to wipe out the entire cache, or at least one of the cache directories. First, you must make sure that Squid isn't running.

One of the easiest ways to make Squid forget about all cached objects is to overwrite the swap.state files. Note that you can't simply remove the swap.state files because Squid then scans the cache directories and opens all the object files. You also can't simply truncate swap.state to a zero-sized file. Instead, you should put a single byte there, like this:

 # echo '' > /usr/local/squid/var/cache/swap.state 

When Squid reads the swap.state file, it gets an error because the record that should be there is too short. The next read results in an end-of-file condition, and Squid completes the rebuild procedure without loading any object metadata.

Note that this technique doesn't remove the cache files from your disk. You've only tricked Squid into thinking that the cache is empty. As Squid runs, it adds new files to the cache and may overwrite the old files. In some cases, this might cause your disk to run out of free space. If that happens to you, you need to remove the old files before restarting Squid again.

One way to remove cache files is with rm. However, it often takes a very long time to remove all the files that Squid has created. To get Squid running faster, you can rename the cache directory, create a new one, start Squid, and remove the old one at the same time. For example:

 # squid -k shutdown
 # cd /usr/local/squid/var
 # mv cache oldcache
 # mkdir cache
 # chown nobody:nobody cache
 # squid -z
 # squid -s
 # rm -rf oldcache &

Another technique is to simply run newfs (or mkfs) on the cache filesystem. This works only if you have the cache_dir on its own disk partition.



Squid
Squid: The Definitive Guide
ISBN: 0596001622
EAN: 2147483647
Year: 2004
Pages: 401
Authors: Duane Wessels
