Certification Objective 9.04-The Squid Web Proxy Cache

Squid is a high-performance HTTP and FTP caching proxy server. It is also known as a Web proxy cache. It can make your network connections more efficient. As it stores data from frequently used Web pages and files, it can often give your users the data they need without their systems having to look to the Internet.

Studies on very busy networks suggest that a Squid server can reduce the bandwidth your Internet connection needs by 10 to 20 percent. That can lead to considerable savings for larger offices.

Squid uses the Internet Cache Protocol (ICP) for transfers between participating peer and parent/child cache servers. It can be used either as a traditional caching proxy or as a front-end accelerator for a traditional Web server. Squid accepts only HTTP requests but speaks FTP on the server side when FTP objects are requested. For more information, see www.squid-cache.org. One book dedicated to this service is Duane Wessels's Squid: The Definitive Guide, published by O'Reilly.

Key Squid Files and Directories

The Squid RPM package is installed by default when you install the Web Server package group. So if you've installed Apache and have not tinkered with the defaults, the Squid RPM should also be installed on your computer. This RPM package installs a substantial number of files and scripts; some of the key files include the following:

/etc/rc.d/init.d/squid Start/stop script
/etc/squid/ Configuration directory
/etc/sysconfig/squid Other configurable options
/usr/share/doc/squid-versionnumber Documentation, mostly in HTML format
/usr/lib/squid/ Support files and internationalized error messages
/usr/sbin/squid Main Squid daemon
/usr/share/squid Various squid configuration add-ons
/var/log/squid/ Log directory
/var/spool/squid/ Cache directory (once Squid is active, this directory includes hundreds of MBs, and maybe more, in hashed directories)

Starting Squid on Reboot

The Squid Web Proxy is not started by default. To start it at boot, activate it with a command such as chkconfig or with the Service Configuration utility described in Chapter 3. The easiest way to set Squid to start the next time you boot Linux is with the following command:

 # chkconfig squid on

When the Squid proxy server starts for the first time, the /etc/init.d/squid start script starts the squid daemon. Squid runs as a caching proxy server on port 3128. You can then set up Web browsers on your LAN to point to Squid on your computer, at port 3128, instead of directly to an external network such as the Internet. That way, Squid gets the first chance at serving the needs of users on your network.
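Browsers are configured through their own settings, but many command-line clients, including curl and wget, read the proxy address from the http_proxy environment variable. The following is a minimal sketch of that setting; the host name enterprise5a.example.com is a hypothetical name for the computer running Squid, so substitute your own.

```shell
# proxy_host is a hypothetical name for the computer that runs Squid.
proxy_host="enterprise5a.example.com"
# 3128 is the default Squid port described above.
proxy_port=3128

# Clients that honor http_proxy send their HTTP requests to this address.
export http_proxy="http://${proxy_host}:${proxy_port}/"
echo "$http_proxy"
```

The setting lasts only for the current shell session unless you add it to a login script.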

Basic Squid Configuration

You can configure and customize the way Squid operates through its configuration file, /etc/squid/squid.conf. The default version of this file includes a large number of comments that can help you tune and secure Squid. Since it has more than 4000 lines, this isn't the easiest file to review.

/etc/sysconfig/squid

Squid also works through its /etc/sysconfig/squid file, which specifies switches for the squid daemon when it starts. By default, it disables DNS checking with the -D option, as described by the following directive:

 SQUID_OPTS="-D"

It also specifies a shutdown timeout in seconds with the following directive:

 SQUID_SHUTDOWN_TIMEOUT=100

You can add more options to the SQUID_OPTS directive as specified in the man page for the squid command; other directives are used by the /etc/rc.d/init.d/squid script.

/etc/squid/squid.conf

Now we'll examine the defaults in the main Squid configuration file, squid.conf in the /etc/squid/ directory. Although there are over 4000 lines in this file, only a few are active by default. Most of this file is filled with comments that describe most directives and associated options. You may note a squid.conf.default file, which serves as an effective backup to the original configuration file.

If you're following on your own system, note that several of these directives are split by several dozen lines of comments. In other words, you may need to scroll through the squid.conf configuration file to see the different directives as described in this section.
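Because the active directives are scattered among thousands of comment lines, it can help to filter the comments out before you read the file. The following is a minimal sketch that uses standard grep; the function name active_directives and the scratch file are illustrative assumptions. On a real system you would pass /etc/squid/squid.conf to the function instead.

```shell
# Print only the active (non-comment, non-blank) lines of a configuration file.
# active_directives is a hypothetical helper name, not part of Squid.
active_directives() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Demonstrate on a small scratch file that mimics the layout of squid.conf.
sample_conf=$(mktemp)
cat > "$sample_conf" <<'EOF'
#  TAG: http_port
#Default:
# none
http_port 3128

#  TAG: icp_access
icp_access allow all
EOF

# Only the two uncommented directives are printed.
active_directives "$sample_conf"
```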

First, the default port is shown with the following directive:

 http_port 3128

The hierarchy_stoplist directive lists strings that, when found in a URL, make Squid skip its neighbor caches and forward the request directly to the origin server. In this case, URLs that contain cgi-bin or ? are forwarded directly. The directive is followed by two more directives: the acl line defines QUERY to match URLs with the same strings, and the cache line denies caching to those requests:

 hierarchy_stoplist cgi-bin ?
 acl QUERY urlpath_regex cgi-bin \?
 cache deny QUERY

More than 1000 lines later, the refresh_pattern directive specifies when data from a specified server is considered "fresh"; in other words, data that fits in these parameters is taken from the local Squid cache. Each directive lists a minimum time in minutes, a percentage, and a maximum time in minutes. The following directives specify that FTP data is fresh for at least 1440 minutes. If there's no explicit "freshness" life, as defined by the remote server, and the file was last modified 10 hours ago, 20% means that the data will be considered fresh for another 2 hours, subject to the minimum and maximum. The maximum "freshness" period for FTP data in a Squid cache based on the first directive is 10080 minutes.

 refresh_pattern ^ftp:           1440    20%     10080
 refresh_pattern ^gopher:        1440    0%      1440
 refresh_pattern .               0       20%     4320
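The arithmetic behind these three fields can be sketched in a few lines of shell. This is a simplified illustration of the calculation described above, not Squid's actual freshness algorithm, which also considers expiry information sent by the remote server. The function name fresh_minutes is a hypothetical helper, and all values are in minutes.

```shell
# Simplified sketch: the lifetime is a percentage of the time since the object
# was last modified, clamped between the minimum and maximum fields.
# fresh_minutes is a hypothetical helper name, not a Squid command.
# Usage: fresh_minutes MIN PERCENT MAX MINUTES_SINCE_MODIFIED
fresh_minutes() {
    min=$1; pct=$2; max=$3; age=$4
    lifetime=$(( age * pct / 100 ))
    if [ "$lifetime" -lt "$min" ]; then lifetime=$min; fi
    if [ "$lifetime" -gt "$max" ]; then lifetime=$max; fi
    echo "$lifetime"
}

# Catch-all pattern (0, 20%, 4320), file modified 10 hours (600 minutes) ago.
fresh_minutes 0 20 4320 600       # prints 120, that is, 2 hours

# FTP pattern (1440, 20%, 10080), same file: the minimum takes over.
fresh_minutes 1440 20 10080 600   # prints 1440
```

In this sketch the 2-hour figure applies under the catch-all pattern; under the FTP pattern, the 1440-minute minimum is larger and therefore wins.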

Next, Squid continues to configure ACLs with the acl directive (this is unrelated to the Access Control Lists described in Chapter 14). First, it allows management access from all IP addresses (which we'll limit shortly), using the cache_object protocol. It specifies the localhost variable as a source and the to_localhost variable as a destination address:

 acl all src 0.0.0.0/0.0.0.0
 acl manager proto cache_object
 acl localhost src 127.0.0.1/255.255.255.255
 acl to_localhost dst 127.0.0.0/8

The following acl directives specify ports through which traffic is cached. The port numbers are all TCP/IP ports, which can be verified in /etc/services:

 acl SSL_ports port 443 563
 acl Safe_ports port 80
 acl Safe_ports port 21
 acl Safe_ports port 443
 acl Safe_ports port 70
 acl Safe_ports port 210
 acl Safe_ports port 1025-65535
 acl Safe_ports port 280
 acl Safe_ports port 488
 acl Safe_ports port 591
 acl Safe_ports port 777

Squid as a proxy service can also help protect your network. The following http_access directives allow cache manager access from the local computer, deny it from all others, and deny requests (with the !, also known as a "bang") that use anything but the aforementioned Safe_ports and SSL_ports variables.

 http_access allow manager localhost
 http_access deny manager
 http_access deny !Safe_ports
 http_access deny CONNECT !SSL_ports

After this group of default directives, you can configure access to local networks. If active, the following allows access to the two noted private IP networks:

 #acl our_networks src 192.168.1.0/24 192.168.2.0/24
 #http_access allow our_networks

The following directives also allow access from the local computer and deny access to all other computers:

 http_access allow localhost
 http_access deny all

But communication goes in both directions. If access is allowed, the following directive supports replies:

 http_reply_access allow all

Of course, the Squid Web Proxy follows the Internet Cache Protocol (ICP), so all queries that follow ICP are allowed:

 icp_access allow all

As you've seen, there are a substantial number of other options shown in comments in the file. It's not possible to cover even a fraction of the available options here. Fortunately, you don't need to know them for the Red Hat exams.

Configuration Options

You need to add three lines to the squid.conf file in the /etc/squid/ directory before activating Squid. For example, if the name of the local computer is Enterprise5a, you'd add the following line:

 visible_hostname Enterprise5a

Make sure you add the line near the associated comment in the file.

Next, to support regular Web (HTTP) access, you'll need an http_access directive that allows an ACL that you define under a name of your choice. As described earlier, the appropriate location and sample commands are included in the default squid.conf file:

 #acl our_networks src 192.168.1.0/24 192.168.2.0/24
 #http_access allow our_networks

In my case, I use the 192.168.0.0/24 network, so I include the following access control list directive:

 acl local_net src 192.168.0.0/24

Next, you'll need to add your local network to the Squid Access Control List. This particular command line uses the local_net variable that I just created and adds the IP addresses of a private network that I've used:

 http_access allow local_net

Now you can save your changes and exit the squid.conf configuration file. You can then create the basic cache directories in /var/spool/squid with the following command:

 # squid -z

Finally, start the Squid service for the first time with the appropriate service command:

 # service squid start
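The three additions described in this section can be reviewed together before you edit the real file. The following is a minimal sketch that writes them to a scratch file; the host name and network are the examples used above, and the variable name fragment is an illustrative assumption. Note the order: Squid must read the acl definition before the http_access line that uses it, and in squid.conf the allow line belongs above the default http_access deny all line, because http_access rules are checked in order and the first match applies.

```shell
# Write the three additions to a scratch file so their order is easy to review.
# Enterprise5a and 192.168.0.0/24 are the example values from the text.
fragment=$(mktemp)
cat > "$fragment" <<'EOF'
visible_hostname Enterprise5a
acl local_net src 192.168.0.0/24
http_access allow local_net
EOF

# Show the fragment; copy each line to the matching place in squid.conf.
cat "$fragment"
```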

While you're unlikely to have a chance to configure more than one computer with Squid, its power is in connecting the cache from multiple servers. You can configure this with the cache_peer directives, which specify parent and sibling Squid cache servers. If your Linux computer is part of a group of Squid servers, these lines allow your Squid servers to check these other Squid servers before going to the Internet.

Squid first checks its own cache and then queries its siblings and parents for the desired object such as a Web page. If neither the cache host nor its siblings have the object, it asks one of its parents to fetch it from the source. If no parent servers are available, it fetches the object itself.
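A cache hierarchy of this kind might be declared as in the following sketch, which writes two cache_peer lines to a scratch file for review. The host names parent.example.com and sibling.example.com are hypothetical. Each line gives the peer's host name, its type, its HTTP port (3128, as above), and its ICP port, for which 3130 is the customary value.

```shell
# Write sample cache_peer lines to a scratch file; the host names are
# hypothetical and must be replaced with real Squid servers on your network.
peers=$(mktemp)
cat > "$peers" <<'EOF'
# cache_peer HOSTNAME             TYPE    HTTP_PORT ICP_PORT
cache_peer parent.example.com     parent  3128      3130
cache_peer sibling.example.com    sibling 3128      3130
EOF

cat "$peers"
```

With lines like these in squid.conf, the local Squid server would query the sibling for a cached copy and ask the parent to fetch the object if no copy is found.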

On the Job

Squid can greatly improve the performance of a corporate intranet. If your company has many employees who surf the Net, a Squid server can reduce your network connection costs by decreasing the bandwidth you need for your Internet connection.

Security Options

When you configure Squid on your system, you need to allow access through appropriate ports and SELinux settings. The simplest approach assumes that all clients use the default Squid port, TCP port 3128, and that the firewall allows traffic through that port. Alternatively, you can redirect requests for the standard Web server port (80) to Squid port 3128, using an appropriate iptables command.

You can configure the RHEL firewall in either of these two ways using the techniques described in Chapter 15. Briefly, you can open TCP port 3128 in the firewall using the Security Level Configuration tool; the technique is elementary. If you want to add an iptables command to forward TCP port 80 traffic to TCP port 3128, you would use the following:

 # iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j REDIRECT \
 --to-ports 3128

Exam Watch

As described by the Red Hat Exam Prep guide, RHCEs must be able to use iptables to implement packet filtering and/or NAT.

As for SELinux, you'll at least want to use the Security Level Configuration tool to activate the Allow Squid Daemon To Connect To The Network option, also known as the squid_connect_any boolean. Alternatively, you can activate this boolean with the following command:

 # setsebool -P squid_connect_any 1

Exercise 9-6: Configuring Squid to Act as a Proxy Server

This exercise assumes you have a LAN. One of the computers on the LAN is also a server that is connected to the Internet, with Squid properly installed. Then you can configure Squid to act as a proxy for Web and FTP services for your LAN.

Open the Squid configuration file, /etc/squid/squid.conf, in a text editor.
Add the name of your computer to this file. Add the following command near the comments associated with visible_hostname:
```
 visible_hostname computername 
```
Add an http_access command to allow access from your local network. You can set an arbitrary name of your choice for the network, but you'll need to use the same name in the acl command in the next step. Place the command near the other http_access commands in this file, above the http_access deny all line:
```
 http_access allow lan_net 
```
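Configure access from your LAN to Squid with an appropriate acl command. Place it above the http_access line from the previous step, because Squid must read an ACL's definition before any directive that uses it. The following command defines lan_net as the network with an IP network address of 172.168.30.0: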
```
 acl lan_net src 172.168.30.0/24 
```
Save your changes and exit.
Stop the Squid service, if it is already running, with the service squid stop command.
Create Squid swap directories with the squid -z command.
Start the Squid service with the service squid start command.
Configure a test client such as a Web browser to use your Squid service.
Test your client by using both HTTP and FTP addresses in the browser address field. Use it to retrieve files from various sites on the Internet, such as www.redhat.com and ftp.kernel.org.
If problems occur from external systems, you may have configured a firewall and/or SELinux protection that you don't want. Open the Security Level Configuration tool. If you want a firewall, make sure that it allows traffic through TCP port 3128. If you want an active version of SELinux, make sure to activate the squid_connect_any boolean.