It is well known that the Web is populated by mean and unscrupulous people who want to mess up your site. Many conservative citizens think that a firewall is the way to stop them. The purpose of a firewall is to prevent the Internet from connecting to arbitrary machines or services on your own LAN/WAN. Another purpose, depending on your environment, may be to stop users on your LAN from roaming freely around the Internet.
The term firewall does not mean anything standard. There are lots of ways to achieve the objectives just stated. Two extremes are presented in this section, and there are lots of possibilities in between. This is a big subject: here we are only trying to alert the webmaster to the problems that exist and to sketch some of the ways to solve them. For more information on this subject, see Building Internet Firewalls, by D. Brent Chapman and Elizabeth D. Zwicky (O'Reilly, 2000).
This technique is the simplest firewall. In essence, you restrict packets that come in from the Internet to safe ports. Packet-filter firewalls are usually implemented using the filtering built into your Internet router. This means that no access is given to ports below 1024 except for certain specified ones connecting to safe services, such as SMTP, NNTP, DNS, FTP, and HTTP. The benefit is that access is denied to potentially dangerous services, such as the following:
The advantages of packet filtering are that it's quick and easy. But there are at least two disadvantages:
Even the standard services can have bugs allowing access. Once a single machine is breached, the whole of your network is wide open. The horribly complex program sendmail is a fine example of a service that has, over the years, aided many a cracker.
Someone on the inside, cooperating with someone on the outside, can easily breach the firewall.
Another problem that can't exactly be called a disadvantage is that if you filter packets for a particular service, then you should almost certainly not be running the service of binding it to a backend network so the Internet can't see it — which would then make the packet filter somewhat redundant.
A more extreme firewall implementation involves using separate networks. In essence, you have two packet filters and three separate, physical, networks: Inside, Inbetween (often known as Demilitarized Zone [DMZ]), and Outside (see Figure 11-1). There is a packet-filter firewall between Inside and Inbetween, and between Outside and the Internet. A nonrouting host,[41] known as a bastion host, is situated on Inbetween and Outside. This host mediates all interaction between Inside and the Internet. Inside can only talk to Inbetween, and the Internet can only talk to Outside.
[41]Nonrouting means that it won't forward packets between its two networks. That is, it doesn't act as a router.
Administrators of the bastion host have more or less complete control, not only over network traffic but also over how it is handled. They can decide which packets are permitted (with the packet filter) and also, for those that are permitted, what software on the bastion host can receive them. Also, since many administrators of corporate sites do not trust their users further than they can throw them, they treat Inside as if it were just as dangerous as Outside.
Separate networks take a lot of work to configure and administer, although an increasing number of firewall products are available that may ease the labor. The problem is to bridge the various pieces of software to cause it to work via an intermediate machine, in this case the bastion host. It is difficult to be more specific without going into unwieldy detail, but HTTP, for instance, can be bridged by running an HTTP proxy and configuring the browser appropriately, as we saw in Chapter 9. These days, most software can be made to work by appropriate configuration in conjunction with a proxy running on the bastion host, or else it works transparently. For example, Simple Mail Transfer Protocol (SMTP) is already designed to hop from host to host, so it is able to traverse firewalls without modification. Very occasionally, you may find some Internet software impossible to bridge if it uses a proprietary protocol and you do not have access to the client's source code.
SMTP works by looking for Mail Exchange (MX) records in the DNS corresponding to the destination. So, for example, if you send mail to our son and brother Adam[42] at [email protected], an address that is protected by a firewall, the DNS entry looks like this:
[42]That is, he's the son of one of us and the brother of the other.
# dig MX aldigital.algroup.co.uk ; <<>> DiG 2.0 <<>> MX aldigital.algroup.co.uk ;; ->>HEADER<<- opcode: QUERY , status: NOERROR, id: 6 ;; flags: qr aa rd ra ; Ques: 1, Ans: 2, Auth: 0, Addit: 2 ;; QUESTIONS: ;; aldigital.algroup.co.uk, type = MX, class = IN ;; ANSWERS: aldigital.algroup.co.uk. 86400 MX 5 knievel.algroup.co.uk. aldigital.algroup.co.uk. 86400 MX 7 arachnet.algroup.co.uk. ;; ADDITIONAL RECORDS: knievel.algroup.co.uk. 86400 A 192.168.254.3 arachnet.algroup.co.uk. 86400 A 194.128.162.1 ;; Sent 1 pkts, answer found in time: 0 msec ;; FROM: arachnet.algroup.co.uk to SERVER: default -- 0.0.0.0 ;; WHEN: Wed Sep 18 18:21:34 1996 ;; MSG SIZE sent: 41 rcvd: 135
What does all this mean? The MX records have destinations (knievel and arachnet) and priorities (5 and 7). This means "try knievel first; if that fails, try arachnet." For anyone outside the firewall, knievel always fails, because it is behind the firewall[43] (on Inside and Inbetween), so mail is sent to arachnet, which does the same thing (in fact, because knievel is one of the hosts mentioned, it tries it first then gives up). But it is able to send to knievel, because knievel is on Inbetween. Thus, Adam's mail gets delivered. This mechanism was designed to deal with hosts that are temporarily down or with multiple mail delivery routes, but it adapts easily to firewall traversal.
[43]We know this because one of the authors (BL) is the firewall administrator for this particular system, but, even if we didn't, we'd have a big clue because the network address for knievel is on the network 192.168.254, which is a "throwaway" (RFC 1918) net and thus not permitted to connect to the Internet.
This affects the Apache user in three ways:
Copyright © 2003 O'Reilly & Associates. All rights reserved.