We are not concerned here with firewalls, so we take them for granted. The interesting thing is how we configure the proxy Apache to make life with a firewall tolerable to those behind it.
site.proxy has three subdirectories: cache, proxy, and real. The Config file from .../site.proxy/proxy is as follows:
User webuser
Group webgroup
ServerName www.butterthlies.com
Port 8000
ProxyRequests on
CacheRoot /usr/www/APACHE3/site.proxy/cache
CacheSize 1000
The points to notice are as follows:
On this site we use ServerName www.butterthlies.com.
The Port number is set to 8000 so we don't collide with the real web server running on the same machine.
We turn ProxyRequests on and provide a directory for the cache, which we will discuss later in this chapter.
CacheRoot is set up in a special directory.
CacheSize is set to 1000 kilobytes.
AllowCONNECT
AllowCONNECT port [port] ...
Default: AllowCONNECT 443 563
Server config, virtual host
Compatibility: AllowCONNECT is only available in Apache 1.3.2 and later.
The AllowCONNECT directive specifies a list of port numbers to which the proxy CONNECT method may connect. Today's browsers use this method when an https connection is requested and proxy tunneling over HTTP is in effect.
By default, only the default https port (443) and the default snews port (563) are enabled. Use the AllowCONNECT directive to override this default and allow connections to the listed ports only.
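As a minimal sketch of overriding the default, suppose the proxy should also tunnel to an https server on a nonstandard port. The port 8443 here is a hypothetical choice for illustration. Because the directive replaces the default list rather than adding to it, the standard ports have to be listed again if they are still wanted:

```apache
# Replaces the default list (443 563), so the defaults are repeated.
# 8443 is an assumed extra port, not one taken from the site above.
AllowCONNECT 443 563 8443
```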
ProxyRequests
ProxyRequests [on|off]
Default: off
Server config
This directive turns proxy serving on or off. Even if ProxyRequests is off, ProxyPass directives are still honored.
ProxyRemote
ProxyRemote match remote-server
Server config
This directive defines remote proxies to this proxy (that is, proxies that should be used for some requests instead of being satisfied directly). match is either the name of a URL scheme that the remote server supports, a partial URL for which the remote server should be used, or * to indicate that the server should be contacted for all requests. remote-server is the URL that should be used to communicate with the remote server (i.e., it is of the form protocol://hostname[:port]). Currently, only HTTP can be used as the protocol for the remote-server. For example:
ProxyRemote ftp http://ftpproxy.mydomain.com:8080
ProxyRemote http://goodguys.com/ http://mirrorguys.com:8000
ProxyRemote * http://cleversite.com
ProxyPass
ProxyPass path url
Server config
This directive runs on an ordinary server and translates requests for a named directory, and everything below it, into requests to a proxy server. So, on our ordinary Butterthlies site, we might want to pass requests for /secrets on to a proxy server, darkstar.com:
ProxyPass /secrets http://darkstar.com
Unfortunately, this is less useful than it might appear, since the proxy does not modify the HTML returned by darkstar.com. This means that URLs embedded in the HTML will refer to documents on the main server unless they have been written carefully. For example, suppose a document one.html is stored on darkstar.com with the URL http://darkstar.com/one.html, and we want it to refer to another document in the same directory. Then the following links will work, when accessed as http://www.butterthlies.com/secrets/one.html:
<A HREF="two.html">Two</A>
<A HREF="/secrets/two.html">Two</A>
<A HREF="http://darkstar.com/two.html">Two</A>
But this example will not work:
<A HREF="/two.html">Not two</A>
When accessed directly, through http://darkstar.com/one.html, these links work:
<A HREF="two.html">Two</A>
<A HREF="/two.html">Two</A>
<A HREF="http://darkstar.com/two.html">Two</A>
But the following doesn't:
<A HREF="/secrets/two.html">Two</A>
ProxyDomain
ProxyDomain domain
Server config
This directive tends to be useful only for Apache proxy servers within intranets. The ProxyDomain directive specifies the default domain to which the Apache proxy server will belong. If a request to a host without a fully qualified domain name is encountered, a redirection response to the same host with the configured domain appended will be generated. The point of this is that users on intranets often only type the first part of the domain name into the browser, but the server requires a fully qualified domain name to work properly.
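A short sketch of how this might look on an intranet proxy follows. The domain .mycompany.com is a placeholder, not a name from the Butterthlies site:

```apache
# Hypothetical intranet domain; note the leading dot.
ProxyDomain .mycompany.com
```

With this in place, a proxy request for http://www/ would be expected to produce a redirect to http://www.mycompany.com/.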
NoProxy
NoProxy { domain | subnet | ip_addr | hostname }
Server config
The NoProxy directive specifies a list of subnets, IP addresses, hosts, and/or domains, separated by spaces. A request to a host that matches one or more of these is always served directly, without forwarding to the configured ProxyRemote proxy server(s).
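NoProxy only has an effect alongside ProxyRemote, so the two are easiest to read together. The following sketch assumes a hypothetical upstream proxy firewall.mycompany.com on port 81 and an internal network 192.168.112.0/21; neither comes from the site described in this chapter:

```apache
# Send every request on to the (assumed) firewall proxy...
ProxyRemote * http://firewall.mycompany.com:81
# ...except requests for intranet hosts, which are fetched directly.
NoProxy .mycompany.com 192.168.112.0/21
```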
ProxyPassReverse
ProxyPassReverse path url
Server config, virtual host
A reverse proxy is a way to masquerade one server as another, perhaps because the "real" server is behind a firewall or because you want part of a web site to be served by a different machine without it looking that way. It can also be used to share load between several servers: the frontend server simply accepts requests and forwards them to one of several backend servers. The optional module mod_rewrite has features that support this.
This directive lets Apache adjust the URL in the Location response header. If ProxyPass (or mod_rewrite) has been used to do reverse proxying, this directive rewrites Location headers coming back from the reverse-proxied server so that they look as if they came from somewhere else (normally this server, of course).
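As a sketch, reusing the /secrets mapping from the ProxyPass entry, the two directives would normally be given the same path and URL so that redirects issued by the backend are mapped back onto the frontend's URL space:

```apache
# Forward /secrets and everything below it to the backend.
ProxyPass        /secrets http://darkstar.com
# Rewrite Location headers in the backend's redirects to match.
ProxyPassReverse /secrets http://darkstar.com
```

Under this configuration, a redirect from the backend to http://darkstar.com/two.html should reach the browser as a redirect to /secrets/two.html on this server. Only the Location header is adjusted; links inside the returned HTML are still passed through unmodified.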
ProxyVia
ProxyVia on|off|full|block
Default: ProxyVia off
Server config, virtual host
This directive controls the use of the Via: HTTP header by the proxy. Its intended use is to control the flow of proxy requests along a chain of proxy servers. See RFC 2068 (HTTP/1.1) for an explanation of Via: header lines.
If set to off, which is the default, no special processing is performed. If a request or reply contains a Via: header, it is passed through unchanged.
If set to on, each request and reply will get a Via: header line added for the current host.
If set to full, each generated Via: header line will additionally have the Apache server version shown as a Via: comment field.
If set to block, every proxy request will have all its Via: header lines removed. No new Via: header will be generated.
ProxyReceiveBufferSize
ProxyReceiveBufferSize bytes
Default: None
Server config, virtual host
The ProxyReceiveBufferSize directive specifies an explicit network buffer size for outgoing HTTP and FTP connections for increased throughput. It has to be greater than 512 or set to 0 to indicate that the system's default buffer size should be used.
ProxyReceiveBufferSize 2048
ProxyBlock
ProxyBlock *|word|host|domain [word|host|domain] ...
Default: None
Server config, virtual host
The ProxyBlock directive specifies a list of words, hosts, and/or domains, separated by spaces. HTTP, HTTPS, and FTP document requests to sites whose names contain matched words, hosts, or domains are blocked by the proxy server. During startup, the proxy module also attempts to determine the IP addresses of list items that may be hostnames and caches them so that they can be matched as well. For example:
ProxyBlock joes-garage.com some-host.co.uk rocky.wotsamattau.edu
rocky.wotsamattau.edu would also be matched if referenced by IP address.
Note that wotsamattau would also be sufficient to match wotsamattau.edu.
Note also that:
ProxyBlock *
blocks connections to all sites.
Copyright © 2003 O'Reilly & Associates. All rights reserved.