Contents:
HTTP Basics
Client Requests
Server Responses
The Hypertext Transfer Protocol (HTTP) is the language that Web clients and Web servers use to communicate with each other. It is essentially the backbone of the Web.
While HTTP is largely the realm of server and client programming, a firm understanding of HTTP is also important for CGI programming. In addition, sometimes HTTP filters back to the users--for example, when server error codes are reported in a browser window. In this book, we cover HTTP in four chapters:
All HTTP transactions follow the same general format. Each client request and server response has three parts: the request or response line, a header section, and the entity body. The client initiates a transaction as follows:
GET /index.html HTTP/1.0
This request line uses the GET method to request the document index.html using version 1.0 of HTTP. HTTP methods are discussed in more detail later in this chapter.
User-Agent: Mozilla/2.02Gold (WinNT; I)
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
The client sends a blank line to end the header.
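As a rough illustration of the request format described above, the following Python sketch assembles the request line, the header lines, and the terminating blank line into one string. The helper name build_request is hypothetical and exists only for this example; the sketch builds the text but does not open a network connection.

```python
def build_request(method, path, headers, version="HTTP/1.0"):
    """Assemble a raw HTTP request: request line, header lines, blank line.

    build_request is a hypothetical helper for illustration only.
    """
    lines = [f"{method} {path} {version}"]
    lines += [f"{name}: {value}" for name, value in headers]
    # Every line ends with CRLF; an empty line terminates the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


request = build_request(
    "GET",
    "/index.html",
    [
        ("User-Agent", "Mozilla/2.02Gold (WinNT; I)"),
        ("Accept", "image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*"),
    ],
)
print(request)
```

A GET request like this one carries no entity body, so the blank line is the last thing the client sends.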
The server responds in the following way to the client's request:
HTTP/1.0 200 OK
This status line indicates that the server uses version 1.0 of HTTP in its response. A status code of 200 means that the client's request was successful and the requested data will be supplied after the headers. Chapter 18, Server Response Codes, contains a listing of the status codes and their descriptions.
Date: Fri, 20 Sep 1996 08:17:58 GMT
Server: NCSA/1.5.2
Last-modified: Mon, 17 Jun 1996 21:53:08 GMT
Content-type: text/html
Content-length: 2482
A blank line ends the header.
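To make the three parts of a response concrete, here is a minimal Python sketch that splits a raw response into its status line, headers, and entity body. The function name parse_response is hypothetical, and the short body is a placeholder that does not match the Content-length value shown; a real client would also need to handle details this sketch ignores, such as folded header lines.

```python
def parse_response(raw):
    """Split a raw HTTP response into status line parts, headers, and body.

    parse_response is a hypothetical helper for illustration only.
    """
    # The first blank line separates the header section from the entity body.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        # Split on the first colon only; values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


raw = (
    "HTTP/1.0 200 OK\r\n"
    "Date: Fri, 20 Sep 1996 08:17:58 GMT\r\n"
    "Server: NCSA/1.5.2\r\n"
    "Last-modified: Mon, 17 Jun 1996 21:53:08 GMT\r\n"
    "Content-type: text/html\r\n"
    "Content-length: 2482\r\n"
    "\r\n"
    "<html>...</html>"  # placeholder body, not the real 2482-byte document
)
version, code, reason, headers, body = parse_response(raw)
print(version, code, reason)
print(headers["content-type"], headers["content-length"])
```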
In HTTP 1.0, after the server has finished sending the requested data, it disconnects from the client and the transaction is over unless a Connection: Keep-Alive header is sent. In HTTP 1.1, however, the default is for the server to maintain the connection and allow the client to make additional requests. Since many documents embed other documents as inline images, frames, applets, etc., this saves the overhead of the client having to repeatedly connect to the same server just to draw a single page. Under HTTP 1.1, therefore, the transaction might cycle back to the beginning, until either the client or server explicitly closes the connection.
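The difference between the two defaults can be sketched as a small decision function. This is a simplified illustration rather than a complete implementation of connection management: connection_persists is a hypothetical name, and the headers argument is assumed to be a dictionary keyed by lowercase header name.

```python
def connection_persists(version, headers):
    """Return True if the connection should stay open after the response.

    Hypothetical helper: `headers` maps lowercase header names to values.
    """
    # The Connection header may carry several comma-separated tokens.
    raw = headers.get("connection", "")
    tokens = {token.strip().lower() for token in raw.split(",")}
    if version == "HTTP/1.1":
        # HTTP 1.1 keeps the connection open unless told to close it.
        return "close" not in tokens
    # HTTP 1.0 closes the connection unless keep-alive is requested.
    return "keep-alive" in tokens


print(connection_persists("HTTP/1.0", {}))
print(connection_persists("HTTP/1.0", {"connection": "Keep-Alive"}))
print(connection_persists("HTTP/1.1", {}))
print(connection_persists("HTTP/1.1", {"connection": "close"}))
```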
Being a stateless protocol, HTTP does not maintain any information from one transaction to the next, so the next transaction needs to start all over again. The advantage is that an HTTP server can serve a lot more clients in a given period of time, since there's no additional overhead for tracking sessions from one connection to the next. The disadvantage is that more elaborate CGI programs need to use hidden input fields (as described in Chapter 10, HTML Form Tags) or external tools such as Netscape cookies (as described in Chapter 12, Cookies) to maintain information from one transaction to the next.
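As one possible illustration of carrying state across transactions with cookies, the sketch below uses Python's standard http.cookies module. The cookie name session and the value abc123 are made up for the example. The first half shows a server producing a Set-Cookie header, and the second half shows a server reading the Cookie value that the client returns on a later request.

```python
from http.cookies import SimpleCookie

# First transaction: the server hands the client a cookie.
# The name "session" and value "abc123" are invented for this example.
outgoing = SimpleCookie()
outgoing["session"] = "abc123"
set_cookie_header = outgoing.output()
print(set_cookie_header)

# Later transaction: the client sends the cookie back in a Cookie header,
# and the server parses the header value to recover the stored state.
incoming = SimpleCookie()
incoming.load("session=abc123")
session_id = incoming["session"].value
print(session_id)
```

Because the server keeps no memory between transactions, the client must return the cookie with each request for the state to survive.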