So far in this chapter we have described how to implement stateful applications using sessions, but we have not discussed when they should or should not be used. Sessions allow some kinds of applications to be developed that otherwise would be difficult to implement on the Web. However, because HTTP is a stateless protocol, building a stateful application can present problems and restrictions. Avoiding the need to maintain state information is often a desirable goal. In this section we list some reasons sessions are used and some reasons to avoid them.
Sessions can be used in web database applications for several reasons. Many traditional database applications use sessions to help control user interaction, while other applications use sessions to reduce server processing.
In a stateless environment, an application may need to repeat an expensive operation. An example might be a financial calculation that requires many SQL statements and calls to mathematics libraries before displaying the results on several web pages. An application that uses a session variable to remember the result exposes the user, and the server, to the cost of the calculation only once.
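The idea of paying for an expensive operation only once can be sketched as follows. This is a minimal illustration in Python rather than PHP, and it rests on assumptions: the `session` dictionary stands in for one user's session variables, and `expensive_calculation` is a hypothetical placeholder for the SQL statements and mathematics library calls described above.

```python
# Hypothetical stand-in for one user's session variables.
session = {}

# Counts how often the expensive work really runs (for illustration only).
calls = {"count": 0}


def expensive_calculation(principal, rate, years):
    """Placeholder for many SQL statements and mathematics library calls."""
    calls["count"] += 1
    return round(principal * (1 + rate) ** years, 2)


def get_projection(session, principal, rate, years):
    """Return the result from the session, computing it only on first use."""
    key = ("projection", principal, rate, years)
    if key not in session:
        session[key] = expensive_calculation(principal, rate, years)
    return session[key]


# Two page requests in the same session trigger only one calculation.
first = get_projection(session, 1000, 0.05, 2)
second = get_projection(session, 1000, 0.05, 2)
```

The cache key includes the inputs, so a request with different parameters in the same session is recalculated rather than served a stale result.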
Often a database application, or indeed any application, needs to present a series of screens in a controlled order. One style of application, known as a wizard, guides a user through what would otherwise be a complex task with a series of screens. Wizards are sometimes used for complex configurations, such as some software installations, and often alter the flow of screens based on user input. Some applications require that a user enter via a known page. Applications such as online banking often force a user to enter via a login page rather than allow direct access to a function such as funds transfer.
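One way to enforce a controlled order of screens is to record the completed steps in the session and check them on every request. The sketch below is an assumption-laden illustration in Python: the page names in `WIZARD_STEPS` and the helper functions `resolve_page` and `complete_step` are hypothetical, and the `session` dictionary stands in for the session variables.

```python
# Hypothetical page names, in the order the user must visit them.
WIZARD_STEPS = ["login", "details", "confirm"]


def resolve_page(session, requested):
    """Return the page to display for a request.

    The requested page is shown only if every earlier step is complete;
    otherwise the user is sent to the first unfinished step.
    """
    completed = session.setdefault("completed", [])
    for step in WIZARD_STEPS:
        if step == requested:
            return requested  # all earlier steps are complete
        if step not in completed:
            return step  # send the user back to the first unfinished step
    return WIZARD_STEPS[0]  # unknown page: start from the entry page


def complete_step(session, step):
    """Record in the session that a step has been finished."""
    completed = session.setdefault("completed", [])
    if step not in completed:
        completed.append(step)


session = {}
page_before_login = resolve_page(session, "confirm")   # redirected to "login"
complete_step(session, "login")
page_after_login = resolve_page(session, "details")    # "details" is now allowed
```

A user who requests a later page directly, for example from a bookmark, is routed to the entry page instead.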
Many database applications validate data before creating or updating a record in the database, preventing erroneous data from being saved. Sessions can keep the intermediate data, so that incomplete data can be edited, rather than rekeyed, when errors are detected. Earlier in this chapter we used sessions to improve the interaction between the client entry <form> and validation scripts of the winestore application. In the case study, the fields entered by the user are held in an array as a session variable until the validation is successful. Another example where intermediate results can be used is when a database application collects and validates data for a single record over a number of fill-in forms. A shopping cart is an example where complete data may not be created until a user requests a purchase. The winestore application doesn't implement the shopping cart this way; rather, a shopping cart is implemented by creating a row in the orders table and adding rows to the items table as items are selected. The winestore application then needs to store only the cust_id and the order_no (the combination is the primary key of the orders table) as session variables while a shopping cart is being used. We develop the shopping cart in Chapter 11.
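The pattern of holding submitted fields in the session until validation succeeds might look like the following. This is a Python sketch, not the winestore code: the field names, the validation rules, and the keys `form_fields` and `form_errors` are assumptions chosen for illustration.

```python
def validate(fields):
    """Return a dictionary of error messages keyed by field name."""
    errors = {}
    if not fields.get("surname", "").strip():
        errors["surname"] = "The surname field cannot be blank."
    if "@" not in fields.get("email", ""):
        errors["email"] = "The email address must contain an @ character."
    return errors


def handle_submission(session, submitted):
    """Keep the user's input in the session until it passes validation."""
    session["form_fields"] = dict(submitted)
    errors = validate(submitted)
    if errors:
        # The form can be redisplayed with the earlier values filled in.
        session["form_errors"] = errors
        return "redisplay form"
    # Validation succeeded: the intermediate data is no longer needed.
    session.pop("form_fields", None)
    session.pop("form_errors", None)
    return "save record"


session = {}
outcome = handle_submission(session, {"surname": "Smith", "email": "no-at-sign"})
```

After the failed submission, `session["form_fields"]` still holds the surname, so the user corrects only the email address.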
Sessions can personalize a web site. Personalization not only includes background color or layout alternatives, but can include recording a user's interests and modifying searches. The winestore application could record favorite regions or a buyer's price range as session variables; each query could then be modified to reflect these settings. A result screen could then display "wines from your favorite regions within your budget" before displaying other wines.
The reasons to avoid sessions focus mainly on the stateless nature of HTTP. The features of HTTP that support browsing access to a disparate collection of resources don't support stateful applications. Stateful applications often work over the Web only at the expense of HTTP features.
In an application that uses sessions, each HTTP request needs to be processed in the context of the session variables to which that request belongs. The state information recorded as the result of one request needs to be available to subsequent requests. Most applications that implement sessions store session variables in the middle tier. Once a session is created, all subsequent requests must be processed on the web server that holds the session variables. This requirement prevents such applications from using HTTP to distribute requests across multiple servers, so they can't easily scale horizontally to handle large numbers of requests.[12] One way for a web database application to allow multiple web servers is to store session variables in the database tier. This approach is described in Appendix D, where we provide a PHP and MySQL implementation of a database-tier session store.
[12] Scaling up an application (increasing the number of requests an application can respond to in a given period) can be achieved horizontally by providing more machines, and vertically by providing a single bigger, faster, or more efficient machine.
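The idea of a database-tier session store can be sketched briefly. The example below is not the Appendix D implementation: it is a Python illustration that substitutes an in-memory SQLite database for MySQL, and the `sessions` table layout, the `DatabaseSessionStore` class, and the session identifier are all assumptions. Two store objects sharing one database stand in for two web servers.

```python
import json
import sqlite3


class DatabaseSessionStore:
    """Keeps session variables in a database table instead of on one web server."""

    def __init__(self, connection):
        self.connection = connection
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS sessions ("
            "session_id TEXT PRIMARY KEY, data TEXT NOT NULL)"
        )

    def write(self, session_id, variables):
        """Serialize the session variables and store them under the session ID."""
        self.connection.execute(
            "INSERT OR REPLACE INTO sessions (session_id, data) VALUES (?, ?)",
            (session_id, json.dumps(variables)),
        )
        self.connection.commit()

    def read(self, session_id):
        """Return the stored variables, or an empty dictionary for an unknown ID."""
        row = self.connection.execute(
            "SELECT data FROM sessions WHERE session_id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else {}


# One shared database; two store objects stand in for two web servers.
database = sqlite3.connect(":memory:")
server_a = DatabaseSessionStore(database)
server_b = DatabaseSessionStore(database)

server_a.write("abc123", {"cust_id": 7, "order_no": 42})
restored = server_b.read("abc123")  # the second server sees the same state
```

Because neither server holds the state itself, any server can process any request, at the cost of a database access per request.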
When a server that offers session management processes a request, there is the unavoidable overhead of identifying and accessing session variables. The session overhead results in longer processing times for requests, which affects the performance and capacity of a site. While sessions can improve application performance—for example, a session can keep the result of an expensive operation—the gains may be limited and outweighed by the extra processing required. Servers that manage session variables in memory require more memory. As the amount of memory used by the web server grows, a system may need to move portions of memory to disk—an operation known as swapping. Swapping memory in and out of disk storage is slow and can severely degrade the performance of a server. Servers that use files—such as the default PHP session management—incur the cost of reading and writing a file on disk each time a session is accessed.
Sessions can also cause synchronization problems. Because HTTP is stateless, there is no way of knowing when a user has really finished with an application. Other network applications can detect that a connection has been dropped and clean up the state that was held on behalf of that user, even if the user did not use a logout procedure (such as typing exit or clicking on a logout button). Telnet is one such application: a user makes a connection to a remote system over the Internet. Unlike HTTP, however, the TCP/IP connection for Telnet is kept open for the length of the session, and if the connection is lost (say, if the client's PC crashes or the power is lost) the user is logged out of the remote system. With a session over the Web, the server doesn't know about these events and has to decide how long to keep the session information. In the case of PHP session management, a garbage collection scheme is used, as we discussed earlier in this chapter.
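A timeout-based cleanup of this kind can be sketched as follows. This is a simplified Python illustration, not PHP's actual garbage collector: the `sessions` dictionary, the `touch` and `collect_garbage` helpers, and the explicit `now` timestamps are assumptions made so the example is deterministic. The 1440-second lifetime matches the default value of PHP's `session.gc_maxlifetime` setting.

```python
MAX_LIFETIME = 1440  # seconds; the default of PHP's session.gc_maxlifetime


def touch(sessions, session_id, now):
    """Record that a session was used at time `now` (seconds)."""
    entry = sessions.setdefault(session_id, {"variables": {}})
    entry["last_access"] = now


def collect_garbage(sessions, now, max_lifetime=MAX_LIFETIME):
    """Delete sessions idle for longer than max_lifetime; return their IDs."""
    expired = [
        session_id
        for session_id, entry in sessions.items()
        if now - entry["last_access"] > max_lifetime
    ]
    for session_id in expired:
        del sessions[session_id]
    return expired


sessions = {}
touch(sessions, "alice", now=0)     # last seen at t = 0 s
touch(sessions, "bob", now=1000)    # last seen at t = 1000 s
removed = collect_garbage(sessions, now=2000)
```

The server cannot tell whether the first user logged out, crashed, or simply walked away; it only knows the session has been idle for too long.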
Because HTTP is stateless, browsers allow users to save URLs as a list of bookmarks or favorite sites. The user can return to a web site at a later date by simply selecting a bookmarked URL. Web sites that provide weather forecasts, stock prices, and even search results from a web search engine are examples of the sites a user might want to bookmark. Consider the URL for a fictional site that provides stock prices:
http://www.someexchange.com/stockprice.php?code=SIMCO
The URL encodes a query that identifies a particular stock, and presumably, the script stockprice.php uses the query to display the current stock price of the company. The URL can be bookmarked because it contains all that is needed to generate the stock price page for the given company code. An alternative site may collect the company code using a <form> and, when the form is submitted, use a session variable to hold the company code as a query. The script that generates the stock price page reads the session variable, looks up the current price, and generates the result for the entered company code. If a user bookmarks the session-based stock price page and comes back in a week, the session that stored the company code is unlikely to still exist, and the script fails to display the desired company's stock price.
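The difference between the two designs can be made concrete with a small sketch. It is written in Python for illustration, and the helper names `code_from_url` and `code_from_session` are hypothetical. The first function needs only the bookmarked URL; the second depends on a session that may no longer exist.

```python
from urllib.parse import parse_qs, urlparse


def code_from_url(url):
    """Recover the company code from the query string of the URL itself."""
    query = parse_qs(urlparse(url).query)
    values = query.get("code")
    return values[0] if values else None


def code_from_session(session):
    """Recover the company code from a session variable, if it still exists."""
    return session.get("code")


bookmark = "http://www.someexchange.com/stockprice.php?code=SIMCO"

# A week later: the URL still carries everything the script needs...
code_via_url = code_from_url(bookmark)

# ...but the session that held the code has expired and been replaced.
fresh_session = {}
code_via_session = code_from_session(fresh_session)
```

Here `code_via_url` is still the company code from the bookmark, while `code_via_session` is `None`, so the session-based script has nothing to look up.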
Sometimes bookmarking a page makes no sense. Consider an online banking application that allows transfer of funds between two accounts. A user would log in to the application, then request the transfer page that collects the source and target account details in a <form>. When that <form> is submitted, a confirmation page is shown without actually performing the transaction. Revisiting this page through a bookmark has no meaning if the transaction was subsequently confirmed or canceled. Generally, the pages generated from applications such as online banking can't be bookmarked because of the reliance on session variables. Session management in such applications is often tied closely to authentication, a topic explored further in Chapter 9.
Sessions can provide a way for a hacker to break into a system. Sessions can be open to hijacking; a hacker can take over after a legitimate user has logged into an application. There is much debate about the security of session-based applications on the Web, and we discuss some issues of session security in the next chapter.
Copyright © 2003 O'Reilly & Associates. All rights reserved.