So far in this chapter we have presented techniques that control access to resources—in particular, PHP scripts—based around HTTP authentication. The simplest technique discussed so far is to configure Apache to perform the authentication and authorization. For greater flexibility, we have described how PHP can manage the authentication process, allowing scripts to apply whatever logic is required to meet the authorization needs.
In this section we discuss issues of building web database applications:
Examining why HTTP authentication works well with stateless applications
Showing how a stateful application might manage HTTP authentication and the issues that are faced when building session-based web database applications
Discussing some reasons why HTTP authentication may not be suitable for all applications
Developing an authentication framework that can be used in a web database application illustrating the techniques presented in this section and earlier in this chapter
HTTP authentication is particularly well suited to stateless applications. HTTP authentication protects sets of resources, or realms, by challenging requests that don't contain authenticated credentials. We described the HTTP authentication process at the beginning of this chapter. Once an authenticated set of credentials has been collected for a realm, the user can browse the resources protected by that realm. For example, a web site may contain a set of browsable files—resources—on a web server. It doesn't matter which resource is requested; the first time a user accesses the site, she is challenged. Once the credentials are established, the user can browse the resources unchallenged.
HTTP authentication also supports bookmarking—the ability to add URLs to a list of bookmarks or favorite sites. The user can request the protected resource from the web site at a later date by selecting a bookmarked URL. If the user has not visited that site for some time, the request is challenged and the user is prompted for a username and password.
The techniques we have presented so far in this chapter can authenticate stateless applications. If you configure Apache to authenticate requests to an application's PHP scripts, no extra code needs to be written. If more authorization control is required, a function similar to the authenticateUser( ) function, shown in Example 9-7, can be included at the start of each script.
Building stateful web applications requires special care because of the stateless nature of HTTP. In Chapter 8 we presented session management as a technique for building stateful applications. Many web database applications, such as online banking, require both authentication and session management. We now look at some of the issues that arise when building session-based applications that require user authentication.
Many traditional database applications require users to log in before they can perform any operations. For example, an online banking application may allow access only after a user has entered credentials from a login page. In session-based applications, forcing users to always authenticate themselves via a login script allows session variables to be registered so that the rest of the application pages operate correctly. A single point of entry can also record when users access an application or force users to view advertising.
Using HTTP authentication, if a user makes a request for a script other than the login page of the application, and the request doesn't contain the Authorization header field, the response should redirect the user to the login page. This fragment of code sets the Location header field, which instructs the browser to relocate to the login page if either the $PHP_AUTH_USER or $PHP_AUTH_PW variables aren't set:
<?php
  // If this is an unauthorized request, just
  // relocate to the login page of the application
  if (!isset($PHP_AUTH_USER) || !isset($PHP_AUTH_PW))
  {
    header("Location: login.php");
    exit( );
  }

  // ... perform authentication and authorization ...
?>
... rest of script ...
HTTP authentication provides a simple mechanism for building applications that need to control user access. HTTP authentication supports stateless applications well and, with additional coding, can support stateful, session-based applications. However, HTTP authentication may not meet the requirements of some web database applications. Consider the following problems of HTTP authentication:
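HTTP authentication has no notion of logging out: once a user has entered credentials, the browser typically keeps sending them with every request to the realm until the browser is closed, so a user who walks away from the machine leaves the application open. Applications can be written to minimize this risk. By writing scripts that deliberately respond as unauthorized to a request that contains authenticated credentials, an application can enforce the intention of a logout. However, the application has to remember that the user logged out (or timed out) and respond accordingly. Such schemes lead to clumsy interactions with the user.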
Basic HTTP authentication also doesn't allow users to authenticate themselves with credentials other than a username and a password. For example, you may want to allow a user who has forgotten his password to go to an alternate login page that asks for his date of birth, his mother's maiden name, or other personal details. For this kind of application you should collect a new password and restrict the number of attempts at the alternate login screen; otherwise, there could be a security risk.
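The advice to restrict the number of attempts can be sketched as a small counter keyed by username. This is only an illustration: the function names, the limit of three attempts, and the plain array standing in for a database table are all assumptions, not part of the winestore code.

```php
<?php
// Hypothetical helpers: limit attempts at an alternate login page.
// $attempts stands in for a per-user counter that a real application
// would keep in a database table.

// Return true if the user may still try the alternate login
function alternateLoginAllowed($attempts, $username, $maxAttempts = 3)
{
    $used = isset($attempts[$username]) ? $attempts[$username] : 0;
    return $used < $maxAttempts;
}

// Record one failed attempt and return the updated counters
function recordFailedAttempt($attempts, $username)
{
    if (!isset($attempts[$username]))
        $attempts[$username] = 0;
    $attempts[$username]++;
    return $attempts;
}
```

Keeping the counter with the user record, rather than in the session, means that discarding the session cookie doesn't reset the count.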
Some applications require multiple logins. For example, an application might be a corporate information system that requires all users to log in for basic access but then requires an additional username and password to access a restricted part of the site. HTTP doesn't allow for multiple Authorization header fields in a single request.
Authentication can be built into session-based applications by collecting user credentials in a <form>. When the <form> is submitted, the username and password are authenticated, and the authenticated state is recorded as a session variable. The authentication and authorization techniques developed earlier in this chapter—for example the authenticateUser( ) function shown in Example 9-7—can easily be modified to work with <form> data rather than $PHP_AUTH_USER and $PHP_AUTH_PW.
Collecting user credentials in a <form> and storing the authenticated state in a session has disadvantages. First, the username and password aren't encoded—not even in a basic form—when passed from the browser to the web server. This problem is solved by using the Secure Sockets Layer protocol as discussed later in this chapter. Second, session hijacking may arise because the state of the session is used to control access to the application.
By using the authenticated state stored as a session variable, a session-based application can be open to hijacking. When a request is sent to a session-based application, the browser includes the session identifier, usually as a cookie, to access the authenticated session. Rather than snoop for usernames and passwords, a hacker can use a session ID to hijack an existing session. Consider an online banking application in which a hacker waits for a real user to log in, takes over the session by including the session ID in a request, and transfers funds into his own account. If the connection isn't encrypted, it is easy to read the session ID or, for that matter, the username and password. We recommend that any application that transmits usernames, passwords, cookies that identify sessions, or personal details be protected using encryption.
Even if the connection is encrypted, the session ID may still be vulnerable. If the session ID is stored in a cookie on the client, it is possible to trick the browser into sending the cookie unencrypted. This can happen if the cookie was set up by the server without the secure parameter that prevents cookie transmission over an insecure connection. Cookie parameters and how to set up PHP session management to secure cookies are discussed in Chapter 8.
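As a minimal sketch of the secure parameter described above, and assuming the whole application is served over an encrypted connection, the session cookie can be marked secure before the session is started. The path and domain values here are placeholders.

```php
<?php
// Must run before session_start( ) and before any output is sent.
// Arguments: lifetime (0 = until the browser closes), path, domain, secure.
// The path "/" and the empty domain are assumed values for illustration.
session_set_cookie_params(0, "/", "", true);

// session_start( ) would follow here in a real script
```

With the secure parameter set to true, the browser is asked to return the session cookie only over an encrypted connection.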
Hijack attempts can also be less sophisticated. A hacker can hijack a session by randomly trying session IDs in the hope that an existing session might be found. On a busy site many hundreds of sessions might exist at any one time, increasing the chance of the success of such an attack. One precaution is to reduce the number of idle sessions by setting short maximum lifetimes for dormant sessions, as discussed in Chapter 8.
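Apart from the session lifetime settings covered in Chapter 8, a script can also check for idleness itself. The following sketch assumes the application records the time of the last request in a session variable; the function name and the 15-minute default are hypothetical.

```php
<?php
// Hypothetical helper: decide whether a session has been idle too long.
// $lastActivity and $now are Unix timestamps, as returned by time( ).
function sessionIsStale($lastActivity, $now, $maxIdleSeconds = 900)
{
    return ($now - $lastActivity) > $maxIdleSeconds;
}
```

A protected script could call this with the stored timestamp and time( ), destroying the session and relocating to the login page when the result is true.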
Another precaution is to use session IDs that are hard to guess. The default PHP session management uses a random number, which can be configured with a random seed, passed through an MD5 hashing algorithm to generate a 32-character ID. The randomness and use of MD5 hashing in PHP session IDs make them much harder to guess than an ID based on other parameters, such as the system time, the client IP address, or the username.
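To give a feel for what a hard-to-guess identifier looks like, the sketch below builds a 32-character hexadecimal string from random bytes. It is not PHP's internal session ID generator; it swaps in random_bytes( ), which assumes PHP 7 or later, and the function name is hypothetical.

```php
<?php
// Illustration only: this is not how PHP builds its own session IDs.
// random_bytes( ) draws 16 bytes from the operating system's secure
// random source; bin2hex( ) renders them as 32 hexadecimal characters.
function unpredictableId()
{
    return bin2hex(random_bytes(16));
}
```

Nothing in the result depends on the time, the client address, or the username, so knowing those values gives an attacker no help in guessing it.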
Earlier in this chapter we showed how to access the IP address of the browser when processing a request. The script shown in Example 9-5 checked the IP address set in the $REMOTE_ADDR variable against a hardcoded value to limit access to users on a particular subnet.
The IP address of the client that sent a request can be used to help prevent session hijacking. If the IP address set in the $REMOTE_ADDR variable is recorded as a session variable when a user initially connects to an application, subsequent requests can be checked and allowed only if they are sent from the same IP address.
TIP: Using the IP address as recorded from the HTTP request has limitations. Network administrators often configure proxy servers to hide the originating IP address by replacing it with the address of the proxy server. All users who connect to an application via such a proxy server appear to be located on the one machine. Some large sites—such as a large university campus—might even have several proxy servers to balance load, so successive requests coming from a single user might appear to change address.
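One possible way to soften the load-balanced proxy problem, which the case study in this chapter does not use, is to compare only the leading part of the two addresses. The sketch below assumes IPv4 addresses and that the proxies share a network prefix; the function name and the default of 24 bits are hypothetical, and the looser check is correspondingly weaker.

```php
<?php
// Hypothetical helper: true if two IPv4 addresses share the same
// leading $prefixBits bits (24 bits = the first three octets).
function sameNetwork($loginIp, $requestIp, $prefixBits = 24)
{
    $login   = ip2long($loginIp);
    $request = ip2long($requestIp);

    // ip2long( ) returns false for a malformed address
    if ($login === false || $request === false)
        return false;

    // Build a mask with the top $prefixBits bits of the address set
    $mask = -1 << (32 - $prefixBits);

    return ($login & $mask) === ($request & $mask);
}
```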
The case study example in this chapter is an authentication framework that doesn't rely on HTTP authentication to collect the username and password. The scripts developed in the case study illustrate how several techniques are applied and how the issues raised relating to session-based applications are solved. In this case study, we:
Develop a login <form> to collect user credentials
Authenticate the user credentials against encrypted passwords stored in the customer table
Use the IP address of the login request to deny access to requests from other machines
Develop a function that is included on each page to deny access without a successful login
Develop a logout function
Each customer of the winestore has an entry in the customer table that records confidential account details, including delivery address and credit-card details. Given such information, there is a good reason to restrict access to the application and protect confidential data.
We design the login page as a <form>, and the authentication is handled by the script that processes POST variables. The POST method is used rather than the GET method to prevent the username and password from appearing in the URL. The authentication uses a query on the customer table to check the credentials; we use the approach described in Section 9.3.
We create a session to record the username that is authenticated and the IP address of the machine from which the login request originated. Each protected script then tests for the existence of the session variables that hold the authenticated name and the originating IP address and checks these against the originating IP address of the request for that script.
While the pages we have developed on the online winestore site are more attractive than the examples in this section, the structure of the code is the same.
The login page displays a <form> that collects a username and password and is used as the entry point for winestore customers. The login page is also used when a login attempt fails, as the destination page when a member logs out, and as a warning page when an unauthorized request is made to a script that requires a user to log in. Also, if a user that is already authorized requests the login page, we display a message to indicate that the user is already logged on. Figure 9-3 shows the rendered login <form> with a message showing a failed login attempt.
Example 9-8 shows the login script with two helper functions that generate the HTML. The function login_page( ) generates the HTML <form> that collects two named <input> fields: formUsername and formPassword. The argument $loginMessage passes any error or warning messages the login page needs to display. If the $loginMessage is set, a formatted message is generated and included in the page. When the <form> is submitted, the formUsername and formPassword fields are encoded as POST variables and are processed by the script that performs the authentication.
The function logged_on_page( ) in Example 9-8 generates the HTML that is used when the script detects that a user has already logged in to the application. The main part of the script initializes a session and checks if the user has already been authorized. If the session variable authenticatedUser is registered, the user has already been authorized and the function logged_on_page( ) is called. If not, the entry <form> is displayed by calling the function login_page( ), and the session is destroyed.
<?php
  // Function that returns the HTML FORM that is
  // used to collect the username and password
  function login_page($errorMessage)
  {
    // Generate the login page
?>
<!DOCTYPE HTML PUBLIC
    "-//W3C//DTD HTML 4.0 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd" >
<html>
<head><title>Login</title></head>
<body>
<h2>Winestore Login Page</h2>
<form method="POST" action="example.9-9.php">
<?
    // Include the formatted error message
    if (isset($errorMessage))
      echo "<h3><font color=red>$errorMessage</font></h3>";

    // Generate the login <form> layout
?>
  <table>
    <tr><td>Enter your username:</td>
        <td><input type="text" size=10
             maxlength=10 name="formUsername"></td></tr>
    <tr><td>Enter your password:</td>
        <td><input type="password" size=10
             maxlength=10 name="formPassword"></td></tr>
  </table>
  <p><input type="submit" value="Log in">
</form>
</body>
</html>
<?
  }

  //
  // Function that returns HTML page showing that
  // the user with the $currentLoginName is logged on
  function logged_on_page($currentLoginName)
  {
    // Generate the page that shows the user
    // is already authenticated and authorized
?>
<!DOCTYPE HTML PUBLIC
    "-//W3C//DTD HTML 4.0 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd" >
<html>
<head><title>Login</title></head>
<body>
<h2>Winestore</h2>
<h2>You are currently logged in as
    <?=$currentLoginName ?></h2>
<a href="example.9-10.php">Logout</a>
</body>
</html>
<?
  }

  // Main
  session_start( );

  // Check if we have established a session
  if (isset($HTTP_SESSION_VARS["authenticatedUser"]))
  {
    // There is a user logged on
    logged_on_page(
      $HTTP_SESSION_VARS["authenticatedUser"]);
  }
  else
  {
    // No session established, no POST variables
    // display the login form + any error message
    login_page($HTTP_SESSION_VARS["loginMessage"]);
    session_destroy( );
  }
?>
It is important that the script test the associative array holding the session variable $HTTP_SESSION_VARS["authenticatedUser"] rather than the global variable $authenticatedUser. Because of the default order in which PHP initializes global variables from GET, POST, and session variables, a user can override the value of $authenticatedUser simply by defining a GET or POST variable in the request. We discussed security problems with PHP variable initialization in Chapter 5.
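The risk can be illustrated without a web server. The sketch below only simulates the way request data ends up in global variables; the function name and the forged value are made up for the demonstration.

```php
<?php
// Simulation only: mimic how global variables are populated from
// request data. Later sources overwrite earlier ones of the same name.
function simulateGlobals($getVars, $postVars)
{
    return array_merge($getVars, $postVars);
}

// A forged request such as example.9-8.php?authenticatedUser=admin
$globals = simulateGlobals(array("authenticatedUser" => "admin"),
                           array());

// No one has logged in, so the session array is empty
$sessionVars = array();

// Testing the global is fooled by the forged GET variable ...
$trustingGlobal = isset($globals["authenticatedUser"]);

// ... while testing the session array is not
$trustingSession = isset($sessionVars["authenticatedUser"]);
```

Here $trustingGlobal is true although nobody has logged in, while $trustingSession is false, which is why the login script tests the session array.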
When the login <form> is submitted, the POST variables are processed by the authentication script shown in Example 9-9. The authentication is performed by passing a handle to a connected MySQL server, the username, and the password to the function authenticateUser( ). The function executes a query to find the user row with the same username and encrypted password. As with the code in Example 9-7, we use the first two characters from the username as the salt string to the crypt( ) function.
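The salted comparison can be tried on its own, without the database lookup. In this sketch the helper name, the username "dave", and the passwords are hypothetical; it assumes the first two characters of the username are valid salt characters (letters, digits, "." or "/").

```php
<?php
// Sketch of the salted comparison, without the database query.
// A two-character salt selects the traditional DES-based crypt( ).
function encryptWithUsernameSalt($username, $password)
{
    $salt = substr($username, 0, 2);
    return crypt($password, $salt);
}

// Value that would be stored in the password column at sign-up
$stored = encryptWithUsernameSalt("dave", "secret");

// At login, the same username and password reproduce the stored value
$matches = (encryptWithUsernameSalt("dave", "secret") === $stored);

// A wrong password produces a different string
$rejects = (encryptWithUsernameSalt("dave", "guess") !== $stored);
```

Because the salt is derived from the username, the script can recompute the encrypted password before it queries the database.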
The Boolean control variable $authenticated is set to the return value of the authenticateUser( ) function. If $authenticated is true, the username is registered as the $authenticatedUser session variable, and the IP address of the client machine from which the request originated is registered as the $loginIpAddress session variable.
If the authentication fails and $authenticated is set to false, the $loginMessage session variable is registered containing the appropriate message to display on the login <form> as shown in Figure 9-3. In Example 9-9 we always relocate back to the login page, keeping the code reasonably simple. An alternative would be to relocate back to a customer welcome page when authentication succeeds and relocate back to the login page only when authentication fails.
<?php
  include 'db.inc';
  include 'error.inc';

  function authenticateUser($connection,
                            $username,
                            $password)
  {
    // Test that the username and password
    // are both set and return false if not
    if (!isset($username) || !isset($password))
      return false;

    // Get the two character salt from the username
    $salt = substr($username, 0, 2);

    // Encrypt the password
    $crypted_password = crypt($password, $salt);

    // Formulate the SQL query to find the user
    $query = "SELECT password FROM users
              WHERE user_name = '$username'
              AND password = '$crypted_password'";

    // Execute the query
    $result = @ mysql_query ($query, $connection)
      or showerror( );

    // exactly one row? then we have found the user
    if (mysql_num_rows($result) != 1)
      return false;
    else
      return true;
  }

  // Main ----------
  session_start( );

  $authenticated = false;

  // Clean the data collected from the user
  $appUsername =
    clean($HTTP_POST_VARS["formUsername"], 10);
  $appPassword =
    clean($HTTP_POST_VARS["formPassword"], 15);

  // Connect to the MySQL server
  $connection = @ mysql_connect($hostname,
                                $username,
                                $password)
    or die("Cannot connect");

  if (!mysql_select_db($databaseName, $connection))
    showerror( );

  $authenticated = authenticateUser($connection,
                                    $appUsername,
                                    $appPassword);

  if ($authenticated == true)
  {
    // Register the customer id
    session_register("authenticatedUser");
    $authenticatedUser = $appUsername;

    // Register the remote IP address
    session_register("loginIpAddress");
    $loginIpAddress = $REMOTE_ADDR;
  }
  else
  {
    // The authentication failed
    session_register("loginMessage");
    $loginMessage =
      "Could not connect to the winestore " .
      "database as \"$appUsername\"";
  }

  // Relocate back to the login page
  header("Location: example.9-8.php");
?>
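Example 9-9 always relocates to the login page. The alternative of sending successful logins to a welcome page could be sketched as follows; the helper name and the page name welcome.php are assumptions, since no such page is defined in this chapter.

```php
<?php
// Hypothetical alternative to always relocating to the login page.
// "welcome.php" is an assumed name for a customer welcome page.
function pageAfterLogin($authenticated)
{
    return $authenticated ? "welcome.php" : "example.9-8.php";
}
```

The final line of the authentication script would then become header("Location: " . pageAfterLogin($authenticated));.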
A separate script is called when a user logs out of the application. Example 9-10 shows the script that unregisters the $authenticatedUser session variable, registers the $loginMessage variable containing the appropriate message, and relocates back to the login script. The login script checks if the $loginMessage session variable is registered and displays the message that the user has logged out.
<?php
  session_start( );

  $appUsername = $HTTP_SESSION_VARS["authenticatedUser"];

  $loginMessage = "User \"$appUsername\" has logged out";
  session_register("loginMessage");

  session_unregister("authenticatedUser");

  // Relocate back to the login page
  header("Location: example.9-8.php");
?>
The scripts shown in Example 9-8, Example 9-9, and Example 9-10 form a framework that manages the login and logout functions and sets up the authentication control session variables. Scripts that require authorization need to check the session variables before they generate any output.
The authorization code that checks the authentication control session variables, shown in Example 9-11, can be written to a separate file and included with each protected page using the include directive. This saves having to rewrite the code for each page that requires authorization.
Example 9-11 begins by initializing the session and calculating two Boolean flags. The first flag, $notAuthenticated, is set to true if the session variable $authenticatedUser isn't set. The second flag, $notLoginIp, is set to true only if the session variable $loginIpAddress is set and its value differs from the IP address of the client that sent this request. The IP address of the client that sent the request is available to scripts in the server environment variable $REMOTE_ADDR. Unlike with environment variables, PHP doesn't overwrite $REMOTE_ADDR with a GET or POST variable of the same name.
Both the $notAuthenticated flag and the $notLoginIp flag are tested, and if either is true, an appropriate $loginMessage is set and registered with the session, and then the Location: header field is sent with the HTTP response to relocate the browser back to the login script. The two cases are separated, because the script might be enhanced to record more information about the possible hijack attempt and even to destroy the session.
<?php
  session_start( );

  $loginScript = "example.9-8.php";

  // Set a boolean flag to true if
  // no user has authenticated
  $notAuthenticated =
    !isset($HTTP_SESSION_VARS["authenticatedUser"]);

  // Set a boolean flag to true if this request
  // did not originate from the same IP address
  // as the one that created this session
  $notLoginIp =
    isset($HTTP_SESSION_VARS["loginIpAddress"])
    && ($HTTP_SESSION_VARS["loginIpAddress"] !=
        $REMOTE_ADDR);

  // Check that the two flags are false
  if ($notAuthenticated)
  {
    // The request does not identify a session
    session_register("loginMessage");

    $loginMessage =
      "You have not been authorized to access the " .
      "URL $REQUEST_URI";

    // Relocate back to the login page
    header("Location: " . $loginScript);
    exit;
  }
  else if ($notLoginIp)
  {
    // The request did not originate from the machine
    // that was used to create the session.
    // THIS IS POSSIBLY A SESSION HIJACK ATTEMPT
    session_register("loginMessage");

    $loginMessage =
      "You have not been authorized to access the " .
      "URL $REQUEST_URI from the address $REMOTE_ADDR";

    // Relocate back to the login page
    header("Location: " . $loginScript);
    exit;
  }
?>
To use the code developed in Example 9-11 to protect a page, we only need to include the file containing the code. If we saved Example 9-11 to auth.inc, protecting a page is easy:
<?php include("auth.inc"); ?>
<!DOCTYPE HTML PUBLIC
    "-//W3C//DTD HTML 4.0 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd" >
<html>
...
<h2>Your Credit Card details</h2>
...
<p><a href="example.9-10.php">Logout</a>
...
</html>
As discussed in Chapter 4, including files with the .inc extension presents a security problem. If the user requests the include file, the source of the include file is shown in the browser.
There are three ways to address this problem:
Store the include files outside the document tree of the Apache web server installation. For example, store the include files in the directory /usr/local/include/php and use the complete path in the include directive.
Use the extension .php instead of .inc. In this case, the include file is interpreted by the PHP script engine and produces no output because it contains no main body.
Configure Apache so that files with the extension .inc can't be retrieved.
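For the third option, a configuration fragment along the following lines is one way to block such requests. It is a sketch that uses the access-control directives of Apache 1.3 and 2.0; Apache 2.4 replaces the Order and Deny lines with Require all denied.

```apache
# Refuse any request for a file whose name ends in .inc
<Files ~ "\.inc$">
    Order allow,deny
    Deny from all
</Files>
```

PHP scripts can still include the files, because include reads them from the filesystem rather than requesting them through the web server.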
Copyright © 2003 O'Reilly & Associates. All rights reserved.