Perl (Managing and Using MySQL, 2nd Edition)

9.1. Introduction to DBI

The Perl DBI interface is fully described in Programming the Perl DBI by Alligator Descartes and Tim Bunce (O'Reilly). The manpage for DBI is also a very informative reference. This chapter should make you competent, though not expert.

Before launching into a CGI application, let's try a couple trivial Perl programs you can run from the command line. These were tested on Unix and Linux systems, but should work the same way on other operating systems as well. After looking at the code, we'll show you how to provide the actual database and discuss some optimizations.

9.1.1. Basic Perl Example

Our first program is about as simple as Perl DBI can get. Example 9-1 is a program you can run at the command line to query a database and display the results. We'll describe how it works in this section, but you won't actually be able to run it until you've done the setup described in the next section.

Example 9-1. Submitting a query to MySQL using Perl DBI

#!/usr/bin/perl -w
  
use strict;
use DBI;
  
my $server = 'localhost'
my $db = 'Books';
my $username = 'andy' ;
my $password = 'ALpswd' ;
  
my $dbh = DBI->connect("dbi:mysql:$db:$server", $username, $password);
  
my $query = "SELECT * FROM Titles"; 
my $sth = $dbh->prepare($query);
$sth->execute( );
  
while (my $row = $sth->fetchrow_arrayref) {
    print join("\t", @$row),"\n";
}
  
$dbh->disconnect;

The basic sequence of events (which will be the same for nearly any Perl DBI application) is:

Connect to your database.
Build a query.
Build a statement handle.
Execute the statement handle.
Retrieve and process the results.

Now for a look at the code, line by line.

As with all Perl modules, you must issue a use statement to get access to the DBI module:

use DBI;

All interactions between Perl and MySQL are conducted through what is known as a database handle. This is an object that implements all the methods used to communicate with the database. You may have as many database handles open at once as you wish, limited only by your system resources.

The first DBI method you invoke, therefore, is a connect( ) method that creates the database handle. It takes three arguments, the first of which consists of several parts.

A data source, containing information that varies depending on the database engine you use. For MySQL, this argument includes the following information, separated by colons:
1. The string dbi.
2. The driver name, mysql.
3. The database name, which is Books in our example.
4. The host, which is localhost in our example. The local host is the most common choice, because most sites put the client application (such as a CGI program on a web server) on the same system as the MySQL server.
A username, which is andy in our example. This is actually an abbreviation of andy@localhost. MySQL assumes localhost as the hostname if you don't specify one.
A password, which is ALpswd in our example.

The call allows a fourth optional argument that can be used to change the default attributes of the connection. We'll show a bit of its use later.

The $dbh variable is returned by the DBI module and forms our handle into the MySQL server. The variable can be named anything you want, but $dbh is traditional. The handle can be used to to issue queries, updates, or any SQL statements we want. However, the SQL must be wrapped in Perl code.

Our example reads all the titles from the Titles table. We create the SQL query in the $query variable, wrapped up as a string in Perl. Then we prepare a statement handle $sth from the query. Finally, we execute the statement.

Note that prepare( ) is a method provided by $dbh, the database or connection handle, while execute( ) is a method provided by $sth, the statement handle. In other words, the database handle prepares a statement, which then executes itself.

The prepare( ) method takes an SQL query and stores it (either locally or on the database server) until execution. On database servers that store the query on the database server itself, one can perform operations on the query before executing it. However, MySQL does not support that ability yet; it simply stores prepared queries within the database driver until execution. The execute( ) method causes the query to be sent to the database server and executed.

The result of executing a query depends on the type of query. If the query is a non-SELECT query that returns no data (such as INSERT, UPDATE, or DELETE), the execute() method returns the number of rows that were affected by the query. That is, for an INSERT query that inserts one row of data, the execute( ) method will return 1 if the query is successful.

For SELECT queries, the execute( ) method simply returns a true value if the query is successful and a false value if there is an error. The data returned from the query is then available using various methods provided by the statement handle.

So we can now retrieve all the results of our query by issuing calls to the statement handle. We are finished with SQL and with our database, in this sample program. We use DBI methods and Perl arrays to manipulate the data from now on.

Most applications retrieve rows from a statement handle one at a time. The statement handle provides a fetchrow_arrayref( ) method to do this. A typical application issues fetchrow_arrayref( ) in a loop and processes each row in the body of the loop. For our first, simple program, we'll just display the data returned.

The fetchrow_arrayref( ) method graciously acts like a typical Perl function, returning undef when there are no more rows to fetch. Therefore, we can write a while loop that terminates at an undefined return value.

Because each row consists of multiple fields, the fetchrow_arrayref( ) method returns its results as a reference to an array. Each element of the array is a field in the row. We use @$row syntax to retrieve the array of fields from the reference $row.

Now that all the data has been printed, we can close the database handle. For our trivial program, this is not necessary, because Perl will automatically destroy the handle when the application exits. However, since cleaning up one's resources is necessarily in some environments (such as when using the Apache mod_perl module), it is a good habit to get into.

Before we run the program, we have to set up the andy user account and the database itself, which we'll do in the next section.

9.1.2. Setting Up the Database and Program

As raw material for our Perl programs, we'll create a simple database of information about O'Reilly & Associates. It's a very flat database with all its information in a single table, but it's sufficient to convey all the concepts you need for accessing MySQL with Perl.

First, you need a file of SQL commands that create the table and insert some data. The file is available on the O'Reilly online example site under the name books.sql, and it starts out with the commands in Example 9-2. We can save some room by declaring large VARCHAR fields for most text; only the isbn field has a known, fixed length.

Example 9-2. Beginning of SQL commands to load database

DROP TABLE IF EXISTS Titles;

CREATE TABLE Titles (
  isbn char(10) NOT NULL default '',
  title varchar(255) default NULL,
  publisher varchar(255) default NULL,
  author varchar(255) default NULL,
  pages int(11) default NULL,
  pubdate int(11) default NULL,
  PRIMARY KEY  (isbn)
) TYPE=MyISAM;

INSERT INTO Titles VALUES ('0596000448','Designing Active Server Pages','O\'Reilly & 
Associates','Scott Mitchell',376,967795200);
INSERT INTO Titles VALUES ('1565924460','Developing Asp Components','O\'Reilly & 
Associates','Shelley Powers',490,930816000);
INSERT INTO Titles VALUES ('156592567X','Writing Apache Modules with Perl and C: The 
Apache API and mod_perl (O\'Reilly Nutshell)','O\'Reilly & Associates','Lincoln Stein, 
Doug MacEachern, Linda Mui (Editor)',746,920275200);
INSERT INTO Titles VALUES ('1565927060','Apache : Pocket Reference','O\'Reilly & 
Associates','Andrew Ford, Gigi Estabrook',107,959846400);
...

Once you have the file, make sure you have been granted CREATE and INSERT privileges by MySQL, as described in Chapter 6. The MySQL root user can create an account for andy, assign him a password, and grant him the privileges he needs all at once in the following command:

mysql> GRANT ALL on Books.* TO andy@localhost identified by 'ALpaswd';

Andy (or any user knowing the password) can now execute the following commands at the shell:

$ mysqladmin -u andy -p create Books
$ mysql -u andy Books -p < books.sql

The -p option prompts for the password ALpswd we assigned when granting privileges. After you successfully execute these commands, the database will be loaded with data and ready for some Perl applications.

Before running the program, you must make sure that Perl is installed on your system (virtually every modern Unix system has it) and that the DBI module is installed. You can check whether a module is installed through a Perl one-liner such as:

$ perl -MDBI -e 1;

If you see no output, the module is loaded and can be used in your programs. If Perl prints the message Can't locate DBI.pm in @INC..., it means the module has to be downloaded and installed. Instructions on downloading and installing Perl modules are beyond the scope of this book, but all the modules described in this chapter (and most other useful, stable Perl modules) are available from the Comprehensive Perl Archive Network (CPAN) at http://www.cpan.org or one of its many mirrors that have sprung up all over the world.

The MySQL username and password for andy are included in our sample program, so it can be run by anybody on the local system once the database is created. The output follows. We have broken some lines and inserted backslashes to fit the page.

$ perl perl1.pl
0596000448      Designing Active Server Pages   O'Reilly & Associates \
  Scott Mitchell  376     967795200
1565924460      Developing Asp Components       O'Reilly & Associates \
  Shelley Powers  490     930816000
156592567X      Writing Apache Modules with Perl and C: \
  The Apache API and mod_perl (O'Reilly Nutshell) O'Reilly & Associates \
  Lincoln Stein, Doug MacEachern, Linda Mui (Editor)     746    920275200
1565927060      Apache : Pocket Reference       O'Reilly & Associates \
  Andrew Ford, Gigi Estabrook     107     959846400
...

9.1.3. Error Handling and Attributes

Because of typos in their programs or problems in setting up their databases, most programmers turn up errors when they first run an example. We'll look at some ways to deal with errors here. Luckily, the DBI module provides useful diagnostic messages by default.

The connect() method offers an optional argument that controls the attributes of the database handle. Attributes include whether the statement prints an error message in case of failure, whether it throws a potentially fatal exception in case of failure, and many other things. We will not delve into the many subtleties here, such as the varied places where attributes can be set or the ones that statement handles inherit from the database handle. Instead, we will show a couple uses of attributes for error handling.

Suppose you want DBI to terminate your program if it cannot make a connection or if any later DBI method fails. You can put this behavior in your connect( ) command by adding a fourth argument that sets the RaiseError attribute, as follows:

my $dbh = DBI->connect("dbi:mysql:$db:$server", $username, $password,
                       { RaiseError => 1 } );

If you want to do something fancier with attributes, such as set them at runtime, here is the syntax for specifying the attribute argument (it is a reference to a hash):

my %attr = ( RaiseError => 1 );
my $dbh = DBI->connect("dbi:mysql:$db:$server", $username, $password,
                       \%attr);

In either case, running a program with an incorrect password produces the following output (we inserted a line break so the message would fit on the page):

$ perl perl_raise.pl
DBI->connect(Books:localhost) failed: Access denied for user: \
  'andy@localhost' (Using password: YES) at perl_raise.pl line 11
$

The program terminates automatically.

In production environments (and particularly CGI programs) you don't want the program to terminate. If RaiseError is set, you could run your statement within a Perl eval block—a traditional way to trap an error. But by default, the RaiseError attribute is 0 (turned off) and another attribute, PrintError, is 1 (turned on). This attribute makes DBI print an error message and continue. Internally, RaiseError calls the Perl die statement while PrintError calls the warn statement.

Let's look at an alternative way to handle errors. Suppose you want to trap the connection error in your application gracefully and try another course of action. We'll turn off both error attributes and catch the error in typical Perl fashion.

my %attr = ( PrintError => 0, RaiseError => 0 );
my $dbh = DBI->connect("dbi:mysql:$db:$server", $username, $password,
                       \%attr)
or do {
  warn "Cannot connect to dbi:mysql:$db:$server;
        trying another method" ;
  ...
  }

The connect method returns undef in case of error. So we use an or clause to run some recovery code. Running the program with incorrect connection arguments produces the following output:

$ perl perl_check.pl
Cannot connect to dbi:mysql:Books:localhost; trying another method \
at perl_check.pl line 13.

And the program continues within the do loop.

9.1.4. Introducing Bind Variables and Optimizations

As we indicated in Section 9.1.1, creating a statement in DBI is a two-step process: first you create a query, then you create a statement handle from the query. One reason for this division is a complication in SQL called bind variables, which are a kind of placeholder. Instead of specifying every field and value literally in a query, such as:

SELECT author FROM Titles WHERE ISBN = '156592567X'

you can create a generalized query with a question mark as a bind variable:

SELECT author FROM Titles WHERE ISBN = ?

and plug in 156592567X or any other ISBN you want before executing the statement.

Databases operate much more efficiently if you create a general statement containing bind variables and issue repeated queries or updates with different values for the bind variables. This is because the database compiles the SQL statement into an internal format when you create the statement and reuses the internal version on each query or update. This compilation is shown as the second and third of the three phases, parsing and optimization, in Figure 5-1. Perl DBI performs compilation in the prepare( ) statement. If you use bind variables, you must plug in the values between the prepare( ) and the execute( ) statements.

We'll show a trivial example using a bind variable in this section. More sophisticated examples appear later in the chapter.

The new example alters the previous example to use a WHERE clause in the SELECT statement so that we can limit results to a particular publication date. Additionally, we request just two columns (ISBN and title) instead of all columns.

my $query = q{
    SELECT isbn,title FROM Titles 
        where author like ?
};

The q{} syntax is an alternative to using single quotes. The string ends with a question mark, which holds a place for a bind variable. We can plug in the actual value as an argument when we execute the statement:

$sth->execute('%Tim Bunce%'); #just books published in 2001

As Chapter 3 explained, the percent signs are used by SQL to match any text before or after the specified text.

Assume you had to issue the same query many times using different values for bind variables (during the generation of a daily report, for instance, which retrieves values for many different books or authors). You would prepare the statement with prepare( ), then run a loop that executes the statement repeatedly with different values.

Our amended program is shown in Example 9-3.

Example 9-3. Submitting a query to MySQL using Perl DBI and bind variables

#!/usr/bin/perl -w

use strict;
use DBI;

my $server = 'localhost';
my $db = 'Books';
my $username = 'andy' ;
my $password = 'ALpswd' ;

my $dbh = DBI->connect("dbi:mysql:$db:$server", $username, $password);

# The SQL contains a question mark to indicate a bind variable.
my $query = q{
    SELECT isbn,title FROM Titles 
        where author like ?
}; 

my $sth = $dbh->prepare($query);

# We pass an argument to bind the value
# '%Tim Bunce%' to our bind variable.
$sth->execute('%Tim Bunce%'); #just books published in 2001

while (my $row = $sth->fetchrow_arrayref) {
    print join("\t",@$row),"\n";
}

$dbh->disconnect;

When run, the output is:

$ perl perl2.pl
1565926994      Programming the Perl DBI

Another way to use bind variables is to put bind_param( ) calls between the prepare( ) and execute( ) calls:

my $sth = $dbh->prepare($query);
$sth->bind_param (1, '%Tim Bunce%');
$sth->execute( );

This bind_param( ) call sets the first parameter (our only one) to '%Tim Bunce%'. A third, optional argument allows you to specify the SQL type of the parameter, and we'll see that argument in use later in the chapter.

Another small optimization we could make to our program is to retrieve all the rows from the statement handle at once, instead of one at a time. The selectall_arrayref( ) retrieves the results as a reference to an array. Each element of the array is a single row, also packaged in a reference to an array. You can unpack the rows in a foreach loop as follows:

my $row;
my $result = $dbh->selectall_arrayref($sth);
foreach $row (@$result) {
    print join("\t", @$row),"\n";
}

The performance advantage is that your loop operates on a simple array, rather than having to call fetchrow_arrayref( ) on the statement handle on each iteration. You have to be careful when using this retrieval method, though, because if your data is enormous, it could put a strain on your program's memory use and even cause a crash. Make sure the results will fit into available memory.

If you choose to retrieve one row at a time, the fetchrow_arrayref( ) method is fastest. You can also use fetchrow_array( ) to return a simple array (instead of a reference to an array) or fetchrow_hash( ) to return a hash. The latter presents interesting possibilities, because you can then retrieve values by specifying the column name as a key. It is slow, however; if you want a hash, it is better to build it yourself from the results of fetchrow_arrayref( ).