Keeping a Small Memory Footprint (Practical mod_perl)

Since mod_perl processes tend to consume a lot of memory as the number of loaded modules and scripts grows during the child's lifetime, it's important to know how to keep memory usage down. Let's see what should be kept in mind when writing code that will be executed under mod_perl.

13.5.1. "Bloatware" Modules

Perl's IO:: modules are very convenient, but let's see what it costs to use them. The following command (Perl 5.6.1 on Linux) reveals that when we use IO we also load the IO::Handle, IO::Seekable, IO::File, IO::Pipe, IO::Socket, and IO::Dir modules. It also shows how big they are: wc(1) reports the number of lines in each of the loaded files:

panic% wc -l `perl -MIO -e 'print join("\n", sort values %INC, "")'`
  124 /usr/lib/perl5/5.6.1/Carp.pm
  602 /usr/lib/perl5/5.6.1/Class/Struct.pm
  456 /usr/lib/perl5/5.6.1/Cwd.pm
  313 /usr/lib/perl5/5.6.1/Exporter.pm
  225 /usr/lib/perl5/5.6.1/Exporter/Heavy.pm
   93 /usr/lib/perl5/5.6.1/File/Spec.pm
  458 /usr/lib/perl5/5.6.1/File/Spec/Unix.pm
  115 /usr/lib/perl5/5.6.1/File/stat.pm
  414 /usr/lib/perl5/5.6.1/IO/Socket/INET.pm
  143 /usr/lib/perl5/5.6.1/IO/Socket/UNIX.pm
   52 /usr/lib/perl5/5.6.1/SelectSaver.pm
  146 /usr/lib/perl5/5.6.1/Symbol.pm
  160 /usr/lib/perl5/5.6.1/Tie/Hash.pm
   92 /usr/lib/perl5/5.6.1/base.pm
 7525 /usr/lib/perl5/5.6.1/i386-linux/Config.pm
  276 /usr/lib/perl5/5.6.1/i386-linux/Errno.pm
  222 /usr/lib/perl5/5.6.1/i386-linux/Fcntl.pm
   47 /usr/lib/perl5/5.6.1/i386-linux/IO.pm
  239 /usr/lib/perl5/5.6.1/i386-linux/IO/Dir.pm
  169 /usr/lib/perl5/5.6.1/i386-linux/IO/File.pm
  612 /usr/lib/perl5/5.6.1/i386-linux/IO/Handle.pm
  252 /usr/lib/perl5/5.6.1/i386-linux/IO/Pipe.pm
  127 /usr/lib/perl5/5.6.1/i386-linux/IO/Seekable.pm
  428 /usr/lib/perl5/5.6.1/i386-linux/IO/Socket.pm
  453 /usr/lib/perl5/5.6.1/i386-linux/Socket.pm
  129 /usr/lib/perl5/5.6.1/i386-linux/XSLoader.pm
  117 /usr/lib/perl5/5.6.1/strict.pm
   83 /usr/lib/perl5/5.6.1/vars.pm
  419 /usr/lib/perl5/5.6.1/warnings.pm
   38 /usr/lib/perl5/5.6.1/warnings/register.pm
14529 total

About 14,500 lines of code! If you run a trace of this test code, you will see that it also puts a big load on the machine to actually load these modules, although this is mostly irrelevant if you preload the modules at server startup.

CGI.pm suffers from the same problem:

panic% wc -l `perl -MCGI -le 'print for values %INC'`
  313 /usr/lib/perl5/5.6.1/Exporter.pm
  124 /usr/lib/perl5/5.6.1/Carp.pm
  117 /usr/lib/perl5/5.6.1/strict.pm
   83 /usr/lib/perl5/5.6.1/vars.pm
   38 /usr/lib/perl5/5.6.1/warnings/register.pm
  419 /usr/lib/perl5/5.6.1/warnings.pm
  225 /usr/lib/perl5/5.6.1/Exporter/Heavy.pm
 1422 /usr/lib/perl5/5.6.1/overload.pm
  303 /usr/lib/perl5/5.6.1/CGI/Util.pm
 6695 /usr/lib/perl5/5.6.1/CGI.pm
  278 /usr/lib/perl5/5.6.1/constant.pm
10017 total

However, judging the bloat by the number of lines is misleading, since not all the code is used in most cases. Also remember that documentation might account for a significant chunk of the lines in every module.
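To get a rough idea of how much of a module is documentation rather than code, the line count can be repeated with POD, blank lines, and comment-only lines skipped. The following is only a sketch: count_code_lines( ) and the Demo sample text are hypothetical names introduced for illustration, and the line-based heuristic is an assumption rather than a real Perl parser (for example, it stops at __END__ or __DATA__ and does not understand here-documents).

```perl
use strict;
use warnings;

# Count lines that are neither blank, comment-only, nor POD.
# This is a rough line-based heuristic, not a Perl parser.
sub count_code_lines {
    my ($source) = @_;
    my ($count, $in_pod) = (0, 0);
    for my $line (split /\n/, $source) {
        # Anything after __END__ or __DATA__ is treated as non-code.
        last if $line =~ /^__(?:END|DATA)__\s*$/;
        if ($line =~ /^=cut\b/)    { $in_pod = 0; next }   # POD block ends
        if ($line =~ /^=[a-zA-Z]/) { $in_pod = 1; next }   # POD block starts
        next if $in_pod;
        next if $line =~ /^\s*(?:#.*)?$/;                  # blank or comment-only
        $count++;
    }
    return $count;
}

# Hypothetical module text used only to exercise the function.
my $sample = <<'END_SAMPLE';
package Demo;

=head1 NAME

Demo - an example

=cut

# a comment
sub hello { return "Hello" }

1;
END_SAMPLE

print count_code_lines($sample), "\n";   # prints 3
```

To apply this to the modules that were actually loaded, read each file listed in values %INC and pass its contents to the function.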

Since we can preload the code at server startup, we are mostly interested in the execution overhead and memory footprint. So let's look at the memory usage.

Example 13-12 is the perlbloat.pl script, which shows how much memory is acquired by Perl when you run some code. Now we can easily test the overhead of loading the modules in question.

Example 13-12. perlbloat.pl

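#!/usr/bin/perl -w

use GTop ( );

# Sample the total memory size of this process before doing anything.
my $gtop = GTop->new;
my $before = $gtop->proc_mem($$)->size;

for (@ARGV) {
    # If the argument is a module name, load it and import its defaults.
    if (eval "require $_") {
        eval { $_->import; };
    }
    # Otherwise treat the argument as Perl code and evaluate it.
    else {
        eval $_;
        die $@ if $@;
    }
}

# Sample the memory again and report the growth.
my $after = $gtop->proc_mem($$)->size;
print "@ARGV added " . GTop::size_string($after - $before) . "\n";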

The script simply samples the total memory use, then evaluates the code passed to it, samples the memory again, and prints the difference.
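If GTop is not installed, the same before-and-after sampling can be approximated on Linux by reading the process's own /proc entry. This is a sketch of a swapped-in technique, not the script above: it assumes a Linux-style /proc/self/status file with a VmSize line, and vm_size_kb( ) and memory_growth_kb( ) are hypothetical helper names.

```perl
use strict;
use warnings;

# Return this process's virtual memory size in KB, or undef if
# /proc/self/status is unavailable (assumption: Linux-style /proc).
sub vm_size_kb {
    open my $fh, '<', '/proc/self/status' or return;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmSize:\s+(\d+)\s+kB/;
    }
    return;
}

# Run a code reference and return how many KB the process grew by.
sub memory_growth_kb {
    my ($code) = @_;
    my $before = vm_size_kb();
    return unless defined $before;
    $code->();
    return vm_size_kb() - $before;
}

# Measure the cost of loading a module without importing anything.
my $growth = memory_growth_kb(sub { require POSIX });
print defined $growth
    ? "require POSIX added ${growth}k\n"
    : "no /proc/self/status available\n";
```

The numbers will differ from those reported by GTop and from run to run, so treat them as relative indications only.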

Now let's try to load IO:

panic% ./perlbloat.pl 'use IO;'
use IO; added  1.3M

"Only" 1.3 MB of overhead. Now let's load CGI.pm (v2.79) and compile its methods:

panic% ./perlbloat.pl 'use CGI; CGI->compile(":cgi")'
use CGI; CGI->compile(":cgi") added 784k

That's nearly 800 KB of extra memory per process.

Let's compare CGI.pm with its younger sibling, whose internals are implemented in C:

panic% ./perlbloat.pl 'use Apache::Request'
use Apache::Request added   36k

Only 36 KB this time. A significant difference, isn't it? We have compiled the :cgi group of the CGI.pm methods, because CGI.pm is written in such a way that the actual code compilation is deferred until some function is actually used. To make a fair comparison with Apache::Request, we compiled only the methods present in both.

If we compile :all CGI.pm methods, the memory bloat is much bigger:

panic% ./perlbloat.pl 'use CGI; CGI->compile(":all")'
use CGI; CGI->compile(":all") added  1.9M

The following numbers show memory sizes in KB (virtual and resident) for Perl 5.6.0 on four different operating systems. Three calls are made: without any modules, with only -MCGI, and with -MIO (never with both). The rows with -MCGI and -MIO are followed by the difference relative to raw Perl.

               OpenBSD       FreeBSD    RedHat Linux     Solaris
              vsz   rss     vsz   rss     vsz   rss    vsz    rss
  Raw Perl    736   772     832  1208    2412   980    2928  2272

  w/ CGI     1220  1464    1308  1828    2972  1768    3616  3232
  delta      +484  +692    +476  +620    +560  +788    +688  +960

  w/ IO      2292  2580    2456  3016    4080  2868    5384  4976
  delta     +1556 +1808   +1624 +1808   +1668 +1888   +2456 +2704

Which is more important: saving enough memory to allow the machine to serve a few extra concurrent clients, or using off-the-shelf modules that are proven and well understood? Debugging a reinvention of the wheel can cost a lot of development time, especially if each member of your team reinvents in a different way. In general, it is a lot cheaper to buy more memory or a bigger machine than it is to hire an extra programmer. So while it may be wise to avoid using a bloated module if you need only a few functions that you could easily code yourself, the place to look for real efficiency savings is in how you write your code.

13.5.2. Importing Symbols

Imported symbols act just like global variables; they can quickly add to memory usage. In addition to polluting the namespace, a process grows by the size of the space allocated for all the symbols it imports. The more you import (e.g., qw(:standard) versus qw(:all) with CGI.pm), the more memory will be used.

Let's say the overhead is of size Overhead. Now take the number of scripts in which you use the function (procedural) interface, and therefore import symbols, and call that Scripts. Finally, let's say that you have a number of processes equal to Processes.

You will need Overhead × Scripts × Processes of additional memory. Taking an insignificant Overhead of 10 KB and adding in 10 Scripts used across 30 Processes, we get 10 KB × 10 × 30 = 3 MB! The 10-KB overhead becomes a very significant one.
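The arithmetic can be written down as a small helper. This is only an illustration of the formula above; extra_memory_kb( ) is a hypothetical name, and the calculation assumes the worst case in which none of the overhead is shared between scripts or processes.

```perl
use strict;
use warnings;

# Extra memory in KB: per-script overhead x scripts x processes.
# Assumption: nothing is shared between scripts or processes.
sub extra_memory_kb {
    my ($overhead_kb, $scripts, $processes) = @_;
    return $overhead_kb * $scripts * $processes;
}

my $kb = extra_memory_kb(10, 10, 30);
printf "%d KB (about %.1f MB)\n", $kb, $kb / 1024;   # 3000 KB (about 2.9 MB)
```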

Let's assume that we need to use strtol( ) from the POSIX package. Under Perl 5.6.1 we get:

panic% ./perlbloat.pl 'use POSIX ( ); POSIX::strtol(__PACKAGE__, 16)'
use POSIX ( ) added  176k

panic% ./perlbloat.pl 'use POSIX; strtol(__PACKAGE__, 16)'
use POSIX added  712k

The first time we import no symbols, and the second time we import all the default symbols from POSIX. The difference is 536 KB worth of aliases. Now let's say 10 different Apache::Registry scripts 'use POSIX;' for strtol( ), and we have 30 mod_perl processes:

536 KB × 10 × 30 ≈ 160 MB

That is about 160 MB of extra memory. Of course, you can import only the symbols you need:

panic% ./perlbloat.pl 'use POSIX qw(strtol); strtol(__PACKAGE__, 16);'
use POSIX qw(strtol) added  344k

Still, importing just strtol( ) uses 168 KB more memory than importing nothing. Granted, POSIX is an extreme case; usually the overhead is much smaller for a single script, but it becomes significant if it occurs in many scripts executed by many processes.
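One way to see why the three variants differ is to count how many entries each import style adds to the importing package's symbol table. The sketch below is an illustration only: the Demo::* package names and symbol_count( ) are hypothetical, and it counts symbol-table entries, which is a proxy for the aliases being created, not a memory measurement.

```perl
use strict;
use warnings;

# Count the entries in a package's symbol table (stash).
sub symbol_count {
    my ($package) = @_;
    no strict 'refs';
    return scalar keys %{"${package}::"};
}

# Hypothetical packages, one per import style.
package Demo::None;
use POSIX ( );             # load the module, import nothing

package Demo::One;
use POSIX qw(strtol);      # import a single function

package Demo::All;
use POSIX;                 # import all the default symbols

package main;

# Print the number of symbols each package ended up with.
printf "%-10s %d symbols\n", $_, symbol_count($_)
    for qw(Demo::None Demo::One Demo::All);
```

The exact counts depend on the Perl and POSIX versions, but the package that uses the default import list should show far more entries than the other two.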

Here is another example, now using the widely deployed CGI.pm module. Let's compare CGI.pm's object-oriented and procedural interfaces. We'll use two scripts that generate the same output, the first (Example 13-13) using methods and the second (Example 13-14) using functions. The second script imports a few functions that are going to be used.

Example 13-13. cgi_oo.pl

use CGI ( );
my $q = CGI->new;
print $q->header;
print $q->b("Hello");

Example 13-14. cgi_proc.pl

use CGI qw(header b);
print header( );
print b("Hello");

After executing each script in single server mode (-X), we can see the results with the help of Apache::Status, as explained in Chapter 9.

Here are the results of the first script:

Totals: 1966 bytes | 27 OPs

handler 1514 bytes | 27 OPs
exit     116 bytes |  0 OPs

The results of the second script are:

Totals: 4710 bytes | 19 OPs

handler  1117 bytes | 19 OPs
basefont  120 bytes |  0 OPs
frameset  120 bytes |  0 OPs
caption   119 bytes |  0 OPs
applet    118 bytes |  0 OPs
script    118 bytes |  0 OPs
ilayer    118 bytes |  0 OPs
header    118 bytes |  0 OPs
strike    118 bytes |  0 OPs
layer     117 bytes |  0 OPs
table     117 bytes |  0 OPs
frame     117 bytes |  0 OPs
style     117 bytes |  0 OPs
Param     117 bytes |  0 OPs
small     117 bytes |  0 OPs
embed     117 bytes |  0 OPs
font      116 bytes |  0 OPs
span      116 bytes |  0 OPs
exit      116 bytes |  0 OPs
big       115 bytes |  0 OPs
div       115 bytes |  0 OPs
sup       115 bytes |  0 OPs
Sub       115 bytes |  0 OPs
TR        114 bytes |  0 OPs
td        114 bytes |  0 OPs
Tr        114 bytes |  0 OPs
th        114 bytes |  0 OPs
b         113 bytes |  0 OPs

As you see, the object-oriented script uses about 2 KB of memory while the procedural interface script uses about 5 KB.

Note that the above is correct if you didn't precompile all of CGI.pm's methods at server startup. If you did, the procedural interface in the second test will take up to 18 KB, not 5 KB. That's because the entire CGI.pm namespace is inherited, and it already has all its methods compiled, so it doesn't really matter whether you attempt to import only the symbols that you need. So if you have:

use CGI qw(-compile :all);

in the server startup script, having:

use CGI qw(header);

or:

use CGI qw(:all);

is essentially the same. All the symbols precompiled at startup will be imported, even if you request only one symbol. It seems like a bug, but it's just how CGI.pm works.

