Learning Perl, 3rd Edition

6.2. Input from the Diamond Operator

Another way to read input is with the diamond[143] operator: <>. This is useful for making programs that work like standard Unix[144] utilities, with respect to the invocation arguments (which we'll see in a moment). If you want to make a Perl program that can be used like the utilities cat, sed, awk, sort, grep, lpr, and many others, the diamond operator will be your friend. If you want to make anything else, the diamond operator probably won't help.

[143]The diamond operator was named by Larry's daughter, Heidi, when Randal went over to Larry's house one day to show off the new training materials he'd been writing, and complained that there was no spoken name for "that thing". Larry didn't have a name for it, either. Heidi (eight years old at the time) quickly chimed in, "That's a diamond, Daddy." So the name stuck. Thanks, Heidi!

[144]But not just on Unix systems. Many other systems have adopted this way of using invocation arguments.

The invocation arguments to a program are normally a number of "words" on the command line after the name of the program.[145] In this case, they give the names of a number of files to be processed in sequence:

[145]Whenever a program is started, it has a list of zero or more invocation arguments, supplied by whatever program is starting it. Often this is the shell, which makes up the list depending upon what you type on the command line. But we'll see in a later chapter that you can invoke a program with pretty much any strings as the invocation arguments. Because they often come from the shell's command line, they are sometimes called "command-line arguments" as well.

$ ./my_program fred barney betty

That command runs the program my_program (which will be found in the current directory) and tells it to process file fred, followed by file barney, followed by file betty.

If you give no invocation arguments, the program should process the standard input stream. Or, as a special case, if you give just a hyphen as one of the arguments, that means standard input as well.[146] So, if the invocation arguments had been fred - betty, that would have meant that the program should process file fred, followed by the standard input stream, followed by file betty.

[146]Here's a possibly unfamiliar Unix fact: most of those standard utilities, like cat and sed, use this same convention, where a hyphen stands for the standard input stream.
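
To make the convention concrete, here is a small sketch that merely describes, in order, where the input would come from for a given list of invocation arguments. The subroutine name input_sources is our own invention for illustration; it models the rule stated above and is not how Perl implements the diamond internally.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl): list, in order, the input
# sources implied by a set of invocation arguments.
sub input_sources {
    my @args = @_;

    # No invocation arguments at all: use the standard input stream.
    return ('standard input') unless @args;

    # A lone hyphen means standard input; anything else names a file.
    return map { $_ eq '-' ? 'standard input' : "file $_" } @args;
}

# Describe the sources for whatever arguments this program was given.
print "$_\n" for input_sources(@ARGV);
```

Run as ./my_program fred - betty, this should print "file fred", "standard input", and "file betty", one per line.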

The benefit of making your programs work like this is that you may choose where the program gets its input at run time; for example, you won't have to rewrite the program to use it in a pipeline (which we'll discuss more later). Larry put this feature into Perl because he wanted to make it easy for you to write your own programs that work like standard Unix utilities -- even on non-Unix machines. Actually, he did it so he could make his own programs work like standard Unix utilities; since some vendors' utilities don't work just like others', Larry could make his own utilities, deploy them on a number of machines, and know that they'd all have the same behavior. Of course, this meant porting Perl to every machine he could find.

The diamond operator is actually a special kind of line-input operator. But instead of getting the input from the keyboard, it comes from the user's choice of input:[147]

[147]Which may or may not include getting input from the keyboard.

while (defined($line = <>)) {
  chomp($line);
  print "It was $line that I saw!\n";
}

So, if we run this program with the invocation arguments fred, barney, and betty, it will say something like: "It was [a line from file fred] that I saw!", "It was [another line from file fred] that I saw!", on and on until it reaches the end of file fred. Then, it will automatically go on to file barney, printing out one line after another, and then on to file betty. Note that there's no break when we go from one file to another; when you use the diamond, it's as if the input files have been merged into one big file.[148] The diamond will return undef (and we'll drop out of the while loop) only at the end of all of the input.

[148]If it matters to you, or even if it doesn't, the current file's name is kept in Perl's special variable $ARGV. This name may be "-" instead of a real filename if the input is coming from the standard input stream, though.
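
As a rough illustration of $ARGV, the sketch below tags each line with the name of the file it came from. The subroutine name tagged_lines is hypothetical, and the sketch sets the @ARGV array (the list of invocation arguments that the diamond consults) by hand so that it can be called on any list of filenames.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read every line of the named files through the
# diamond operator, tagging each line with the file it came from.
sub tagged_lines {
    my @files = @_;
    local @ARGV = @files;    # the diamond takes its file list from @ARGV
    my @tagged;
    while (defined(my $line = <>)) {
        chomp($line);
        # $ARGV holds the name of the file currently being read.
        push @tagged, "$ARGV: $line";
    }
    return @tagged;
}

# Only run when filenames were given, so the sketch never waits on
# the standard input stream.
if (@ARGV) {
    print "$_\n" for tagged_lines(@ARGV);
}
```

Even though the lines still arrive as one continuous stream, checking $ARGV on each line is enough to tell where one file stops and the next begins.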

In fact, since this is just a special kind of line-input operator, we may use the same shortcut we saw earlier, to read the input into $_ by default:

while (<>) {
  chomp;
  print "It was $_ that I saw!\n";
}

This works like the loop above, but with less typing. And you may have noticed that we're using the default for chomp; without an argument, chomp will work on $_. Every little bit of saved typing helps!

Since the diamond operator is generally being used to process all of the input, it's typically a mistake to use it in more than one place in your program. If you find yourself putting two diamonds into the same program, especially using the second diamond inside the while loop that is reading from the first one, it's almost certainly not going to do what you would like.[149] In our experience, when beginners put a second diamond into a program, they meant to use $_ instead. Remember, the diamond operator reads the input, but the input itself is (generally, by default) found in $_.

[149]If you re-initialize @ARGV before using the second diamond, then you're on solid ground. We'll see @ARGV in the next section.
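
For the curious, here is a minimal sketch of the one safe way to use two diamonds: finish the first one completely, then re-initialize @ARGV before starting the second. The subroutine name number_lines and its "n/total" output format are assumptions made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: make two passes over the same files, first to
# count the lines, then to label each line as "n/total: text".
sub number_lines {
    my @files = @_;

    # First diamond: read everything, just to count the lines.
    local @ARGV = @files;
    my $total = 0;
    while (defined(my $line = <>)) {
        $total++;
    }

    # The first diamond has used up @ARGV, so re-initialize it
    # before the second diamond.
    @ARGV = @files;
    my @labeled;
    my $n = 0;
    while (defined(my $line = <>)) {
        chomp($line);
        $n++;
        push @labeled, "$n/$total: $line";
    }
    return @labeled;
}

# Only run when filenames were given, so the sketch never waits on
# the standard input stream.
if (@ARGV) {
    print "$_\n" for number_lines(@ARGV);
}
```

Note that the two loops run one after the other. Nesting the second diamond inside the first loop would still be the mistake described above.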

If the diamond operator can't open one of the files and read from it, it'll print an allegedly helpful diagnostic message, such as:

can't open wimla: No such file or directory

The diamond operator will then go on to the next file automatically, much like what you'd expect from cat or another standard utility.
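
The sketch below shows the skip-and-continue behavior in action. It assumes that the diagnostic is delivered as an ordinary Perl warning (which is our understanding of current versions of Perl), so that a $SIG{__WARN__} handler can collect it instead of letting it go to standard error. The subroutine name read_with_diagnostics is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read the named files with the diamond operator,
# returning the lines that were read and any diagnostics it issued.
sub read_with_diagnostics {
    my @files = @_;
    my (@lines, @complaints);

    # Assumption: the "can't open" diagnostic is a warning, so this
    # handler sees it instead of it being printed to standard error.
    local $SIG{__WARN__} = sub { push @complaints, $_[0] };

    local @ARGV = @files;
    while (defined(my $line = <>)) {
        chomp($line);
        push @lines, $line;
    }
    return (\@lines, \@complaints);
}

# Only run when filenames were given, so the sketch never waits on
# the standard input stream.
if (@ARGV) {
    my ($lines, $complaints) = read_with_diagnostics(@ARGV);
    print "read: $_\n" for @$lines;
    print "diagnostic: $_" for @$complaints;
}
```

If the middle file of three is missing, the lines of the first and third files should still all be read, with a single diagnostic naming the missing file.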




Copyright © 2002 O'Reilly & Associates. All rights reserved.