
Unix Power Tools

32.4. Using Metacharacters in Regular Expressions


There are three important parts to a regular expression:

Anchors
Specify the position of the pattern in relation to a line of text.

Character sets
Match one or more characters in a single position.

Modifiers
Specify how many times the previous character set is repeated.

The following regular expression demonstrates all three parts:

^#*

The caret (^) is an anchor that indicates the beginning of the line. The hash mark is a simple character set that matches the single character #. The asterisk (*) is a modifier. In a regular expression, it specifies that the previous character set can appear any number of times, including zero. As you will see shortly, this is a useless regular expression (except for demonstrating the syntax!).
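One way to see why ^#* is not useful is to count the lines it matches. The sketch below is only an illustration: the sample text and the variable names are made up for this example, and it assumes a POSIX-style grep. Because the asterisk allows zero repetitions, ^#* is satisfied by the start of any line, while ^# insists on at least one hash mark.

```shell
# Hypothetical sample input: two lines that start with '#', two that do not.
sample='# a comment
## another comment
plain text
  # indented hash'

# ^#* allows zero '#' characters after the start of the line,
# so it matches every line, whatever the line begins with.
all_count=$(printf '%s\n' "$sample" | grep -c '^#*')

# ^# requires one '#' in the first column, so only the first two lines match.
hash_count=$(printf '%s\n' "$sample" | grep -c '^#')

echo "lines matching ^#*: $all_count"
echo "lines matching ^#:  $hash_count"
```

With this sample, the first count is 4 (every line) and the second is 2.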

There are two main types of regular expressions: simple (also known as basic) regular expressions and extended regular expressions. (As we'll see in the next dozen articles, the boundaries between the two types have become blurred as regular expressions have evolved.) A few utilities, like awk and egrep, use extended regular expressions. Most use simple regular expressions. From now on, when I talk about a "regular expression" (without specifying simple or extended), I am describing a feature common to both types. With modern tools, though, you'll often find that extended regular expressions are the rule rather than the exception; it depends on who wrote your version of the tool, when they wrote it, and whether supporting extended regular expressions seemed worth the trouble.

[The situation is complicated by the fact that simple regular expressions have evolved over time, so there are versions of "simple regular expressions" that support extensions missing from extended regular expressions! Bruce explains the incompatibility at the end of Section 32.15. -- TOR]
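A small experiment can show the difference between the two types on your own system. The sketch below is an illustration under assumptions, not a definitive test: the sample text and variable names are invented, and it uses grep -E, the POSIX spelling of egrep. Under the POSIX definitions, a plus sign is an ordinary character in a simple regular expression and a "one or more" modifier in an extended one; as the note above warns, some versions of a tool may behave differently.

```shell
# Hypothetical sample input.
sample='ab
abbb
ab+c
ac'

# Simple (basic) syntax: '+' is an ordinary character,
# so this looks for the literal text "ab+".
basic=$(printf '%s\n' "$sample" | grep 'ab+')

# Extended syntax: '+' means "one or more of the previous item",
# so this matches an 'a' followed by one or more 'b' characters.
extended=$(printf '%s\n' "$sample" | grep -E 'ab+')

printf 'basic:\n%s\n' "$basic"
printf 'extended:\n%s\n' "$extended"
```

On a POSIX-conforming grep, the basic search should print only ab+c, while the extended search should print ab, abbb, and ab+c.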

The next eleven articles cover metacharacters and regular expressions.

-- BB




Copyright © 2003 O'Reilly & Associates. All rights reserved.