The pattern space is a buffer that contains the current input line. There is also a set-aside buffer called the hold space. A small group of commands moves data between the two: the contents of the pattern space can be copied to the hold space, and the contents of the hold space can be copied back to the pattern space. The hold space is used for temporary storage, and that's it. Apart from that group of commands, nothing in sed can address the hold space or alter its contents.
The most frequent use of the hold space is to have it retain a duplicate of the current input line while you change the original in the pattern space. [It's also used as a way to do the "move" and "copy" commands that most editors have -- but which sed can't do directly because it's designed for editing a stream of input text line-by-line. -- GU] The commands that affect the hold space are:
Hold | h | Copy contents of pattern space to hold space, replacing previous.
Hold | H | Append newline, then append contents of pattern space, to hold space.
Get | g | Copy contents of hold space to pattern space, replacing previous.
Get | G | Append newline, then append contents of hold space, to pattern space.
Exchange | x | Swap contents of hold space and pattern space.
Each of these commands can take an address that specifies a single line or a range of lines. The hold commands (h, H) move data into the hold space and the get commands (g, G) move data from the hold space back into the pattern space. The difference between the lowercase and uppercase versions of the same command is that the lowercase command overwrites the contents of the target buffer, while the uppercase command appends to the existing contents after adding a newline.
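As a small sketch of how addresses combine with these commands, the following copies or moves the first line of the input to the end. The three-line input and the variable names (copied, moved) are made up for illustration.

```shell
# Copy line 1 to the end: save it while on line 1 (1h),
# then append the saved copy while on the last line ($G).
copied=$(printf 'alpha\nbeta\ngamma\n' | sed -e '1h' -e '$G')
printf '%s\n' "$copied"

# Move line 1 to the end: save it, delete it from the output,
# then append the saved copy on the last line.
moved=$(printf 'alpha\nbeta\ngamma\n' | sed -e '1{h;d;}' -e '$G')
printf '%s\n' "$moved"
```

The first command prints alpha, beta, gamma, alpha; the second prints beta, gamma, alpha. Note that the move version assumes the input has more than one line: on a one-line file the d command ends the cycle before G can run.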
The hold command replaces the contents of the hold space with the contents of the pattern space. The get command replaces the contents of the pattern space with the contents of the hold space. The Hold command puts a newline followed by the contents of the pattern space after the contents of the hold space. (The newline is appended to the hold space even if the hold space is empty.) The Get command puts a newline followed by the contents of the hold space after the contents of the pattern space.
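The following sketch contrasts the overwriting and appending forms on a two-line input. The input and the variable names are made up for illustration. The second command shows the leading newline that H produces when the hold space starts out empty; the third is one common way to avoid it, by using h for the first line and H for the rest.

```shell
# h overwrites the hold space with the line; G appends it back,
# so every line comes out twice.
doubled=$(printf 'one\ntwo\n' | sed 'h;G')
printf '%s\n' "$doubled"

# H on every line accumulates the input in the hold space.
# Because H always adds a newline first, the result starts with an empty line.
accumulated=$(printf 'one\ntwo\n' | sed -n 'H;${g;p;}')
printf '%s\n' "$accumulated"

# Using h on line 1 and H on the remaining lines avoids the leading newline.
clean=$(printf 'one\ntwo\n' | sed -n '1h;1!H;${g;p;}')
printf '%s\n' "$clean"
```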
The exchange command (x) swaps the contents of the two buffers. It has no side effects on either buffer.
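To see the swap in action, here is a sketch that runs x on every line of a made-up three-line input. Each line goes into the hold space and the previous line comes out, so the output lags the input by one line. The variable names are hypothetical.

```shell
# x alone: the first output line is empty (the hold space started empty),
# and the last input line is left behind in the hold space.
lagged=$(printf 'a\nb\nc\n' | sed 'x')
printf '%s\n' "$lagged"

# Adding $G on the last line appends what x left in the hold space,
# so no input line is lost.
complete=$(printf 'a\nb\nc\n' | sed -e 'x' -e '$G')
printf '%s\n' "$complete"
```

The first command prints an empty line, then a and b. The second prints an empty line, then a, b, and c.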
Here's an example to illustrate putting lines into the hold space and retrieving them later. We are going to write a script that reads a particular HTML file and copies all headings to the end of the file for a summary. The headings we want start with <h1> or <h2>. For example:
    ...
    <body>
    <h1>Introduction</h1>
    The blah blah blah
    <h1>Background of the Project</h1>
    ...
    <h2>The First Year</h2>
    ...
    <h2>The Second Year</h2>
    ...
    </body>
The object is to copy those headings into the hold space as sed reads them. When sed reaches the end of the body (at the </body> tag), output Summary:, then output the saved headings without the heading tags (<h1> or <h2>).
Look at the script:
    /^<h[12]>/H
    /^<\/body>/ {
    i\
    <strong>Summary:</strong>
    x
    G
    s/<\/*h[12]>//g
    }
Any line that starts with <h1> or <h2> is appended to the hold space. (All those lines are also printed; that's the default in sed unless lines have been deleted.[103]) The last part of the script watches for the </body> tag. When it's reached, sed inserts the Summary: heading. Then the script uses x to exchange the pattern space (which has the </body> tag) with the saved headings from the hold space. Now the pattern space has the saved headings and the hold space has the </body> tag. Next, G adds the </body> tag to the end of the headings in the pattern space. Finally, a substitute command strips the <h1>, </h1>, <h2>, and </h2> tags. At the end of the script, the pattern space is printed by default.
[103]Note that this can lead to confusion when the same line is matched by several patterns and then printed, once per match!
The sequence of x followed by G is a way to find a matching line -- in this case, </body> -- and insert the contents of the hold space before the matched line. In other words, it's like an i command that inserts the hold space before the current line.
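A short sketch makes the difference between G alone and x followed by G easier to see. The three-line input, the note: prefix, and the END marker are all made up for illustration; in both commands the first line is saved in the hold space and deleted from its original position.

```shell
# G alone: the saved line lands AFTER the matched END line.
after=$(printf 'note: keep\nbody\nEND\n' |
  sed -e '/^note:/{h;d;}' -e '/^END$/G')
printf '%s\n' "$after"

# x then G: the saved line lands BEFORE the matched END line.
before=$(printf 'note: keep\nbody\nEND\n' |
  sed -e '/^note:/{h;d;}' -e '/^END$/{x;G;}')
printf '%s\n' "$before"
```

The first command prints body, END, note: keep. The second prints body, note: keep, END.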
The script could do more cleanup and formatting. For instance, it could make the saved headings into a list with <ul> and <li>. But this example is mostly about the hold space.
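One possible way to do that formatting is sketched below; it is not the only approach. The sample variable is a shortened, hypothetical stand-in for the HTML file. The script turns each saved heading into an <li> item, writes the opening <ul> with the same i command that writes the Summary: line, and uses a substitution with an escaped newline in the replacement to slip </ul> in just before </body>.

```shell
# Hypothetical, shortened stand-in for the HTML file.
sample='<body>
<h1>Introduction</h1>
The blah blah blah
<h2>The First Year</h2>
</body>'

# s/^\n// drops the newline that H put in front of the first saved heading.
# The last substitution matches the newline and </body> tag that G appended
# and puts a line containing </ul> in front of them.
listed=$(printf '%s\n' "$sample" | sed '
/^<h[12]>/H
/^<\/body>/ {
i\
<strong>Summary:</strong>\
<ul>
x
G
s/^\n//
s/<h[12]>/<li>/g
s/<\/h[12]>/<\/li>/g
s/\n<\/body>$/\
<\/ul>&/
}')
printf '%s\n' "$listed"
```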
Here's the result of running the script on the sample file:
    % sed -f sedscr report.html
    ...
    <body>
    <h1>Introduction</h1>
    The blah blah blah
    <h1>Background of the Project</h1>
    ...
    <h2>The First Year</h2>
    ...
    <h2>The Second Year</h2>
    ...
    <strong>Summary:</strong>

    Introduction
    Background of the Project
    The First Year
    The Second Year
    </body>
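Because H adds a newline even when the hold space is empty, the saved headings begin with an empty line. If that empty line is unwanted, one option is to delete the leading newline before stripping the tags. The sketch below does that on a shortened, hypothetical stand-in for report.html; the variable names are made up for illustration.

```shell
# Hypothetical, shortened stand-in for report.html.
sample='<body>
<h1>Introduction</h1>
The blah blah blah
<h2>The First Year</h2>
</body>'

# Same script as before, plus s/^\n// to remove the newline
# that the first H command put at the front of the hold space.
summary=$(printf '%s\n' "$sample" | sed '
/^<h[12]>/H
/^<\/body>/ {
i\
<strong>Summary:</strong>
x
G
s/^\n//
s/<\/*h[12]>//g
}')
printf '%s\n' "$summary"
```

With this change the first saved heading follows directly after the Summary: line.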
For other scripts that use the hold space, see Section 34.18. For a fanciful analogy that makes clear how it works, see Section 34.17.
--DD and JP
Copyright © 2003 O'Reilly & Associates. All rights reserved.