Contents:
Fundamentals of Effective Layout
Formatting SQL Statements
Formatting Control Structures
Formatting PL/SQL Blocks
Formatting Packages
Using Comments Effectively
Documenting the Entire Package
You can learn everything about a programming language -- its syntax, high-performance tips, and advanced features -- and still write programs that are virtually unreadable, hard to maintain, and devilishly difficult to debug -- even by you, the author. You can be very smart and very clever, and yet develop applications that obscure your talent and accomplishments.
This chapter addresses the "look-and-feel" of your code -- the aesthetic aspect of programming. I am sure that you have all experienced the pleasure of reading well-structured and well-formatted code. You have also probably experienced a pang of jealousy at that programmer's style and effort, wondering where she or he found the time to do it right. Developers always experience a feeling of intense pride and satisfaction from carefully and artfully designing the visual layout of their code. Yet few of us take the time to develop a style and use it consistently in our work.
Of course, the impact of a coding style goes well beyond the personal satisfaction of any individual. A consistent, predictable approach to building programs makes it easier to debug and maintain that code. If everyone takes her own approach to structuring, documenting, and naming her code, every program becomes its own little pool of quicksand. It is virtually impossible for another person to put in a foot and test the water (find the source of a problem, analyze dependencies, etc.) without being pulled under.
I discuss the elements of an effective coding style in the PL/SQL language at this juncture, before we get to any code, for two reasons:
To drive home the point that if you are going to adopt a coding style which will improve the readability and maintainability of your application, you need to do it at the beginning of your project. Programming style involves an attention to detail that can be built only during the construction process. You are not going to go back into existing code and modify the indentation, case, and documentation format after the project is done.
To provide an explanation for the format and style which I employ throughout the book. While I can't promise that every line of code in the book will follow all of the guidelines in this chapter, I hope that you will perceive a consistent style that is easy on the eye and helpful in aiding comprehension.
Views on effective coding style are often religious in nature (similar to programmers' ideas on the use of GOTO) -- that is, based largely on faith instead of rationality. I don't expect you to agree with everything in this chapter (actually, in a number of places I suggest several alternatives). Such unanimity is unrealistic and unnecessary. Rather, I hope that this chapter gets you thinking about the style in your own programming.
There is really just one fundamental objective of code layout:
Reveal and reinforce the logical structure of your program . |
You could come up with ways of writing your code that are very pleasing to the eye, but doing so is less important than choosing a format that shows the structure and intent of the program.
It is easy to address the topic of effective code layout for PL/SQL because it is such a well structured language. It benefits greatly from Ada's block structure approach. Each control construct, such as IF and LOOP, has its own terminator keyword, such as END IF and END LOOP. Each logical block of code has an explicit beginning statement and an associated ending statement. This consistent and natural block style lends itself easily and naturally to standards for indentation and whitespace, which further expose the structure of the code.
Indentation is one of the most common and effective techniques used to display a program's logic via format. As illustrated in the following examples, programs that are indented are easier to read than those that are not indented, although programs that use excessive indentation are not much more readable than unindented programs. Here is an unindented IF statement:
IF to_number(the_value) > 22 THEN IF max_totals = 0 THEN calc_totals; ELSE WHILE more_data LOOP analyze_results; END LOOP; END IF; END IF;
The lack of indentation in this example makes it very difficult to pick out the statements that go with each clause in the IF statement. Some developers, unfortunately, go to the opposite extreme and use six or more spaces for indentation. (This usually occurs by relying on the tab key, which offers "logical" indentation -- a tab can be equivalent to three spaces in one editor and eight in another. I suggest avoiding the use of tabs altogether.)
I have found that a three-space indentation not only adequately reveals the logical structure of the code but also keeps the statements close enough together to read comfortably. And, with deeply nested structures, you won't run off the right margin as quickly! Here is the three-space indented version of the previous nested IF statement:
IF to_number(the_value) > 22 THEN IF max_totals = 0 THEN calc_totals; ELSE WHILE more_data LOOP analyze_results; END LOOP; END IF; END IF;
The rest of this chapter presents specific techniques that I have found to be essential in writing attractive, readable code that reveals the logic of my programs.
PL/SQL code is made up of many different components: variables, form items, report fields, procedures, functions, loops, declarations, control elements, etc. But they break down roughly into two types of text: reserved words and application-specific names or identifiers.
Reserved words are those names of language elements that are reserved by PL/SQL and have a special meaning for the compiler. Some examples of reserved words in PL/SQL are:
WHILE IF BEGIN TO_CHAR
Application-specific identifiers are the names that you, the programmer, give to data and program structures that are specific to your application and that vary from system to system.
The compiler treats these two kinds of text very differently. You can improve the readability of your code greatly by reflecting this difference in the way the text is displayed. Many developers make no distinction between reserved words and application-specific identifiers. Consider the following lines of code:
if to_number(the_value)>22 and num1 between lval and hval then newval := 100; elsif to_number(the_value) < 1 then calc_tots(to_date('12-jan-95')); else clear_vals; end if;
While the use of indentation makes it easier to follow the logical flow of the IF statement, all the words in the statements tend to blend together. It is difficult to separate the reserved words and the application identifiers in this code. Changing entirely to uppercase also will not improve matters. Indiscriminate, albeit consistent, use of upper- or lowercase for your code reduces its readability. The distinction between reserved words and application-specific identifiers is ignored in the formatting. This translates into a loss of information and comprehension for a developer.
You can easily solve this problem by adopting a guideline for using a mix of upper- and lowercase to your code. I have recoded my previous example below, this time using the UPPER-lower style: all reserved words are written in UPPERCASE and all application names are kept in lowercase:
IF to_number(the_value) > 22 AND num1 BETWEEN lval AND hval THEN newval := 100; ELSIF TO_NUMBER (the_value) < 1 THEN calc_tots (TO_DATE ('12-jan-95')); ELSE clear_vals; END IF;
Using a mixture of upper- and lowercase words increases the readability of the code by giving a sense of dimension to the code. The eye can more easily cruise over the text and pick the different syntactical elements of each statement. The uppercase words act as signposts directing the activity in the code. You can focus quickly on the lowercase words for the application-specific content. Consistent use of this method makes the program listings more attractive and accessible at a glance.
Most of your code consists of individual statements, such as assignments, calls to modules, and declarations. A consistent approach to formatting and grouping such statements will improve the readability of your program as a whole. This section suggests some guidelines.
As we discussed in Chapter 2, PL/SQL Language Fundamentals , PL/SQL uses the semicolon ( ; ) as the logical terminator for a statement. As a result you can have more than one statement on a line and you can continue a single executable statement over more than one line. You will sometimes be tempted to place several statements on a single line, particularly if they are very simple. Consider the following line:
new_id := 15; calc_total (new_id); max_dollars := 105 * sales_adj;
It is very difficult to pick out the individual statements in this line, in addition to the fact that a procedure is called in the middle of the line. By placing each statement on its own line you mirror the complexity of a program -- the simple lines look simple and the complex statements look complex -- and reinforce the top-to-bottom logic of the program:
new_id := 15; calc_total (new_id); max_dollars := 105 * sales_adj;
You can scan the left margin (which will move left and right depending on the logic and corresponding indentation) and know that you are reviewing all the lines of code.
You can use all the indentation and blank lines you want to reveal the logic of a program and still end up with some very dense and unreadable code. It is also important to employ whitespace within a single line to make that one statement more comprehensible. Here are two general rules I employ in my code:
Always include a space between every identifier and separator in a statement. Instead of this:
WHILE(total_sales<maximum_sales AND company_type='NEW')LOOP
write this:
WHILE (total_sales < maximum_sales AND company_type = 'NEW') LOOP
Use spaces to make module calls and their parameter lists more understandable. Instead of this:
calc_totals(company_id,LAST_DAY(end_of_year_date),total_type);
write this:
calc_totals (company_id, LAST_DAY (end_of_year_date), total_type);
The declaration section declares the local variables and other structures to be in your PL/SQL block. This section comes right at the top of the block, so it sets the first impression for the rest of the program. If the declaration section has no apparent order and is poorly formatted, it is unlikely that anything else in that program will be easily understood.
The declaration section in PL/SQL can contain many different kinds of declarations: simple, scalar variables; complex data structures like records and tables; exceptions; even entire subprograms which exist only in that program.
The following sections give some guidelines for creating your declaration statements.
You will be particularly tempted to "double up" declarations on a single line because, in general, declarations are very short in length. Resist that temptation! Which of the following sets of declarations would you prefer to try to understand at a glance?
DECLARE comp_type VARCHAR2(3); right_now DATE := SYSDATE; month_num INTEGER;
or:
DECLARE comp_type VARCHAR2(3); right_now DATE := SYSDATE; month_num INTEGER;
Many programmers like to align their declarations -- for example:
DECLARE company_name VARCHAR2(30); company_id INTEGER; employee_name VARCHAR2(60); hire_date DATE; termination_date DATE; min_value NUMBER;
I am not convinced of the value of declaration alignment. Although alignment makes it easier to scan down the datatypes, the datatype isn't nearly as important as the identifier, which is already left-justified. A commitment to alignment also raises all kinds of questions that consume a developer's time and thought processes: If you have one long variable name, do you have to move all the other datatypes out to match that datatype declaration? What about when you add a new, very long declaration into an existing section? Do you have to go back and add a tab or two to the existing declarations?
The elegance of alignment also breaks down when you include comments above individual declarations, as shown in the following example:
DECLARE company_name VARCHAR2(30); /* Primary key into company table */ company_id INTEGER; employee_name VARCHAR2(60); /* Date hired; must be no greater than today's date. */ hire_date DATE; termination_date DATE; min_value NUMBER;
When the comment text cuts across the vast spaces of the alignment tabs, it just makes the datatype look isolated from its identifier.
I believe that you are better off ignoring alignment for declarations.[ 1 ] Keep the elements of the declaration (datatype and default value) close to the identifier.
[1] I recognize, however, that many developers I respect greatly for their code quality and elegance differ with me strongly on this point.
Because a statement is terminated by a semicolon ( ; ) rather than by the physical end of the line, statements can be continued onto additional lines without any specific continuation symbol or operator. This makes it very easy to spread a statement across more than one line, but it can also make it difficult to read across these lines.
Here are some examples of multiline statements that are hard to follow:
IF total_sales < maximum_sales AND company_type = 'NEW' AND (override = 'Y' OR total_company_revenue < planned_revenue (SYSDATE)) THEN accept_order; END IF; generate_company_statistics (company_id, last_year_date , rollup_type, total, average, variance, budgeted, next_year_plan); total_sales := product_sales (company_id) + admin_cutbacks * .5 - overhead - golden_parachutes;
The format of these continuation lines highlights a key question: How do you best break up a complex expression so the different parts can be read clearly, but still be connected to the statement as a whole? The following guidelines respond to this question and produce much cleaner code.
This is the most important guideline. The best way to identify continuation lines is to use indentation to logically subsume those lines under the main or first line of the statement. The following call to generate_company_statistics is obscured because the continuation line butts right up against the left margin with the module name:
generate_company_statistics (company_id, last_year_date, rollup_type, total, average, variance, budgeted, next_year_plan);
If I indent the continuation line, the relationship of the second line to the first becomes clear:
generate_company_statistics (company_id, last_year_date, rollup_type, total, average, variance, budgeted, next_year_plan);
This attempt to recode, however, shows that simply adding an indentation isn't always enough. While it is clear that the line starting with rollup_type "belongs" to the previous line, the relationship of the text on the continuation line to that of the first line is unclear. We need more than a simple call to "Indent." There are several possible approaches:
You can place a single parameter on each line for maximum clarity, or include more than one parameter on a line -- as long as they are properly indented. You can even break up the parameters so that related parameters go together on separate lines. If the name of the procedure is long and results in pushing the alignment column for parameters too far to the right, start the entire parameter list on the next line (indented) and then align all parameters against that second line. Here are some examples illustrating these rules:
gen_stats (company_id, last_year_date, rollup_type, total, average, variance, budgeted, next_year_plan); gen_stats (company_id, last_year_date, rollup_type, total, average, variance, budgeted, next_year_plan); gen_stats (company_id, last_year_date, rollup_type, total, average, variance, budgeted, next_year_plan);
I prefer the third alternative, in which all parameters are moved to the line following the name of the module. You can then place multiple parameters on the same line or place one parameter on each line, but the indentation is always and only the standard three spaces in from the start of the module name.
If a statement is not going to fit onto a single line, break up the statement so that it is quite obvious, with the most casual glance, that the first line could not possibly finish the statement. The following examples highlight this approach:
The IN statement of a loop clearly needs its range:
FOR month_index IN first_month .. last_month LOOP ...
An assignment could not possibly end with a "+":
q1_sales := month1_sales + month2_sales + month3_sales;
The last comma in the first line of parameters indicates that other parameters follow:
generate_company_statistics (company_id, last_year_date, rollup_type, total, average, variance, budgeted, next_year_plan);
Copyright (c) 2000 O'Reilly & Associates. All rights reserved.