I like functions. They can replace an awful lot of complex logic in an expression with a simple statement of the result of all that logic. The following tips will help you construct functions that are as useful as possible.
Your function has one purpose in life: returning a single value via the RETURN statement. If it does anything else, such as update global or database information, then that function has potentially created a side effect. Side effects generally limit functions' usefulness. Let's look at a few examples and discuss how you can avoid side effects.
Although a function returns its value with the RETURN statement, PL/SQL allows you to define OUT and IN OUT parameters for a function. If you do that, the function can actually pass changed data back through the parameter itself into the calling block of code through the parameters. This is generally not a good idea, and is a typical side effect in a function.
The format_name function below contains side effect parameters. Let's examine the impact of these parameters. The function takes a first name and a last name and returns a full name with the format LAST, FIRST. Because a requirement of the application happens to be that all names must be in uppercase, the function also converts the first and last names to their uppercase versions. It uses two IN OUT parameters to do this.
FUNCTION format_name (first_name_inout IN OUT VARCHAR2, last_name_inout IN OUT VARCHAR2) RETURN VARCHAR2 IS BEGIN first_name_inout := UPPER (first_name_inout); last_name_inout := UPPER (last_name_inout); RETURN last_name_inout || ', ' || first_name_inout; END;
If the application requires uppercase names, you may wonder, then, what is wrong with converting the first and last names to uppercase in format_name? First of all, this approach may be trying to meet the requirement a bit too enthusiastically. If the formatted name -- the output, that is, of the function -- must be in uppercase, you can certainly accomplish that without modifying the parameters, as follows:
FUNCTION format_name (first_name_in IN VARCHAR2, last_name_in IN VARCHAR2) RETURN VARCHAR2 IS BEGIN RETURN UPPER (last_name_in || ', ' || first_name_in); END;
The second reason you would not want to automatically uppercase the name components is that you will make format_name less useful as a basic utility. Your current application may insist on uppercase, but there will be plenty of other applications that maintain entities having first and last names -- and they do not have a similar requirement.
If you insist on the IN OUT parameters, the format_name utility could not be used in these other application development efforts.
If you do want a module that will convert both individual name components and the combined name to uppercase, then you would be better served with a procedure:
PROCEDURE combine_and_format_names (first_name_inout IN OUT VARCHAR2, last_name_inout IN OUT VARCHAR2, full_name_out OUT VARCHAR2) IS BEGIN first_name_inout := UPPER (first_name_inout); last_name_inout := UPPER (last_name_inout); full_name_out := last_name_inout || ', ' || first_name_inout; END;
By using a procedure, you make it clearer that the module is applying changes to all the parameters. You avoid providing a function with side effects (i.e., other actions and impact beyond the return of a value). Let's take a look at the way these different modules would appear in your code.
Let's first look at the function version. All our attention is focused on depositing the formatted name into the caller_name item:
full_name := format_name (first_name, :last_name);
Now let's look at the procedure version. The use of the plural of "name" in the name of the procedure indicates a change to more than one of the parameters:
combine_and_format_names (first_name, last_name, full_name);
Generally speaking, when your function has a side effect such as the uppercasing of the parameter values, that function is less broadly useful. What if programmers don't want to have the other values uppercased? They would have to write their own function that does the same thing as format_name, but without the UPPER statements. This code redundancy will create nightmares down the road. If the function is supposed to return a formatted name based on the first and last names, then that is all it should do -- never use OUT and IN OUT parameters with a function.
Such parameters are not the only kinds of side effects you may be tempted to slip into a function. Consider this Oracle Forms function, which retrieves the name of a company from its primary key:
FUNCTION company_name (company_id_in IN company.company_id%TYPE) RETURN VARCHAR2 IS CURSOR name_cur IS SELECT name FROM company WHERE company_id = company_id_in; name_rec name_cur%ROWTYPE; BEGIN OPEN name_cur; FETCH name_cur INTO name_rec; IF name_cur%FOUND THEN CLOSE name_cur; RETURN name_rec.name; ELSE BELL; MESSAGE ('Invalid company ID: ' || TO_CHAR (company_id_in), ACKNOWLEDGE); RAISE FORM_TRIGGER_FAILURE; END IF; END;
When the company ID returns an existing company, the function properly closes the cursor and returns the company's name. When the SELECT statement comes up empty, however, I change my approach completely. Rather than return a value, I go into the equivalent of PL/SQL panic mode: ring the bell, display a message (and force the user to acknowledge that message), and then fail out of the calling trigger. I don't even close the cursor! Rather strong stuff, and very much out of place in this function.
My response to invalid data is a side effect in the function, made all the worse by the fact that my side effect completely substitutes for a RETURN. When the company ID does not find a match, I do not execute a RETURN statement at all.
The consequences of my error handling in this function are:
This function can be called only in Oracle Forms. It is, however, a fairly generic task: take a primary key and return a name/description. It could be a function stored in the database and available to any program that needs to perform the lookup. I have now made that impossible.
Anyone who calls this function gives up control over his or her own program. You cannot decide for yourself what to do if the name is not found. That message is displayed whether or not it is appropriate. That bell sounds even if users don't like to announce to everyone around them that they made a mistake. The best you can do is code an exception handler for FORM_TRIGGER_FAILURE in whatever trigger or program calls company_name.
A much better approach to the company_name function is simply to return a NULL value if the company is not found. Since the company name is a NOT NULL column, a NULL return value clearly indicates "no data found." This new approach is shown in the following example:
FUNCTION company_name (company_id_in IN company.company_id%TYPE) RETURN VARCHAR2 IS CURSOR name_cur IS SELECT name FROM company WHERE company_id = company_id_in; name_rec name_cur%ROWTYPE; no_company_found EXCEPTION; BEGIN OPEN name_cur; FETCH name_cur INTO name_rec; IF name_cur%FOUND THEN CLOSE name_cur; RETURN name_rec.name; ELSE RAISE no_company_found; END IF; EXCEPTION WHEN no_company_found OR OTHERS THEN CLOSE name_cur; RETURN NULL; END;
This is now a nonjudgmental, mind-your-own-business function. It doesn't try to dictate the terms of surrender, or tell programmers who call it how they should handle a lack of data. It simply reports back the problem by returning a NULL value. Users of the function may then decide for themselves on an appropriate consequence, which might well consist of every action formerly embedded in the function itself:
new_company := company_name (:company.company_id); IF new_company IS NULL THEN BELL; MESSAGE ('Invalid company ID: ' || TO_CHAR (:company.company_id), ACKNOWLEDGE); RAISE FORM_TRIGGER_FAILURE; END IF;
At least now the programmer has freedom of choice.
Side effects in functions come in many shapes and flavors. If you remember to keep the focus of the function on computing and returning its value, the resulting module will be more effectively and widely used.
A function exists to return a single value. The best way to structurally emphasize this single-minded focus in the body of the function is to make sure that the only RETURN statement in the execution section of the function is the last executable statement. Consider the following code-translating function:
FUNCTION status_desc (status_cd_in IN VARCHAR2) RETURN VARCHAR2 IS BEGIN IF status_cd_in = 'C' THEN RETURN 'CLOSED'; ELSIF status_cd_in = 'O' THEN RETURN 'OPEN'; ELSIF status_cd_in = 'A' THEN RETURN 'ACTIVE'; ELSIF status_cd_in = 'I' THEN RETURN 'INACTIVE'; END IF; END;
Each of the different RETURN statements is actually a different exit point in this function. You are now probably saying to yourself, "This is such a simple program. What's wrong with using the different RETURN statements?"
Granted, the program is straightforward and short enough so that, at a glance, you understand its purpose and follow the exit flow. Unfortunately, it also is prone to failure. What if the status code that is passed into status_desc is not one of C, O, A, or I? Then the function does not even execute a RETURN statement, leaving the calling program with indeterminate results. Now examine the version of status_desc that relies on a single RETURN:
FUNCTION status_desc (status_cd_in IN VARCHAR2) RETURN VARCHAR2 IS return_value VARCHAR2 (20) := NULL; BEGIN IF status_cd_in = 'C' THEN return_value := 'CLOSED'; ELSIF status_cd_in = 'O' THEN return_value := 'OPEN'; ELSIF status_cd_in = 'A' THEN return_value := 'ACTIVE'; ELSIF status_cd_in = 'I' THEN return_value := 'INACTIVE'; END IF; RETURN return_value; END;
Here, my IF statement assigns the description to a local variable. After the IF statement, the RETURN statement serves its purpose in life, which is to pass the value of the variable back to the calling program. Because I initialize the temporary description to NULL, if the status code is not one of the chosen four, then I return NULL. In any case, the result is that, by placing the RETURN statement at the last line of the execution section of the function, I always execute a RETURN statement when the function completes successfully. There are no holes in my return logic.
This approach is absolutely critical when your function is more elaborate and longer than 10 or 20 lines in length, perhaps even hundreds of lines long. It can be very difficult to follow multi-page complex logic, and nearly impossible to confirm, simply by reading the code, that the function always issues a RETURN statement for all branches of the logic. Your application works much more reliably if you build right into it the requirement that the only RETURN statement in the function occurs at the end of the program.
I realize that it is one thing to read through all of the tips in this book and, as you read them, nod in agreement with their wisdom. It is quite another thing to both remember the tips and make the time to put them to use in your code. The best way I have found to at least improve the chances that I follow even my own advice is to set hard and fast rules -- and then stick to them. When it comes to "Use a single RETURN statement for successful termination," I try to always follow two simple rules when creating a function:
Always declare a local variable named "return_value" with a datatype that matches the RETURN clause of the function. If I am returning a VARCHAR2, declare a VARCHAR2 local variable. If I am returning data with the same datatype as a table column, declare that. Here are two examples:
FUNCTION net_sales (company_id_in IN company.company_id%TYPE) RETURN NUMBER IS return_value NUMBER; BEGIN ... END; or: FUNCTION company_name (company_id_in IN company.company_id%TYPE) RETURN company.name%TYPE IS return_value company.name%TYPE; BEGIN ... END;
Always make the last executable statement of my function the RETURN statement as follows:
RETURN return_value;
By following these two rules, the general structure of my function now looks like this:
FUNCTION function_name (parameter_list) RETURN datatype IS return_value datatype; BEGIN ... executable code of function ... RETURN return_value; EXCEPTION ... optional exception section ... END;
The name of the local variable states quite clearly that it is standing in for the data that is to be returned by the function. Any time this return_value variable is assigned a value in the function, you know that the function is fulfilling its purpose of generating a value to be returned with the RETURN statement. It provides a clean, understandable, and predictable structure for the function. I have even gone so far as to create a template script on disk that implements this guideline. It looks like this:
/* Filename on companion disk: functmpl.sf */ CREATE OR REPLACE FUNCTION fname () RETURN datatype IS return_value datatype; BEGIN RETURN return_value; END fname;
Then I just have to search-and-replace on "fname" and "datatype" and I am all ready to go.
One exception to the rule of a single RETURN statement in a function concerns exception handling. A function should return a value, whether it completes successfully or it fails. Upon successful completion, you can use the above structure to guarantee that the RETURN executes. What if the function fails? Any exception handler in a function should have as its last executable statement a RETURN of its own, as this example shows:
FUNCTION present_value (FV IN NUMBER, interest_rate IN NUMBER, period IN NUMBER) RETURN NUMBER IS return_value NUMBER; negative_value EXCEPTION; BEGIN return_value := FV / ((1 + interest_rate)**period); IF return_value < 0 THEN RAISE negative_value; ELSE RETURN return_value; END IF; EXCEPTION WHEN negative_value THEN RETURN 0; WHEN OTHERS THEN RETURN NULL; END;
When each handler has its own RETURN statement, you can be sure that the function returns a value regardless of the way it is terminated. The present_value function also illustrates another important principle: if the program encounters an error, always call an exception handler to deal with the problem. Do not process the error inline. PL/SQL provides a very structured distinction between the executable body of the program and the exception section. As soon as you hit an error or logical exception to normal processing, RAISE an exception to transfer control out of normal execution.
There is always only one entry point for a module: the first executable statement is always the one that follows the BEGIN statement. PL/SQL does not allow you to enter a block at an arbitrary line of code. The same cannot be said, however, for the way in which a module terminates. A PL/SQL program unit may complete its execution either through the last executable statement, a RETURN statement, or the exception handler section, as is appropriate:
The last executable statement in a program is the normal exit point for a procedure.
The RETURN statement is the normal exit point for a function.
The exception section is the way out of a module that has hit an error and raised an exception.
A programming language like PL/SQL is constructed very carefully: every keyword and element of syntax is chosen with a specific purpose. Although you can at times justify using a language construct in an unorthodox way, in most cases such an action will raise a red flag and be closely examined. One example of such linguistic abuse occurs when a programmer uses an exception handler to perform a normal program exit. Here is an example:
FUNCTION company_name (company_id_in IN company.company_id%TYPE) RETURN VARCHAR2 IS cname company.company_id%TYPE; found_it EXCEPTION; BEGIN SELECT name INTO cname FROM company WHERE company_id = company_id_in; RAISE found_it; EXCEPTION WHEN NO_DATA_FOUND THEN RETURN NULL; WHEN found_it THEN RETURN cname; END;
In this function the programmer uses an implicit cursor to try to find the company that matches the incoming primary key. If no record is found, the program properly handles the NO_DATA_FOUND exception and returns NULL. But if a record is returned, then the program RAISEs an exception, which in turn returns the company name selected.
The found_it exception is not an exception at all. It indicates successful completion of the program. Rather than use an exception, the function should simply issue a RETURN with cname, indicating that the SELECT was successful:
FUNCTION company_name (company_id_in IN company.company_id%TYPE) RETURN VARCHAR2 IS cname company.company_id%TYPE; BEGIN SELECT name INTO cname FROM company WHERE company_id = company_id_in; RETURN cname; EXCEPTION WHEN NO_DATA_FOUND THEN RETURN NULL; END;
In this case the replacement of the RAISE with a RETURN was obvious and straightforward; no other code changes were required. In other situations, it can take much more effort to undo the poor programming practices of others. It is much better to do the job right the first time!
Programmers are often tempted to use the RAISE statement because it acts like a GOTO: it immediately halts execution of the program and transfers control to the exception section. This GOTO-like behavior can be very convenient when a programmer cannot figure out how to use IF statements in a structured way to neatly end the program. In this case, the RAISE statement becomes an easy way out.
If you find yourself writing programs that rely on the RAISE statement to do something besides handle exceptions, take a step back. Examine the logical flow of the program and explore how you can use the standard control structures (IF, LOOP, and perhaps even GOTO as a last resort) to accomplish your task instead.
Just about every piece of software you write makes assumptions about the data it manipulates. For example, parameters may have only certain values or be within a certain range, or a string value should have a certain format. In both of these cases, an underlying data structure is assumed to have been created. It's fine to have such rules and assumptions, but it is also very important to verify or "assert" that none of the rules is being violated.
One approach you can take in this verification process is to make use of the IF statement at the beginning of your program. The status_desc function shown below uses the conditional construct to implicitly validate the value of the argument. The function does not actively or explicitly check to see if the status code argument is one of C, O, A, or I. Instead, the IF statement returns a non-null value only if a valid code is provided.
FUNCTION status_desc (status_cd_in IN VARCHAR2) RETURN VARCHAR2 IS return_value VARCHAR2 (20) := NULL; BEGIN IF status_cd_in = 'C' THEN return_value := 'CLOSED'; ELSIF status_cd_in = 'O' THEN return_value := 'OPEN'; ELSIF status_cd_in = 'A' THEN return_value := 'ACTIVE'; ELSIF status_cd_in = 'I' THEN return_value := 'INACTIVE'; END IF; RETURN return_value; END;
The status_desc function does not really assert anything. If the programmer passes an invalid status code, the function returns NULL. Sometimes this kind of behavior makes sense. Perhaps a status code other than C, O, A, or I is a valid entry and a NULL description is useful information in the application. In that case, an assertion of argument validity is not needed.
On the other hand, what if a legal value should always be passed to status_desc? What if a different value means that a programmer has coded her use of the function improperly? If status_desc is an "internal" function accessed only by programmers and not intended to handle user input, then an illegal status code probably reflects a typographical error or a misunderstanding of the program's use. In this case, the lack of feedback by status_desc on the invalid argument results in hard-to-trace errors. Because the programmer is not informed of the typographical mistake when it happens, the problem propagates through other modules before it surfaces.
If the status_desc function is built to accept only four valid entries, then it should raise a red flag when that rule is broken. The following function shows a version of status_desc that adds an ELSE clause to trap an invalid entry:
FUNCTION status_desc (status_cd_in IN VARCHAR2) RETURN VARCHAR2 IS return_value VARCHAR2 (20) := NULL; BEGIN IF status_cd_in = 'C' THEN return_value := 'CLOSED'; ELSIF status_cd_in = 'O' THEN return_value := 'OPEN'; ELSIF status_cd_in = 'A' THEN return_value := 'ACTIVE'; ELSIF status_cd_in = 'I' THEN return_value := 'INACTIVE'; ELSE MESSAGE ('Invalid status code: ' || status_cd_in); RAISE FORM_TRIGGER_FAILURE; END IF; RETURN return_value; END;
With this new version, programmers get immediate feedback if they pass an illegal value to status_desc. That problem is solved, but I personally find this solution less than totally satisfactory. Since it forces me to add several lines of code to the body of my program -- lines that were added simply to handle values that never should have gotten into the program -- this approach is somewhat intrusive and inelegant. Consider the alternative below:
FUNCTION status_desc (status_cd_in IN VARCHAR2) RETURN VARCHAR2 IS return_value VARCHAR2 (20) := NULL; BEGIN /* Assert that the status code is valid */ assert_condition (status_cd_in IN ('C', 'O', 'A', 'I')); /* Now perform processing of valid argument. */ IF status_cd_in = 'C' THEN return_value := 'CLOSED'; ELSIF status_cd_in = 'O' THEN return_value := 'OPEN'; ELSIF status_cd_in = 'A' THEN return_value := 'ACTIVE'; ELSIF status_cd_in = 'I' THEN return_value := 'INACTIVE'; END IF; RETURN return_value; END;
The procedure, assert_condition, looks like this:
PROCEDURE assert_condition (condition_in IN BOOLEAN) IS BEGIN IF NOT condition_in THEN /* Uses generic PL/SQL exception */ RAISE VALUE_ERROR; END IF; END;
Now the first line of my program is a call to a very general assertion module. I pass the condition, which must evaluate to TRUE in order for the assumptions to hold. In this case the condition was a fairly complex Boolean expression:
status_cd_in IN ('C', 'O', 'A', 'I')
This kind of a program is called an assertion module because its single Boolean parameter asserts that a condition is true, and leaves it to the module to reject that claim. Clearly, there isn't much to assert_condition. It just hides an IF statement behind a procedural interface. Yet, by including the call to assert_condition, I can let the remainder of the body of my program concentrate on handling the valid data properly. I do not have to include special handling for illegal values.
Because assert_condition is very generic, I can use it in any program that has a condition I can express in a single Boolean expression. I don't have to code special nested logic repeatedly. I tell assert_condition what rule I want to test, and leave it to the module to take action (raise an exception) when necessary.
Of course, you can build less generic versions of assert_condition. The next function shows an assertion module for Oracle Forms that displays an optional message to the screen:
PROCEDURE assert_condition (condition_in IN BOOLEAN, message_in IN VARCHAR2 := NULL) IS BEGIN IF NOT condition_in THEN /* Use NVL to substitute default message if not provided. */ MESSAGE (NVL (message_in, 'Check Condition Failure!')); /* Raise an exception specific to Oracle Forms. */ RAISE FORM_TRIGGER_FAILURE; END IF; END;
You can even produce assertion modules that handle very specific kinds of conditions. If you do a lot of work with record groups in Oracle Forms, you find yourself checking repeatedly at the beginning of your program to see that the name or handle for the record group points to a valid record group. You do this with the ID_NULL and FIND_GROUP built-ins, as follows:
IF ID_NULL (FIND_GROUP ('monthly_sales')) THEN MESSAGE ('Record group does not exist!'); RAISE FORM_TRIGGER_FAILURE; ELSE FOR month_count IN 1 .. GET_GROUP_ROW_COUNT ('monthly_sales') LOOP ... process monthly data ... END LOOP; END IF;
Again, rather than write these statements over and over in each of your programs, you can bundle them into a standalone module, as shown below:
assert_valid_group ('monthly_sales'); FOR month_count IN 1 .. GET_GROUP_ROW_COUNT ('monthly_sales') LOOP ... process monthly data ... END LOOP;
Then simply call that assertion program:
PROCEDURE assert_valid_group (rg_name_in IN VARCHAR2) /* || Assertion: if there is not a valid Id for the record || group, display message and raise exception. This is a || "design" error; code should not continue executing. */ IS rg_id RECORDGROUP := FIND_GROUP (rg_name_in); BEGIN IF ID_NULL (rg_id) THEN MESSAGE (' Record group '||rg_name_in|| ' is not defined.'); END IF; END;
A big advantage of the assertion procedure over the ELSE clause is that it clearly states, at the very beginning of the module, the assumptions for that program. With the ELSE clause, a failure is a byproduct of the module's execution. The rule violation is only implied by the structure of the IF statement and not expressed explicitly.
But wait. If I call assert_condition from within a function and the assertion fails, the function will not execute a RETURN statement. Doesn't the use of an assertion module violate my guideline to always return a value from a function? The assert_condition procedure bails out of the calling module (or at least forces a transfer to the exception section) without any regard for whether a RETURN statement is needed. Even worse, doesn't the Oracle Forms version containing a MESSAGE statement contradict my suggestion to avoid side effects in a function?
Yes and no. Yes, I admit that when assert_condition RAISEs its exception, you do not issue a RETURN statement (unless you code for it yourself in the function, but then you would have to know which exception was RAISEd by assert_condition). I also admit that the MESSAGE is a side effect in the function, the main purpose of which is to return a row number in a table.
But I can explain everything, honest!
You see, the point of assert_condition is to test the underlying assumptions in the very structure of the module. If status_desc is called with an invalid code, that kind of error is very different from, say, a NO_DATA_FOUND when obtaining the name for a particular company ID number. The status_desc function as a whole is invalid if it is not passed a valid code. In this context, it doesn't even make sense to continue processing. The assert_condition procedure uncovers design level errors in the code that must be corrected before you can even worry about data entry errors or other application-level concerns.
The logic behind assert_condition is similar to the way Oracle Forms handles level 25 errors when it executes a form. No matter how high a value you have set for SYSTEM.MESSAGE_LEVEL, Oracle Forms always halts processing in a form when it encounters a level 25 error, which indicates a design error in the application.
Copyright (c) 2000 O'Reilly & Associates. All rights reserved.