Referential integrity is a condition in which all references to external objects within each database object are valid. Enforcing referential integrity is a critical task in ensuring that data is accurate and complete. Referential integrity problems can
result in data loss, wasted storage, and inaccurate data. Overall database integrity is a broader issue, relating to the values stored within a single object. Non-referential problems with integrity are typically somewhat less serious, resulting in
inaccuracies and the storage of unwanted values.
Oracle provides many ways to enforce integrity, including column constraints, table constraints, sequences, and triggers. Each of these methods will be examined in this chapter, through the use of a common example.
Many different types of applications need to store information regarding individuals, companies, phone numbers, and addresses. This same information has been stored in databases in many different ways, with varying degrees of success. Assume, for the
sake of example, that the following rules apply to the storage of this information.
- Individuals, Companies, Addresses, and Phones should not be stored redundantly.
- An individual may be related to zero, one, or many companies.
- Addresses and Phones may be related to an individual, a company, or an individual at a company.
- Addresses and Phones will have standard descriptions, such as Home and Office.
- A phone may or may not be related to an address.
- The current user and timestamp will be stored for each record when it is inserted or updated, and the application will use an optimistic locking scheme.
The DDL in Listing 20.1 provides an example of one possible physical model that will accommodate these rules.
Listing 20.1. This DDL contains the column attributes for the sample application that will be used to demonstrate integrity constraints.
CREATE TABLE individual (
ID NUMBER(10)
,last_name VARCHAR2(30)
,first_name VARCHAR2(30)
,notes VARCHAR2(255)
,date_of_birth DATE
,last_updt_user VARCHAR2(20)
,last_updt_date DATE
);
CREATE TABLE company (
ID NUMBER(10)
,name VARCHAR2(30)
,notes VARCHAR2(255)
,last_updt_user VARCHAR2(20)
,last_updt_date DATE
);
CREATE TABLE ind_co_rel (
individual_id NUMBER(10)
,company_id NUMBER(10)
,title VARCHAR2(80)
,last_updt_user VARCHAR2(20)
,last_updt_date DATE
);
CREATE TABLE address (
ID NUMBER(10)
,address_type_id NUMBER(10)
,address_line1 VARCHAR2(40)
,address_line2 VARCHAR2(40)
,address_line3 VARCHAR2(40)
,city VARCHAR2(40)
,state CHAR(2)
,zip NUMBER(5)
,zip_4 NUMBER(4)
,last_updt_user VARCHAR2(20)
,last_updt_date DATE
);
CREATE TABLE phone (
ID NUMBER(10)
,phone_type_id NUMBER(10)
,area_code CHAR(3)
,prefix CHAR(3)
,line CHAR(4)
,extension VARCHAR2(6)
,last_updt_user VARCHAR2(20)
,last_updt_date DATE
);
CREATE TABLE address_rel (
address_id NUMBER(10)
,individual_id NUMBER(10)
,company_id NUMBER(10)
,last_updt_user VARCHAR2(20)
,last_updt_date DATE
);
CREATE TABLE phone_rel (
phone_id NUMBER(10)
,address_id NUMBER(10)
,individual_id NUMBER(10)
,company_id NUMBER(10)
,last_updt_user VARCHAR2(20)
,last_updt_date DATE
);
CREATE TABLE address_type (
ID NUMBER(10)
,description VARCHAR2(40)
,last_updt_user VARCHAR2(40)
,last_updt_date DATE
);
CREATE TABLE phone_type (
ID NUMBER(10)
,description VARCHAR2(40)
,last_updt_user VARCHAR2(40)
,last_updt_date DATE
);
Although this data typically makes up only a small portion of an real application's data model, the many relationships among these tables present great potential for referential integrity problems. In the sections that follow, issues relating to the
integrity of these objects are used to illustrate the various means of enforcing integrity in Oracle databases.
Column constraints are probably the most widely used means of enforcing integrity. Of these, PRIMARY KEY is the most significant. It is used to ensure that each row in the table is unique. When a column is declared as the PRIMARY KEY, an index on this
column is automatically created and assigned a unique name by Oracle. The additional constraints UNIQUE and NOT NULL are implied by the PRIMARY KEY constraint.
In the example application, the ID column is used as the PRIMARY KEY in each table in which it occurs. This column contains unique numeric values generated by an Oracle sequence. Listing 20.2 contains the DDL for the individual table with the column
constraint PRIMARY KEY enabled. Assume that the ID column is declared identically in each table where it is used.
Listing 20.2. This DDL contains a PRIMARY KEY column constraint.
CREATE TABLE individual (
ID NUMBER(10) PRIMARY KEY
,last_name VARCHAR2(30)
,first_name VARCHAR2(30)
,notes VARCHAR2(255)
,date_of_birth DATE
,last_updt_user VARCHAR2(20)
,last_updt_date DATE
);
The generated column, ID, is necessary to create a primary key for the individual table, because no combination of columns is guaranteed to be unique. Although it is highly unlikely, two individuals with the same name and the same date of birth could
exist in the database.
A common mistake in developing database applications relates to lookup tables, which are sometimes created almost as an afterthought. The PRIMARY KEY constraint is particularly important to lookups because of the way in which they are used. If two rows
of the lookup have the same value in the column that is used in joins, unwanted duplicates will be returned in the result set. This point may seem obvious, but a simple oversight can cause this error to go unnoticed until long after a duplicate is
inserted, when SQL statements joining to the lookup produce results that are visibly erroneous.
The UNIQUE constraint, which is implied by PRIMARY KEY, can be used on a different column to designate a secondary key. A UNIQUE constraint, like PRIMARY KEY, causes Oracle to automatically create an index and assign it a unique name.
Listing 20.3 creates a unique index on the description column in the address_type lookup table. Although this guarantees uniqueness for each row in the table, it is still preferable to use ID as the primary key. The primary key will be stored in another
table, and because the description can be fairly long, using an ID (which will start at 1) conserves disk space. Another advantage to using the generated value as the primary key is that joins are typically faster on numeric values.
Listing 20.3. This DDL contains both a primary key and a unique constraint.
CREATE TABLE address_type (
ID NUMBER(10) PRIMARY KEY
,description VARCHAR2(40) UNIQUE
,last_updt_user VARCHAR2(40)
,last_updt_date DATE
);
The CONSTRAINT keyword can be used to assign a name to a constraint and its corresponding index. See Listing 20.4 for an example of the CONSTRAINT syntax for column constraints.
Listing 20.4. This DDL script demonstrates the use of the CONSTRAINT keyword.
CREATE TABLE address_type (
ID NUMBER(10)
CONSTRAINT addr_type_pk PRIMARY KEY
,description VARCHAR2(40)
CONSTRAINT addr_type_desc UNIQUE
,last_updt_id VARCHAR2(40)
,last_updt_date DATE
);
The UNIQUE constraint does not prevent null values from being inserted. For this reason, it is often used for columns that should be unique but do not necessarily need to be populated. For example, an application may need to store the Social Security
number for individuals if it is known, while allowing individuals to be inserted before their Social Security numbers are known. If a duplicate were to exist in this column, it would obviously be in error. However, the column cannot be defined as the
primary key, because null values must be allowed.
The NOT NULL column constraint ensures that null values are not inserted into a column, and they can be used in conjunction with the UNIQUE constraint. Listing 20.5 demonstrates the use of NOT NULL as the sole constraint on a column, as well as its use
in conjunction with a UNIQUE constraint.
Listing 20.5. This DDL script illustrates the uses of NOT NULL column constraints.
CREATE TABLE address_type (
ID NUMBER(10)
CONSTRAINT addr_type_pk PRIMARY KEY
,description VARCHAR2(40)
CONSTRAINT addr_type_desc_u UNIQUE
CONSTRAINT addr_type_desc_nn NOT NULL
,last_updt_id VARCHAR2(40)
CONSTRAINT addr_type_lu_id NOT NULL
,last_updt_date DATE
CONSTRAINT addr_type_lu_dt NOT NULL
);
The NOT NULL constraint is used to simply define required columns, which may or may not be unique. For example, first and last name may be required to add an individual to the database. Specifying these columns with a NOT NULL constraint ensures that
records missing either of these fields will not be inserted.
The most flexible column constraint is the CHECK constraint. A CHECK constraint can reference any column in the table, but it cannot reference any external objects, system variables, or system constants. When a CHECK constraint references columns, the
conditions are always applied to the current row. In order to insert or update a column with the CHECK constraint, the specified condition must evaluate to TRUE or unknown (when a NULL value is being inserted into one of the columns referenced by the CHECK
condition). CHECK constraints can be used in conjunction with other column constraints.
CHECK constraints, as the name implies, are particularly useful in sanity checking. Listing 20.6 demonstrates the use of a CHECK constraint on the individual table, which ensures that no individual can be inserted who is over 150 years old.
Listing 20.6. This DDL script features a CHECK constraint.
CREATE TABLE individual (
ID NUMBER(10)
CONSTRAINT indiv_pk PRIMARY KEY
,last_name VARCHAR2(30)
CONSTRAINT indiv_l_name NOT NULL
,first_name VARCHAR2(30)
CONSTRAINT indiv_f_name NOT NULL
,notes VARCHAR2(255)
,date_of_birth DATE
CONSTRAINT indiv_chk_bday
CHECK (date_of_birth > TO_DATE('01JAN1845',
'DDMONYYYY',
'nls_date_language = American'))
,last_updt_user VARCHAR2(20)
CONSTRAINT indiv_lu_id NOT NULL
,last_updt_date DATE
CONSTRAINT indiv_lu_dt NOT NULL
);
When used as a column constraint, only one column can be referenced in the CHECK condition, and only one CHECK constraint is allowed per column.
To use the Social Security number as an example again, an application may want to store the dashes for easy reporting. You can use a CHECK constraint to ensure that the Social Security number is the correct length and that the dashes are in the correct
positions.
The system catalog tables, user_indexes and user_constraints, contain all information regarding indexes and constraints. Listing 20.7 contains two simple SQL scripts that query these tables. The scripts can be
executed in SQL*Plus or SQL*DBA to produce the output shown in Figure 20.1.
Listing 20.7. These simple SQL scripts query catalog tables to display information on existing indexes and constraints.
/* INDEXES.SQL */
select table_name, index_name, uniqueness
from user_indexes
order by 1, 2
/
/* CONST.SQL */
select table_name, constraint_name, constraint_type
from user_constraints
order by 1, 2
/
Figure 20.1. This output was generating by running the SQL scripts from Listing 20.7 in SQL*Plus.
Column constraints can be employed to enforce referential integrity for foreign keys using the REFERENCES keyword. The table and column specified in the REFERENCES clause must already exist, and the referenced column must be defined as a unique or
PRIMARY KEY. Listing 20.9 demonstrates the use of two foreign keys in a table used only to join the referenced objects. Note that the script in Listing 20.8 must be run first, or the second REFERENCES constraint in Listing 20.9 will produce an error.
Listing 20.8. This DDL script creates the company table which is referenced by a foreign key constraint in Listing 20.9.
CREATE TABLE company (
ID NUMBER(10)
CONSTRAINT company_pk PRIMARY KEY
,name VARCHAR2(30)
CONSTRAINT co_name_u UNIQUE
CONSTRAINT co_name_nn NOT NULL
,notes VARCHAR2(255)
,last_updt_user VARCHAR2(20)
CONSTRAINT co_lu_id NOT NULL
,last_updt_date DATE
CONSTRAINT co_lu_dt NOT NULL
);
Listing 20.9. This DDL script contains two REFERENCES column constraints.
CREATE TABLE ind_co_rel (
individual_id NUMBER(10)
CONSTRAINT indco_ind_fk
REFERENCES individual(ID)
ON DELETE CASCADE
CONSTRAINT indco_ind_nn NOT NULL
,company_id NUMBER(10)
CONSTRAINT indco_co_fk
REFERENCES company(ID)
ON DELETE CASCADE
CONSTRAINT indco_co_nn NOT NULL
,title VARCHAR2(80)
,last_updt_user VARCHAR2(20)
CONSTRAINT indco_lu_id NOT NULL
,last_updt_date DATE
CONSTRAINT indco_lu_dt NOT NULL
);
The REFERENCES column constraint does not imply NOT NULL, but NOT NULL can be used in conjunction with a REFERENCES constraint on a column. As a result of the REFERENCES clauses in Listing 20.8, an application cannot insert a row into this table, unless
both the individual_id and company_id values exist in their respective tables.
The data type is unnecessary for columns containing a REFERENCES constraint. The column will automatically be defined with the data type of the column it references.
The ON DELETE CASCADE option is a very significant feature of the REFERENCES column constraint. If it is not specified, rows with key values in the parent table cannot be deleted until all corresponding rows in the table containing the REFERENCES
constraint are deleted. When ON DELETE CASCADE is specified, deletions in the parent table cause all references in the child table to be deleted automatically. In this respect, using ON DELETE CASCADE is similar to creating a DELETE trigger on the
referenced table. This is an extremely powerful option and should be used with caution. If it is not properly understood, unwanted automatic deletions could result.
Although they are ideally suited to enforcing referential integrity, REFERENCES constraints may not be desirable in certain situations. For example, if a table has a high volume of transactions and contains several foreign keys that are simple lookups,
performance can be adversely affected by using the REFERENCES constraint. Each time a row is inserted or updated, the referenced tables must be checked to ensure that each foreign key being inserted exists in the referenced tables. Depending on the nature
of the data and the importance of performance, it may be preferable to enforce the foreign keys through the application itself.
Each of the column constraints described in the previous section can also be applied as table constraints, with the exception of NOT NULL. Table constraints have the additional advantage of being able to operate on multiple columns.
Refer to the CHECK constraint in Listing 20.6. This provides a sanity check on the lower bounds of date_of_birth, but it would be better to check the upper bounds as well. Listing 20.10 illustrates how you can accomplish this using a table constraint.
Listing 20.10. This DDL script, containing a CHECK table constraint, is an improvement on the column constraint used in Listing 20.6.
CREATE TABLE individual (
ID NUMBER(10)
CONSTRAINT indiv_pk PRIMARY KEY
,last_name VARCHAR2(30)
CONSTRAINT indiv_l_name NOT NULL
,first_name VARCHAR2(30)
CONSTRAINT indiv_f_name NOT NULL
,notes VARCHAR2(255)
,date_of_birth DATE
,last_updt_user VARCHAR2(20)
CONSTRAINT indiv_lu_id NOT NULL
,last_updt_date DATE
CONSTRAINT indiv_lu_dt NOT NULL
,CONSTRAINT indiv_chk_bday
CHECK (date_of_birth BETWEEN (TO_DATE('01JAN1845',
'DDMONYYYY', 'nls_date_language = American'))
AND last_updt_date)
);
The syntax of table constraints is nearly identical to that of column constraints, except that the CONSTRAINT clause comes at the end of the column list.
Another advantage to using a table constraint applies only to constraints that result in the creation of an index. The table constraint syntax allows the designation of a tablespace and storage specification for the corresponding index. When indexes are
created automatically for column constraints, they are created in the default tablespace using the default storage specification. The script in Listing 20.11 adds a UNIQUE constraint to the ind_co_rel table, enforcing uniqueness in the combination of
individual_id and company_id. It also implements the USING INDEX clause to designate a tablespace and storage specification for the index that will be created.
Listing 20.11. This DDL script uses the ALTER TABLE syntax to create a UNIQUE table constraint and a storage specification for the corresponding index.
ALTER TABLE ind_co_rel
ADD CONSTRAINT ind_co_rel_u UNIQUE(individual_id, company_id)
USING INDEX
TABLESPACE oracle_unleashed
STORAGE (INITIAL 10K
NEXT 10K
MAXEXTENTS 100
PCTINCREASE 10);
The syntax for tablespace and storage specification following the USING INDEX keyword is identical to that for the CREATE INDEX statement. The USING INDEX clause can also be used in a PRIMARY KEY constraint, as illustrated in Listing 20.12.
Listing 20.12. This DDL script removes the UNIQUE constraint created in Listing 20.11 and re-creates it as a PRIMARY KEY.
ALTER TABLE ind_co_rel
DROP CONSTRAINT ind_co_rel_u
/
ALTER TABLE ind_co_rel
DROP CONSTRAINT indco_ind_nn
/
ALTER TABLE ind_co_rel
DROP CONSTRAINT indco_co_nn
/
ALTER TABLE ind_co_rel
ADD CONSTRAINT ind_co_rel_pk PRIMARY KEY(individual_id, company_id)
USING INDEX
TABLESPACE oracle_unleashed
STORAGE (INITIAL 10K
NEXT 10K
MAXEXTENTS 100
PCTINCREASE 10)
/
Notice that the NOT NULL constraints on the individual_id and company_id columns were also dropped. To keep them would be redundant, because columns included in a PRIMARY KEY constraint cannot have null values.
Foreign key table constraints are implemented in much the same way as REFERENCING column constraints. If necessary, multiple columns can be included in a FOREIGN KEY table constraint. This is not the case with the column constraint implementation. Note
also that the FOREIGN KEY keyword is only available for the table constraint syntax. Listing 20.13, the final script for ind_co_rel, also defines temporary tables into which exceptions are logged.
Listing 20.13. The final ind_co_rel script uses the FOREIGN KEY table constraint syntax and logs exceptions into temporary tables.
CREATE TABLE ind_co_rel (
individual_id NUMBER(10)
,company_id NUMBER(10)
,title VARCHAR2(80)
,last_updt_user VARCHAR2(20)
CONSTRAINT indco_lu_id NOT NULL
EXCEPTIONS INTO ind_co_err_1
,last_updt_date DATE
CONSTRAINT indco_lu_dt NOT NULL
EXCEPTIONS INTO ind_co_err_2
,CONSTRAINT indco_ind_fk FOREIGN KEY (individual_id)
REFERENCES individual(ID)
ON DELETE CASCADE
EXCEPTIONS INTO ind_co_err_3
,CONSTRAINT indco_co_fk FOREIGN KEY (company_id)
REFERENCES company(ID)
ON DELETE CASCADE
EXCEPTIONS INTO ind_co_err_4
,CONSTRAINT ind_co_rel_pk PRIMARY KEY(individual_id, company_id)
USING INDEX
TABLESPACE oracle_unleashed
STORAGE (INITIAL 10K
NEXT 10K
MAXEXTENTS 100
PCTINCREASE 10)
EXCEPTIONS INTO ind_co_err_5
)
TABLESPACE oracle_unleashed;
Each type of exception is logged into a different temporary table in Listing 20.13, which is helpful in debugging and resolving integrity problems. The tables into which exceptions are logged should be identical to the ind_co_rel table, except that they
should have no constraints, for obvious reasons. The EXCEPTIONS INTO clause can be used for any type of column or table constraint. If the table referenced by EXCEPTIONS INTO does not exist, using the clause will produce an error.
The EXCEPTIONS INTO clause is particularly useful in applications where frequent and timely bulk loads are required. In order to use SQL*Loader with the direct path option to maximize its performance, constraints must be disabled. When the constraints
are re-enabled, the EXCEPTIONS INTO clause can be used to categorize problem records by inserting them in separate tables based on the constraint that was violated. This minimizes the negative impact constraints have on performance, while maintaining
integrity and providing a means of identifying and resolving problems with the data being loaded.
As shown by these examples, Oracle provides a wide variety of options for enforcing integrity through column and table constraints. Constraints are powerful tools for enforcing integrity, but they should be used with care. Overuse of constraints can add
significantly to long-term maintenance requirements, and misuse can create unwanted dependencies or unnecessary exceptions. The possible trade-offs involving constraints and performance will be discussed in greater detail later in this chapter.
Oracle sequences are ideally suited to the task of generating unique key values. A sequence is a stored object that simply generates a sequence of numbers. Listing 20.14 illustrates the syntax for creating a sequence.
Listing 20.14. This sequence is used to generate primary keys for the individual table.
CREATE SEQUENCE individual_ids
START WITH 1
INCREMENT BY 1
NOMAXVALUE;
The values specified for the START WITH, and INCREMENT BY parameters in Listing 20.14 are the defaults. NOMAXVALUE is the default as well. The script in Listing 20.13 will produce the same result as the following:
CREATE SEQUENCE individual_ids;
It is a good practice to explicitly declare these defaults for documentation purposes, if nothing else. The implementation is fairly self-explanatory. The START WITH parameter indicates the first number that will be generated, INCREMENT BY specifies a
number to be added to the current value to generate the next value, and NOMAXVALUE indicates that there is no maximum to the numbers it will generate, (practically no maximum, although there is an upper limit).
The MINVALUE parameter is used only by descending sequences. (Specifying a negative value for INCREMENT BY produces a descending sequence.) CYCLE, when used in conjunction with MAXVALUE or MINVALUE, indicates that the sequence should start from the
beginning when the minimum or maximum value is reached. The default for this option is NOCYCLE.
The CACHE parameter indicates the number of values that should be pre-generated and cached by the sequence. The default value for CACHE is 20. Raising this parameter can improve performance in high transaction volume environments.
ORDER ensures that sequences are used in the order generated. Regardless of the setting of this option, the same value will never be returned twice. If an application uses a sequence in a transaction that is rolled back, the value is simply discarded.
NORDER, the default for this option, is acceptable unless the sequence is being used like a timestamp to indicate the order of events over time.
A sequence has two pseudocolumns, currval and nextval. Currval returns the current value of the sequence, while nextval increments the sequence and returns the new value. Listing 20.15 demonstrates how a sequence can be used in an Oracle function to
generate new keys when inserting records.
Listing 20.15. This function accesses a sequence to fetch a new key when inserting a record.
CREATE OR REPLACE FUNCTION insert_indiv (last CHAR, first CHAR,
notes CHAR, dob DATE) RETURN NUMBER
IS
new_id NUMBER;
BEGIN
SELECT individual_ids.nextval INTO new_id FROM dual;
INSERT INTO individual VALUES (new_id, last, first, notes,
dob, user, sysdate);
COMMIT;
RETURN(new_id);
EXCEPTION
WHEN OTHERS THEN
ROLLBACK;
RETURN(1);
END insert_indiv;
It is a common practice to use sequences for generating unique primary keys. One sequence can be used for many tables, or a separate sequence can be created for each table requiring generated keys. Either option is preferable to any key that requires
user intervention, because typographical errors are bound to occur. It is typically preferable to generate unique keys even when one exists naturally in the data (Social Security number, for example).
Using a sequence to generate primary keys can improve performance in certain situations, as well. As mentioned previously, integer joins are typically faster than character-based joins, and even when a natural integer primary key exists, a sequence is
often a better choice. To use Social Security number as an example, 10 bytes must be stored for each key. If a sequence is used, starting with 1, a considerable amount of disk space can be conserved, and a much smaller index produced, which will result in
less I/O. Perhaps a less important consideration is the order in which values are inserted. Depending on how inserts are handled by applications accessing the sequence, the index on a sequence-generated primary key may be created in ascending order
naturally, which is somewhat of an optimization in terms of I/O performance. If the sequence is created with the ORDER option, and inserts are handled using Oracle subprograms similar to Listing 20.15, this will always be true.
Triggers are stored procedures associated with a specific operation on a specific table. A trigger is automatically fired when the operation with which it is associated is performed on the table with which it is associated. Triggers can perform many of
the same tasks as constraints, and in most cases, they can go beyond what constraints can do. For example, a NOT NULL constraint can only ensure that a value is present in a column, but it does nothing to ensure the accuracy of the data. Listing
20.16 provides an example of how you can use triggers to enforce a NOT NULL constraint and ensure the accuracy of the data being inserted.
Listing 20.16. This trigger ensures that the columns last_updt_user and last_updt_date are inserted and updated accurately.
CREATE OR REPLACE TRIGGER indiv_timestamp
BEFORE INSERT OR UPDATE ON individual FOR EACH ROW
BEGIN
:new.last_updt_user := user;
:new.last_updt_date := sysdate;
END indiv_timestamp;
The simple trigger in Listing 20.16 ensures that the last_updt_user and last_updt_date are being inserted and updated accurately in the individual table. In effect, the trigger intercepts the actual values being inserted and replaces them with
user and sysdate. Using the NOT NULL constraint for these columns is no longer necessary, and the trigger goes far beyond what the constraint could do.
The trigger in Listing 20.16 also relieves the application of the burden of supplying values for the last_updt_user and last_updt_date columns when inserting and updating records.
INSERT and UPDATE triggers are commonly used for customized transaction logging, or to generate statistical summaries to be accessed by a different group of users than those applying the transactions. For example, a large order
entry system might use an INSERT trigger to write only the date, the order amount, and the salesperson to a separate table to be used only for management reporting.
The syntax for creating triggers is very similar to the creation of procedures, with a few notable exceptions. The BEFORE or AFTER keyword must follow the name of the trigger, indicating whether it should be fired before or after
the operation that causes it is fired.
Although it may not seem logical to do so, this trigger had to be created with the BEFORE option. Trying to implement this trigger with the AFTER option produces the following error:
ORA04091: table SCOTTY.INDIVIDUAL is mutating, trigger/function may not see it
Because the trigger is being executed by a process that is currently involved in a transaction on the same row, it cannot be created with the AFTER option. This would, in effect, invalidate the old correlation of the trigger.
Immediately following the BEFORE or AFTER keyword is the action (or actions) with which the trigger is associated. This can be INSERT, UPDATE, or DELETE, or any combination of these separated by OR. The FOR EACH ROW keyword defines the behavior of the
trigger when it is fired by statements affecting multiple rows. The default behavior is to fire the trigger only once, regardless of the number of rows affected. A trigger may also include a WHEN clause, which limits the conditions under which it will
fire.
The WHEN clause can be used for specialized reporting, or to draw attention to a value that may seem to be out of range. For example, an accounts payable system might use an INSERT trigger to log all payments of greater than $10,000 to a temporary
table, which can then be used to generate a report for management's review and approval. This could be an alternative to a CHECK condition, which might prove to be overly restrictive. In most circumstances, it would not be acceptable to reject a valid
payment simply because it is unusually high. On the other hand, management may be interested in reviewing or auditing these payments. In this respect, a trigger can be used in a way that is analogous to passively enforcing a CHECK constraint.
Note that in Listing 20.16, the variable new is never declared. This is the default correlation name associated with the new row (which is valid for inserts and updates only). The name old is associated with the old row by default, and is
valid for updates and deletes only. These default names can be reassigned using a REFERENCING clause. The REFERENCING clause should placed immediately before the FOR EACH ROW keyword (if it is used), as in Listing 20.17.
Listing 20.17. This trigger uses a REFERENCING clause to rename new.
CREATE OR REPLACE TRIGGER indiv_timestamp
BEFORE INSERT OR UPDATE ON individual
REFERENCING new AS new_row FOR EACH ROW
BEGIN
:new_row.last_updt_user := user;
:new_row.last_updt_date := sysdate;
END indiv_timestamp;
The REFERENCING clause is rarely used, but it is provided in order to allow the use of new and old as regular program variables.
Triggers, when used with sequences, can also be used to enforce primary key constraints. Listing 20.18 creates a trigger that ensures that new rows in the individual table are assigned unique primary key values. The trigger also ensures that the
sequence is used, by overriding any value specified in an insert statement with the next value in the sequence.
Listing 20.18. This trigger enforces the primary key on the individual table.
CREATE OR REPLACE TRIGGER indiv_get_key
BEFORE INSERT ON individual FOR EACH ROW
DECLARE
new_id NUMBER;
BEGIN
SELECT individual_ids.nextval INTO new_id FROM dual;
:new.id := new_id;
END indiv_get_key;
Unfortunately, this trigger cannot coexist with the trigger in Listing 20.17. For each table, only one trigger can exist for each operation. Listing 20.19 demonstrates how these can be implemented together, replacing the CHECK constraint on
date_of_birth in the process.
Listing 20.19. These triggers can be used to enforce several constraints on the individual table.
CREATE OR REPLACE TRIGGER indiv_ins_trg
BEFORE INSERT ON individual FOR EACH ROW
DECLARE
new_id NUMBER;
BEGIN
IF ((:new.date_of_birth < TO_DATE('01JAN1845', 'DDMONYYYY',
'nls_date_language = American'))
OR (:new.date_of_birth > sysdate)) THEN
RAISE_APPLICATION_ERROR(20001, 'Invalid birth date.');
ELSE
SELECT individual_ids.nextval INTO new_id FROM dual;
:new.id := new_id;
:new.last_updt_user := user;
:new.last_updt_date := sysdate;
END IF;
END indiv_ins_trig;
CREATE OR REPLACE TRIGGER indiv_updt_trg
BEFORE UPDATE ON individual FOR EACH ROW
BEGIN
IF ((:new.date_of_birth < TO_DATE('01JAN1845',
'DDMONYYYY', 'nls_date_language = American'))
OR (:new.date_of_birth > sysdate)) THEN
RAISE_APPLICATION_ERROR(20001, 'Invalid birth date.');
ELSE
:new.id := :old.id;
:new.last_updt_user := user;
:new.last_updt_date := sysdate;
END IF;
END indiv_updt_trg;
In Listing 20.19, the PRIMARY KEY constraint is enforced for both inserts and updates. The UPDATE trigger completely prevents an ID from being changed, which might not be acceptable if this were not a generated key. The triggers also enforce a CHECK
constraint on date_of_birth and the NOT NULL constraints on last_updt_user and last_updt_date.
Triggers can be used to enforce foreign key constraints, as well. Refer to Listing 20.9. The REFERENCES constraints in the example both contained the ON DELETE CASCADE option. If the foreign keys integrity constraints are enforced by the triggers
in Listing 20.20, the REFERENCES column constraints could be removed.
Listing 20.20. These triggers can be used to enforce foreign key constraints.
CREATE OR REPLACE TRIGGER indiv_del_trg
BEFORE DELETE ON individual FOR EACH ROW
BEGIN
DELETE FROM ind_co_rel WHERE individual_id = :old.id;
END indiv_del_trg;
CREATE OR REPLACE TRIGGER co_del_trg
BEFORE DELETE ON company FOR EACH ROW
BEGIN
DELETE FROM ind_co_rel WHERE company_id = :old.id;
END co_del_trg;
CREATE OR REPLACE TRIGGER ind_co_trg
BEFORE INSERT OR UPDATE ON ind_co_rel FOR EACH ROW
DECLARE
co_id NUMBER;
indiv_id NUMBER;
BEGIN
SELECT ID INTO co_id FROM company WHERE ID = :new.company_id;
SELECT ID INTO indiv_id FROM individual
WHERE ID = :new.individual_id;
EXCEPTION
WHEN OTHERS THEN
RAISE_APPLICATION_ERROR(20002, 'Invalid id.');
END ind_co_trg;
As these examples demonstrate, triggers can be used to perform the same tasks as table and column constraints. In many cases, it may be preferable to use triggers because they are likely to provide better performance. This is particularly true in
distributed environments, where it may not be possible to enforce foreign key constraints at all.
When designing triggers, you should pay special attention to cascading triggers. Cascading occurs when a trigger on one table causes a trigger on another table to be fired. Codependencies, in particular, can be a problem.
Note that in Listing 20.20, the individual and company tables both have DELETE triggers that delete corresponding rows from ind_co_rel. For the sake of example, assume that ind_co_rel has a DELETE trigger that deletes corresponding
rows in address_rel and phone_rel, and that individual and company also include these deletions in their DELETE triggers.
This presents numerous integrity problems. If a relationship between an individual and a company is deleted, records in address_rel and phone_rel that are related to both should be deleted. Also, if an individual or company is deleted entirely,
all address_rel and phone_rel records related to the specific individual or company should be deleted. When an individual or company is deleted, the ind_co_rel record is deleted, which causes its trigger to be fired, resulting in deletions from the
address_rel and phone_rel tables. If these records are also to be deleted by the trigger that was originally fired by the deletion of the individual or company, the mutating table problem described earlier will occur. In this case, the ind_co_rel should
probably not have a DELETE trigger at all. Meaningless records in address_rel and phone_rel exist only until the corresponding individual or company is deleted. This is just one example of how cascading triggers can produce unexpected results.
Regardless of how integrity is enforced in the database, the application must have knowledge of the constraints. The application must be able to submit transaction statements in the proper order, and it must know how to respond to exceptions resulting
from integrity problems. This point is best illustrated through the use of an example. Assume that the application needs to perform a single transaction based on the objects created in Listing 20.21.
Listing 20.21. These objects participate in a single transaction.
CREATE SEQUENCE individual_ids
START WITH 1
INCREMENT BY 1
NOMAXVALUE;
CREATE TABLE individual (
ID NUMBER(10)
CONSTRAINT indiv_pk PRIMARY KEY
,last_name VARCHAR2(30)
CONSTRAINT indiv_l_name NOT NULL
,first_name VARCHAR2(30)
CONSTRAINT indiv_f_name NOT NULL
,notes VARCHAR2(255)
,date_of_birth DATE
,last_updt_user VARCHAR2(20)
CONSTRAINT indiv_lu_id NOT NULL
,last_updt_date DATE
CONSTRAINT indiv_lu_dt NOT NULL
,CONSTRAINT indiv_chk_bday
CHECK (date_of_birth BETWEEN (TO_DATE('01JAN1845',
'DDMONYYYY', 'nls_date_language = American'))
AND last_updt_date)
);
CREATE SEQUENCE company_ids
START WITH 1
INCREMENT BY 1
NOMAXVALUE;
CREATE TABLE company (
ID NUMBER(10)
CONSTRAINT company_pk PRIMARY KEY
,name VARCHAR2(30)
CONSTRAINT co_name_u UNIQUE
CONSTRAINT co_name_nn NOT NULL
,notes VARCHAR2(255)
,last_updt_user VARCHAR2(20)
CONSTRAINT co_lu_id NOT NULL
,last_updt_date DATE
CONSTRAINT co_lu_dt NOT NULL
);
CREATE TABLE ind_co_rel (
individual_id NUMBER(10)
,company_id NUMBER(10)
,title VARCHAR2(80)
,last_updt_user VARCHAR2(20)
CONSTRAINT indco_lu_id NOT NULL
EXCEPTIONS INTO ind_co_err_1
,last_updt_date DATE
CONSTRAINT indco_lu_dt NOT NULL
EXCEPTIONS INTO ind_co_err_2
,CONSTRAINT indco_ind_fk FOREIGN KEY (individual_id)
REFERENCES individual(ID)
ON DELETE CASCADE
EXCEPTIONS INTO ind_co_err_3
,CONSTRAINT indco_co_fk FOREIGN KEY (company_id)
REFERENCES company(ID)
ON DELETE CASCADE
EXCEPTIONS INTO ind_co_err_4
,CONSTRAINT ind_co_rel_pk PRIMARY KEY(individual_id, company_id)
USING INDEX
TABLESPACE oracle_unleashed
STORAGE (INITIAL 10K
NEXT 10K
MAXEXTENTS 100
PCTINCREASE 10)
EXCEPTIONS INTO ind_co_err_5
);
It would be useful if the application could insert an individual, a company, and a record relating the two in one transaction. The foreign key constraints on indiv_co_rel dictate that this record must be inserted last. In designing a process to complete
this transaction, you should also consider that the application will need to insert an individual and a company separately, as well. Listing 20.22 provides three functions to accomplish these tasks.
Listing 20.22. These three functions guarantee that integrity constraints are satisfied for INSERTS into the objects in Listing 20.21.
CREATE OR REPLACE FUNCTION insert_indiv (last CHAR, first CHAR,
notes CHAR, dob DATE)
RETURN NUMBER
IS
invalid_name EXCEPTION;
new_id NUMBER;
BEGIN
IF ((LENGTH(RTRIM(last)) > 0) AND
(LENGTH(RTRIM(first)) > 0)) THEN
SELECT individual_ids.nextval INTO new_id FROM dual;
INSERT INTO individual VALUES (new_id, last, first,
notes, dob, user, sysdate);
RETURN(new_id);
ELSE
RAISE invalid_name;
END IF;
EXCEPTION
WHEN invalid_name THEN
ROLLBACK;
RETURN(20001);
WHEN OTHERS THEN
ROLLBACK;
RETURN(1);
END insert_indiv;
CREATE OR REPLACE FUNCTION insert_company (name CHAR, notes CHAR)
RETURN NUMBER
IS
invalid_name EXCEPTION;
new_id NUMBER;
BEGIN
IF (LENGTH(RTRIM(name)) > 0) THEN
SELECT company_ids.nextval INTO new_id FROM dual;
INSERT INTO company VALUES (new_id, name, notes, user,
sysdate);
RETURN(new_id);
ELSE
RAISE invalid_name;
END IF;
EXCEPTION
WHEN invalid_name THEN
ROLLBACK;
RETURN(20001);
WHEN OTHERS THEN
ROLLBACK;
RETURN(1);
END insert_company;
CREATE OR REPLACE FUNCTION insert_ind_co(last CHAR, first CHAR,
notes CHAR, dob DATE, co_name CHAR,
co_notes CHAR, title CHAR)
RETURN NUMBER
IS
ret NUMBER;
ind_id NUMBER;
co_id NUMBER;
BEGIN
ret:=insert_indiv(last, first, notes, dob);
IF (ret > 0) THEN
ind_id:=ret;
ret:=insert_company(co_name, co_notes);
IF (ret > 0) THEN
co_id:=ret;
INSERT INTO ind_co_rel VALUES (ind_id, co_id, title,
user, sysdate);
RETURN(ind_id);
ELSE
ROLLBACK;
RETURN(ret);
END IF;
ELSE
ROLLBACK;
RETURN(ret);
END IF;
EXCEPTION
WHEN OTHERS THEN
ROLLBACK;
RETURN(1);
END insert_ind_co;
The transaction ensures that all constraints are being met, with the exception of the check constraint on individual, which could easily be added to the insert_indiv function. Sequences are used to generate primary keys, the functions check for null
values, and the foreign key constraints in ind_co_rel are enforced by the order in which inserts occur. If an error occurs in any of the transactions, a rollback is issued, but commits are left out altogether. This is to ensure that an individual or
company is not committed as part of the transaction that establishes their relationship. It is up to the calling application to commit the transaction as a whole.
These functions illustrate the point that in some cases, it may be a good solution to allow the client application to enforce integrity. Column and table constraints can be costly, especially foreign key constraints, which require an additional read. In
an environment with a high volume of transactions and users, these constraints can have a significant negative impact on performance, possibly resulting in unnecessary contention, snapshot problems, and other bottlenecks.
When using the application itself to enforce referential integrity, you can run batch processes periodically to ensure that the application is enforcing integrity. You can create temporary tables to store exceptions for further analysis and resolution,
and you can use simple SQL to identify problem records. For example, you can use a statement like the one following to identify invalid references to individuals in the ind_co_rel table.
SELECT * FROM ind_co_rel INTO temp_no_ind WHERE individual_id NOT IN
(SELECT ID FROM individual)
If the application is properly designed, batch processes should not identify exceptions and would need to be run very infrequently.
Primary key constraints, on the other hand, typically improve performance, especially if they are defined as a single small column. This constraint causes an index to be built, and primary key values are often stored as foreign keys in other tables. In
these cases, declaring the primary key usually results in faster joins.
In general, however, constraints have a negative impact on overall performance. Depending on the nature of the data, security considerations, and standards of acceptable performance, you can rely upon the application itself to enforce many of the
required integrity constraints.
Column constraints, table constraints, triggers, and PL/SQL are all useful in enforcing integrity, but each approach also has limitations. The best way to combine these elements to enforce integrity depends entirely on the application. Most designs use
a combination of all of these methods and attempt to strike a balance between pure performance and the strict enforcement of integrity.
|