Wednesday, 30 March 2011

DBMS: DK/NF (Domain/Key Normal Form)

Unlike previously defined normal forms, DK/NF is not defined in terms of traditional dependencies (functional, multivalued, or join). It was introduced by Ron Fagin in his paper "A Normal Form for Relational Databases that Is Based on Domains and Keys," ACM TODS 6, No. 3 (September 1981). Instead, it is defined in terms of the more primitive concepts of domain and key, along with the general concept of a “constraint.”
A domain constraint specifies the permissible values for a given attribute, while a key constraint specifies the attributes that uniquely identify a row in a given table. That is, A domain constraint--better called an attribute constraint--is simply a constraint to the effect a given attribute A of R takes its values from some given domain D. A key constraint is simply a constraint to the effect that a given set A, B, ..., C of R constitutes a key for R. To be specific, enforcing domain constraints just means checking that attribute values are always values from the applicable domain (i.e., values of the right type); enforcing key constraints just means checking that key values are unique.
By definition it is a normal form used in database normalization which requires that the database contains no constraints other than domain constraints and key constraints. That is, a relation schema is in DK/NF if every constraint can be inferred by simply knowing the set of attribute names and their underlying domains, along with the set of keys.
However, successfully building a domain/key normal form database remains a difficult task, even for experienced database programmers. Thus, while the domain/key normal form eliminates the problems found in most databases, it tends to be the most costly normal form to achieve. However, failing to achieve the domain/key normal form may carry long-term, hidden costs due to anomalies which appear in databases adhering only to lower normal forms over time. After transforming a database into DK/NF there would be no insertion or deletion anomalies.
The trouble is, lots of relvars aren't in DK/NF in the first place. For example, suppose there's a constraint on R to the effect that R must contain at least ten tuples. Then that constraint is certainly not a consequence of the domain and key constraints that apply to R, and so R isn't in DK/NF. The sad fact is, not all relvars can be reduced to DK/NF; nor do we know the answer to the question "Exactly when can a relvar be so reduced?".
Now, it's true that Fagin proves in his paper that if relvar R is in DK/NF, then R is automatically in 5NF (and hence 4NF, BCNF, etc.) as well. However, it's wrong to think of DK/NF as another step in the progression from 1NF to 2NF to ... to 5NF, because 5NF is always achievable, but DK/NF is not.
It's also wrong to say that there are "no normal forms higher than DK/NF." In CODD's recent work -- documented in the book TEMPORAL DATA AND THE RELATIONAL MODEL, he has come up with a new sixth normal form, 6NF. 6NF is higher than 5NF (all 6NF relvars are in 5NF, but the converse isn't true); moreover, 6NF is always achievable, but it isn't implied by DK/NF. In other words, there are relvars in DK/NF that aren't in 6NF.
Sometimes it is asked: "If a [relvar] has an atomic primary key and is in 3NF, is it automatically in DK/NF?" No. If the EMP relvar just shown is subject to the constraint that there must be at least ten employees, then EMP is in 3NF (and in fact 5NF) but not DK/NF.

For further study:

No comments:

Post a Comment