Friday, 18 March 2011

DBMS: Candidate Key Determination for a Relation

How to find Cadidate Keys of a Relation using its FDs

In this lession JBS will provide you the way to find candidate keys for a relation...
A candidate key uniquely identifies each row in a relational table. It is a minimal super key. To find a candidate key for a relation, we can use FDs to find a minimal set of attributes in the relation which functionally determines (uniquely identifies) all the attributes in the relation. An attribute always functionally determines itself - trivial but true.

Elmasri & Navathe's Algorithm
According to Elmasri and Navathe, we know how to determine X+, the Closure of X under F where X is a subset of the attibutes in a given relation, and F is a set of functional dependencies specified for the given relation. This algorithm also allows us to find candidate keys using FDs.
We know that, if the closure of a set of attributes, X, in a relation is equal to the set of all the attributes in the relation, we know that X is a superkey of the relation. Moreover that, if we cannot find a proper subset of X such that the closure of the proper subset is equal to the set of all the attributes in the relation, we know that X is a candidate key of the relation. The algo is as follows:

Algorithm : Determining X+, the Closure of X under F
X+ := X;
repeat
oldX+ := X+;
for each functional dependency Y → Z in F do
if Y is a subset of X then X+ := X+ UNION Z;
until (X+ = oldX+);
End: { Determining X+, the Closure of X under F }

Example of using the above mentioned Algorithm:
Problem: We are given a relation R(A,B,C,D,E) and the set of its functional dependencies (FDs) F = {A -> B, AC -> D, B -> E} now to find the candidate key(s) of R.

Solution: You con apply the above algorithm to each and every subset of R, but According to JBS, a *good* place to start looking for candidate keys is amongst the determinants in F.
A determinate is the LHS (Left Hand Side) of an FD. If the closure of a determinant is equivalent to R (if it is not a proper subset of R) we know that the determinant is a superkey of R and may therefore be a candidate key of R.

Using this Algo. , first we try to find whether {A} is a superkey?


X+ := {A}

oldX+:= {A}
for A -> B,
A is a subset of X+
so B is added to X+
for AC -> D,
AC is not a subset of X+
for B -> E,
B is a subset of X+
so E is added to X+
oldX+ != X+ repeat

oldX+:= {A,B,E}
for A -> B,
A is a subset of X+, B is already in X+
for AC -> D,
AC is not a subset of X+
for B -> E,
B is a subset of X+, E is already in X+
X+ = oldX+ stop
  X+ is {A}




X+ is now {A,B}

X+ is still {A,B}


X+ is now {A,B,E}




X+ is still {A,B,E}

X+ is still {A,B,E}

X+ is still {A,B,E}


The closure of {A} over F is {A,B,E} which is a proper subset of R. Therefore, {A} is not a superkey of R and cannot be a candidate key for R.

Using this Algo. , now we try to find whether {A, C} is a superkey or not?


X+ := AC

oldX+:= {A,C}
for A -> B,
A is a subset of X+
so B is added to X+
for AC -> D,
AC is a subset of X+
so D is added to X+
for B -> E,
B is a subset of X+
so E is added to X+
oldX+ != X+ repeat

oldX+:= {A,C,B,D,E}
for A -> B,
A is a subset of X+, B is already
in X+
for AC -> D,
AC is not a subset of X+
for B -> E,
B is a subset of X+, E is already
in X+
X+ = oldX+ stop

X+ is {A,C}




X+ is now {A,C,B}


X+ is now {A,C,B,D}


X+ is now {A,C,B,D,E}





X+ is still {A,C,B,D,E}

X+ is still {A,C,B,D,E}


X+ is still {A,C,B,D,E}



The closure of {A,C} over F is {A,B,C,D,E} which is not a proper subset of R. {A,C} is a superkey for R and may be a candidate key for R.

Using the same Algo. , now we try to find whether {B} is a superkey?


X+ := B

oldX+:= {B}
for A -> B,
A is a not a subset of X+
for AC -> D,
AC is not a subset of X+
for B -> E,
B is a subset of X+
so E is added to X+
oldX+ != X+ repeat

oldX+:= {B,E}
for A -> B,
A is a subset of X+
for AC -> D,
AC is not a subset of X+
for B -> E,
B is a subset of X+,
E is already in X+
X+ = oldX+ stop

X+ is {B}



X+ is still {B}

X+ is still {B}


X+ is now {B,E}




X+ is still {B,E}

X+ is still {B,E}


X+ is still {B,E}



The closure of {B} over F is {B,E} which is a proper subset of R. Therefore, {B} is not a superkey of R and cannot be a candidate key for R.

Therefore we have found a single super key {A, C} for R using this algorithm.

Connolly & Begg define a candidate key as, A superkey such that no proper subset is a superkey within the relation.
So, We require to check the proper subsets of the superkey to verify if any of these subsets are superkeys.
The proper subsets of {A,C} are: the null set, {A}, {C}
According to JBS a candidate key, which may be chosen to be a primary key, cannot be null.
We have already discovered above that {A} is not a superkey for R. So we need to consider only {C} as a possible superkey.

Using the algorithm we will try to determine whether {C} is a superkey or not?


X+ := C

oldX+:= {C}
for A -> B,
A is a not a subset of X+
for AC -> D,
AC is not a subset of X+
for B -> E,
B is not a subset of X+
oldX+ = X+ stop

X+ is {C}



X+ is still {C}

X+ is still {C}

X+ is still {C}



The closure of {C} over F is {C} which is a proper subset of R. Therefore, {C} is not a superkey of R and cannot be a candidate key for R.

We have discovered that no proper subsets of {A,C} are superkeys of R, so {A,C} is a candidate key for R.

Are there any other candidate keys for R?
We could use the above algorithm for D and E, but I am exhausted. JBS know that any attribute which is in none of the determinants in F cannot be a candidate key. If you would like to see an exhaustive application of this algorithm, you can find all the subsets of R and apply the algorithm to each subset.

We have found only one candidate key {A,C} which I must select as the primary key for R.

Therefore, the relational schema for R is R (A, C, [pk] B, D, E)

No comments:

Post a Comment