Normalization in database management system

I have a relation in a database:

Emp_project(SSN,PNum,Hours,Ename,PName,Plocs)

I have been asked to normalize the relation Emp_project.

How do I normalize it?


Solution 1:

A relation has an associated set of columns of any type and an associated set of rows. There is one value per column per row.

Sometimes "normalize" is used to mean (1) "decompose each relation to smaller relations that are its projections and that join back to it". This is normalization to higher NFs (normal forms) than 1NF. This involves FDs (functional dependencies) and JDs (join dependencies). Sometimes the meaning requires input in some "NF". Sometimes the meaning includes putting into "1NF" before decomposition. But decomposition only needs relations.

Sometimes "normalize" is used to mean (2) "convert from relations that have relation-valued columns to relations that don't". (This was its original meaning.) After (1) was developed this use of "normalize" became "put into 1NF".

Ubiquitously "normalize" is (wrongly and vaguely) used to mean (3) "convert from a relation with columns with some types that have multiple parts to relations with columns with the types of the parts". The undesired columns are called "non-atomic", "composite", etc. The meaning might involve parts of the same type and/or of different types. (This is a misinterpretation of (2).)

Sometimes "relation" means relation. Sometimes it (forgiveably) means relation with no relation-valued columns. Sometimes it (wrongly and vaguely) means a relation with no "non-atomic" columns. Sometimes "in 0NF" or "in UNF" means being some some kind of non-relation. Sometimes "in 0NF" or "in UNF" or "in 1NF" means "is a relation".

Sometimes "normalize" is (wrongly) used to mean (4) "convert from non-relations to relations", where the relations might also be in "0NF" or "UNF" or "1NF". (And usually what the non-relational data structure is supposed to mean in terms of relations is not explained, so "normalize" doesn't actually get defined.) Sometimes presentations of (3) write as though there were multiple values per column per row, but that cannot be, since a relation by definition has one value per column per row, so they are confused about whether they are in (3) or (4).

Sometimes "normalize" is (wrongly) used to mean (5) "convert from a relation with some columns plus one additional column per value in a range to a relation with only one additional column but with each row of the original replaced by a row for each dropped column". (This is a misinterpretation of (2).)

Sometimes (2), (3) or (5) are (wrongly) called "getting rid of a repeating group". (But "repeating group" is a vague pre-relational non-relational notion.)

Sometimes "normalize" is (wrongly) used to mean (6) "convert from a relation with columns with certain values to a relation with columns with ids replacing the values plus some relation(s) associating ids with those values".

"Normalize" also has other generic & specific uses both applicable to and outside of database design.


So, what do you mean, "normalize"? In this question one of the columns, Plocs, seems to be "project locations". So maybe "normalize" is used in sense 3, and maybe some others. You have to look at what you were taught and what references you were given.