A process of analyzing a relation to ensure that it is well formed
Normalization
Involves decomposing relations to produce smaller, well-structured relations
A formal process for deciding which attributes should be grouped together in a relation so that all anomalies are removed
If a relation is normalized (well-formed), rows can be inserted, deleted, or modified without creating anomalies
Goals of normalization
Minimize data redundancy thereby conserving space and avoiding anomalies
Make it easier to maintain data
Provide a better design that is an improved representation of the real world
Normalization
A logical data-modelling technique used to ensure that data are well structured from an organization-wide view
Modification Anomalies
Tables that are not normalized are susceptible to experiencing modification anomalies
Insertion Anomaly
Occurs when certain attributes cannot be inserted into the database without the presence of other attributes
Update Anomaly
Exists when one or more instances of duplicated data is updated, but not all
Deletion Anomaly
Exists when certain attributes are lost because of the deletion of other attributes
Most modification problems are solved by breaking an existing table into two or more tables through a process known as normalization
Functional dependency
You can also describe this as a relationship where knowing the value of one attribute (or a set of attributes) is enough to tell you the value of another attribute (or set of attributes) in the same table
Bank Account No. -> Customer Name, Amount of Deposit
Partial functional dependency
When a non-key attribute is functionally dependent on part (but not all) of the primary key
Transitive dependency
A functional dependency between the primary key and one or more nonkey attributes that are dependent on the primary key via another nonkey attribute
Transitive dependency means if we have a primary key A and a non-key domain B and C where C is more dependent on B than A and B is directly dependent on A, then C can be considered transitively dependent on A
Transitive dependency means if we have a primary key DeptID and a non-key domain MgrID and MgrName where MgrName is more dependent on MgrID than DeptID and MgrID is directly dependent on DeptID, then MgrName can be considered transitively dependent on DeptID
Steps in Normalization
First Normal Form - Any multivalued attribute (also called repeating groups) have been removed
Second Normal Form - Any partial functional dependencies have been removed
Third Normal Form - Any transitive dependencies have been removed
A table not in 1st Normal Form has multivalued attributes
A table in 1st Normal Form has no multivalued attributes
Going to 2nd Normal Form means removing partial dependencies
Going to 3rd Normal Form means removing transitive dependencies
Surrogate key
Used as primary key to simplify key structures like when composite key is too long or when the primary key (that can be used) is inefficient i.e. it is too long or cannot be guaranteed to be unique over time (e.g. name)
Flight Number + Flight Date could be assigned a Flight Code which can instead serve as the primary key
UNF table has multivalued attributes
1NF table has no multivalued attributes
3NF table has no partial or transitive dependencies