process used to organize relational model database into tables and columns
Creating Tables Normalization ensures:
dividing large table into smaller
rules
relationships
eliminates redundancy
reduce inconsistent dependency
History of Normalization
introduced in 1969 by Edgar Codd
Normalization is part of the Relational Model
IBM databases were the first ones to utilize IBM DB2
Utilized on mainframe computers in 1980s
transition in the 90s
Codd devised set of rules for Relational Database Management System (DBMS)
Codd's Rule
Information Representation
Guaranteed Access
Systemic Treatment of Null Values
Database Description rule
comprehensive data sub-language
View updating
high-level update, insert, delete
physical data independence
logical data independence
Distribution rule
Non-Subversion
Integrity rule
First Normal Form (1NF)
basic rules for an organized database
contains only single-column values
no repeating groups
Second Normal Form
It is in first normal form
all non-key attributes (columns) must be dependent on the primary key
Third Normal Form
It is in second normal form
no dependencies between attributes (columns) in the body of the table
Anomaly
updating the values of one fact requires multiple changes of database
inserting one fact in the database requires knowledge of other facts unrelated to the fact being inserted
deleting one fact from the database causes loss of other unrelated data
Codd's theorem
states that Relational Algebra (how to retrieve) and the domain-independent Relational Calculus (what to retrieve) queries, two well-known foundational query languages for the relational model,are precisely equivalent in expressive power:
Relational Algebra is a procedural language
Relational Calculus is a declarative language
relational operations: select
extractsrows from table
only want specific data
must meet criteria
Relational Operations: Union
adds row from one table to another
must have same # of columns
Selects only distinctvalues by default
to allow duplicate, use unionall
Relational Operations: intersect
combining data from two tables into a third
only contains rows that are common to both
order of table is unimportant
Relational Operations: DIfference
Subtract rows in one table from those in another
rows which appears in the first but not in the second
must be union-compatible
Relational Operations: Product
Multiplies the rows in two tables
contains all the rows in the first and added to the rows in the second
Relation operations: Division
extracts rows and columns from one table based on the data in the second
only returns columns that don't exist in the second table