A set of tables which data items are organised and related to each other via their primary and foreign keys.
What are advantages of database normalisation?
Reduced data redundancy.
Increased data integrity.
Splits entities into smaller numbers of entities, so easier for sorting, indexing and searching
What is data integrity?
The correctness of data over the lifetime of that data.
What is entity integrity?
That each record in a table must have a unique identifier.
What is referential integrity?
It is a database concept that ensures that relationships between databases, i.e. using foreign keys to point to the primary key of another table.
What is domain integrity?
All attributes in the database are related to the overall domain that the database is working on. (Ensures data is correct - sort of a validation thing)
What is data redundancy?
It is wasted or duplicated data that is held in a database.
How can data redundancy cause inaccuracy?
Duplication of data can lead to inconsistencies and errors when updates or changes are made to one copy of the data but not to others.
What is normalisation?
The process of converting a flat file database to a relational database.
What is an index?
A data structure used to shorten the length of time to search a database by having a search filter (basically).
What is first normal form?
That every attribute (column) contains just one data item and that all attributes are dependent on the primary key.
What is second normal form?
Same as first normal form but contains no partial dependencies. (This means data does not need to be in a table because it can be found through another primary key to it).
What is third normal form?
Same as second normal form but contains no transitive dependencies. (Give each ID its own table).
What are the main database views and access rights?
Search (user can run queries).
Alter (user can alter the structure of the database).
Update (user can edit records).
Delete (User can delete records).
Drop (User can delete a table and everything in it).
What is data validation?
Process of ensuring that data entered into a system is reasonable.
What is data verification?
Process of checking the accuracy and consistency of data.
What is a database management system?
Software that manages and organizes data in a structured manner, using SQL queries for example.
What is a data dictionary?
Metadata of the names of all tables and their attributes; all the stuff you do in unit 2.
What is big data?
Large volumes of data. The attributes are the volume, velocity and variety of formats the data takes.
What is data mining?
The process of finding new patterns in large data sets.
How can data mining be used in fraud detection?
Pattern recognition and searching for anomalous data.
What are the requirements of data mining?
High-specification processors and complex algorithms.
Predictive analytics tries to predict the likelihood of an action or event happening.
An advantage of data warehousing?
Acting as an archiving system.
What is a distributed system?
A system split over multiple servers or computers.
What are the advantages of distributed systems?
Faster processing.
Reduces server redundancy.
Improves disaster recovery.
Access to entire databases can be limited.
If one database goes down, other areas can keep working.
How are distributed databases updated?
Two phases occur:
First phase - all databases are asked if they are ready to update -> update is perfomed up until the point where they commit or finalise the update.
Second phase, known as the commit - a message is sent to all databases to commit and asks for acknowledgement.
What is the purpose of query language?
To retrieve and manipulate data from a database.
How do you create a table in SQL?
CREATETABLE tableName (
AttributeName DataType Validation
)
How do you return a column or all data from a table in SQL?
SELECT ColumnName FROM TableName
* if all columns
How to return a specific record in SQL?
SELECT * FROM TableName WHERE ColumnName = DataItem