Relational data models

What

The relational data model was invented in 1969 at the IBM research department, with the lead researcher Edgar R. Codd. They created the 12 rules (13) to manifest the requirements for a data management system.

Rule 1: The information rule:
All information in a relational data base is represented explicitly at the logical level and in exactly one way – by values in tables.

Why

Relational data models are important as they help us with

standardisation of data models
flexibility in adding and altering tables
data integrity
structured query language can be used
simplicity - the tabular format
intuitive organisation - spreadsheets

When

small data volumes
secondary indices needed
Analytics needed
Aggregations needed
ACID transactions needed
Flexibility for writing queries: In relational data models we are modelling data and not queries.
easy to change, when business requirements change

Structure

Why

free database from unwanted insertions, updates and deletion dependencies
reduce the need of refactoring the database as new types of data are introduced
make the data more informative for users
neutral to quer statistics. We don't want to design for particular queries

Normalisation

Normalisation is needed to reduce data redundancy and to increase data integrity. The overall goal here is to have updates only in one place.

1NF - First Normal Form

Atomic values: each cell contains unique and single values
Be able to add data without altering tables
Separate different relations into different tables
Keep relations between tables together with foreign keys

2NF - Second Normal Form

1NF is reached
All columns in a table rely on the primary key

3NF - Third Normal Form

2NF is reached
No transitive dependencies

There are up to 6 normal forms. But most of the time we will need only up to 3.

Denormalisation

When everything is normalised, we need Joins to gather the data. The more Joins you need, the slower the query will be. Denormalisation must be done to increase performance.

Tables

Fact table

Consists of the measurements, metrics of facts of business process. EVENTS THAT HAVE ACTUALLY HAPPENED - like a transaction

Dimension table

A structure that categorises facts and measures in order to enable users to answer business questions.
Dimensions can be for example people, products, place or time.