Data modeling – Table Inheritance

Modeling data is challenging. Well, at least I find it hard because of database normalization and the ever-changing database schemas.

This article will cover table inheritance: what it is, where it likely came from, an example challenge, the different patterns with their pros and cons. Most of it will be theoretical, but I will do my best to include the relevant SQL for the examples.

In this article, I will refer to database tables as an Entity and a column in a table as an Attribute. The SQL snippets will also be PostgreSQL specific.

So, let's jump into it.

Meet the family

Inheritance is an Object-Oriented Programming (OOP) pattern where a class (child) acquires properties of another class (parent). The inheritance between different classes creates a hierarchy/ tree from which you can draw a relationship between the classes.

This presents an advantage – reducing redundancy redefining a parent class' attribute in the child class.

Relational databases, however, were not designed to be object-oriented – except for PostgreSQL that supports inheritance.

Object Relational Mappers (ORMs) created an abstraction over databases, allowing you to model tables in your database as you would define entities/ objects/ classes instead of writing SQL by hand. The advantage this creates is that it creates a simple way to conceptualize your database. My hunch is that table inheritance was introduced by ORMs

A visit to the bank 💰

A customer in a bank has an Account. A customer can be a person or a company. This also means a bank would support different types of accounts: CompanyAccount and a SavingsAccount – to keep this analogy simple.The 2 types of accounts will have shared attributes – id, address, name – and unique properties – company name, owners etc.

How would you model this sort of data?

One option is table inheritance. There are 3 different forms of inheritance that you can use:

  • Single Table Inheritance
  • Concrete Table Inheritance
  • Class Table Inheritance

Single Table Inheritance

Ironically, Single Table Inheritance doesn't inform any form of inheritance when modeling tables using SQL. This is a conceptual model introduced by ORMs.

You would create your parent entity Account that would contain all the shared attributes. The child entities CompanyAccount and SavingsAccount would inherit the parent entity's attributes and define their own attributes.

To differentiate the child entities a type or discriminator property is introduced in the parent entity which is a string value.

class Account {
    // define the attributes here
    type: string // "CompanyAccount" or "SavingsAccount"
}

class CompanyAccount extends Account {
    // define the attributes here
    signatories: Json
}

class SavingsAccount extends Account { 
    // define the attributes here
    signatory: string
}

However, at the database level, only one table – Account – is created. All the attributes of the child entities would be added to the Account table and marked as nullable or NULL.

CREATE TABLE "Account" (
    "id" SERIAL NOT NULL,
    "balance" DOUBLE PRECISION NOT NULL,
    "createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
    "type" TEXT NOT NULL,

    -- SavingsAccount specific field
    "signatory" TEXT,

    -- CorporateAccount specific field
    "signatories" JSONB,


    CONSTRAINT "Account_pkey" PRIMARY KEY ("id")
);

CREATE INDEX "account_type_index" on "content" ("type");

This approach would be considered okay if there are few child entities inheriting from Account.

Single Table Inheritance wouldn't scale well because:

  • Depending on the number of attributes from child entities, there would be many NULL fields.
  • Adding a new class would make this difficult to maintain.

Concrete Table Inheritance

The Concrete Table Inheritance pattern takes a slightly different approach when creating tables in the database.

The parent entity would serve as a template from which child entities would inherit. However, in this instance, the parent entity isn't created in the database.

The plus sides for this approach include:

  • Child entities will have individual tables and won't store attributes of other child entities.
  • Mandatory fields of the child entity would be enforced with the NOT NULL constraint.
  • No type column in the parent entity to determine the type of child entity

Some downsides are:

  • Modeling using SQL would be very repetitive when defining attributes from the parent entity every time you'd create a child entity
  • Searching through all child entities would require a UNION on child tables

The SQL for Concrete Table Inheritance would resemble this:

CREATE TABLE "SavingsAccount" (
    "id" SERIAL NOT NULL,
    "balance" DOUBLE PRECISION NOT NULL,
    "createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
    "signatory" TEXT NOT NULL,

    CONSTRAINT "SavingsAccount_pkey" PRIMARY KEY ("id")
);

CREATE TABLE "CorporateAccount" (
    "id" SERIAL NOT NULL,
    "balance" DOUBLE PRECISION NOT NULL,
    "createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
    "signatories" JSONB NOT NULL,

    CONSTRAINT "CorporateAccount_pkey" PRIMARY KEY ("id")
);

PostgreSQL supports the INHERITS keyword that takes care of inheritance. The only difference is that the parent entity is also created in the database

CREATE TABLE ACCOUNT (
-------- Base table properties here
);
CREATE TABLE CORPORATE_ACCOUNT (
-------- Specific table properties
) INHERITS(ACCOUNT);
CREATE TABLE SAVINGS_ACCOUNT (
-------- Specific table properties
) INHERITS(ACCOUNT);

Class Table Inheritance

Class Table inheritance resembles Classes in OOP and takes the best of both worlds.

The parent entity would have its own table and any child entity would have a 1-1 relationship with the parent. The ID of the child entity would serve as a foreign key in the parent entity.

Example Entity Relationship Diagram:

SQL

-- CreateTable
CREATE TABLE "Account" (
    "id" SERIAL NOT NULL,
    "balance" DOUBLE PRECISION NOT NULL,
    "createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
    "savingsAccountId" INTEGER NOT NULL,
    "corporateAccountId" INTEGER NOT NULL,

    CONSTRAINT "Account_pkey" PRIMARY KEY ("id")
);

-- CreateTable
CREATE TABLE "SavingsAccount" (
    "id" SERIAL NOT NULL,
    "signatory" TEXT NOT NULL,
    "accountId" INTEGER,

    CONSTRAINT "SavingsAccount_pkey" PRIMARY KEY ("id")
);

-- CreateTable
CREATE TABLE "CorporateAccount" (
    "id" SERIAL NOT NULL,
    "signatories" JSONB NOT NULL,
    "accountId" INTEGER,

    CONSTRAINT "CorporateAccount_pkey" PRIMARY KEY ("id")
);

-- CreateIndex
CREATE UNIQUE INDEX "Account_savingsAccountId_key" ON "Account"("savingsAccountId");

-- CreateIndex
CREATE UNIQUE INDEX "Account_corporateAccountId_key" ON "Account"("corporateAccountId");

The advantages of this approach are:

  • It is DRY
  • No type column in the parent entity to determine the type of child entity
  • Searching through tables is easier – no UNIONs
  • There is no risk of child entities having attributes of other child entities.

It's not all roses with this approach. Some of the disadvantages are:

  • You wouldn't be able to determine the type of a child entity without querying them. However, if you're using an ORM is a non-issue
  • It can be hard keeping the parent and child entities in sync using triggers.

Conclusion

The take away of this article is there's no one-size fits all approach when modeling entities in your database. Some have work well, and others... don't. What's important is doing the appropriate research in your domain and modeling data from that. Even this won't be enough as you will have to iterate on your database schema over and over again.

You can read more in Patterns of Enterprise Application Architecture by Martin Fowler.

21