Normalized and denormalized databases

08 September, 2020

Normalizing and denormalizing are two types of treatment we apply to data to structure it. The golden source of the data could sit upstream, in multiple applications whose data is collected into a database.

Normalized databases reduce redundancy and improve consistency by splitting data across multiple related tables. A row in one table can refer to a row in another table where additional data is stored.

Melissa's personal info could be stored in one table, while details of her insurance policies could be stored in other tables. In the insurance policies table, each of her rows would be tagged with her unique identifier, alongside rows belonging to other policy holders.
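The Melissa example can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names, and sample rows (including the second policy holder, Sam) are made up and not taken from any real schema.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: personal info lives in one table, policies in another.
# Each policy row is tagged with the holder's unique identifier.
conn.executescript("""
    CREATE TABLE policy_holders (
        holder_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        dob       TEXT NOT NULL
    );
    CREATE TABLE policies (
        policy_id   INTEGER PRIMARY KEY,
        holder_id   INTEGER NOT NULL REFERENCES policy_holders(holder_id),
        policy_name TEXT NOT NULL
    );
    INSERT INTO policy_holders VALUES
        (1, 'Melissa', '1990-01-15'),
        (2, 'Sam',     '1985-06-30');
    INSERT INTO policies VALUES
        (10, 1, 'Home'),
        (11, 1, 'Auto'),
        (12, 2, 'Auto');
""")


def policies_for(name):
    """Return the policy names for one holder by joining the two tables."""
    rows = conn.execute(
        """
        SELECT p.policy_name
        FROM policies AS p
        JOIN policy_holders AS h ON h.holder_id = p.holder_id
        WHERE h.name = ?
        ORDER BY p.policy_id
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]


melissa_policies = policies_for("Melissa")
```

Melissa's name and date of birth are stored exactly once, so a change to her personal info touches a single row. The cost is the join needed to read her policies.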

Relational databases best serve normalized data.

Denormalized databases store all information about an entity in a single table (or in fewer tables). The data is redundant, but it is easier to retrieve.

Mike's personal info (name, DOB, address) and all the insurance policies he has are stored in the same table. We don't have to look elsewhere because everything we need to know about Mike is in his document (borrowing the term from Mongo). The full set of insurance policy details would also be repeated for other policy holders who had the same policy as Mike, making it redundant.
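The Mike example can be sketched with plain Python dictionaries standing in for denormalized records. Everything here is hypothetical: the field names, the second policy holder (Jane), and the premium values are invented for illustration, and no real database is involved.

```python
import copy

# Hypothetical policy details, copied in full into every holder that has it.
auto_policy = {"policy_name": "Auto", "premium": 500, "coverage": "Third party"}

# Each record holds everything about one policy holder, policies included.
holders = [
    {
        "name": "Mike",
        "dob": "1988-03-02",
        "address": "1 Example St",
        "policies": [copy.deepcopy(auto_policy)],
    },
    {
        "name": "Jane",
        "dob": "1992-11-20",
        "address": "2 Example St",
        "policies": [copy.deepcopy(auto_policy)],
    },
]


def find_holder(name):
    """A single lookup returns the holder together with all policy details."""
    return next(h for h in holders if h["name"] == name)


def update_premium(policy_name, new_premium):
    """Change a premium; every redundant copy has to be found and updated."""
    updated = 0
    for holder in holders:
        for policy in holder["policies"]:
            if policy["policy_name"] == policy_name:
                policy["premium"] = new_premium
                updated += 1
    return updated


mike = find_holder("Mike")
copies_updated = update_premium("Auto", 550)
```

Reading is simple because there is no join: find_holder gives back everything about Mike at once. Writing is where the redundancy shows, since one premium change has to be applied to every holder's copy of the policy.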

Non-relational databases best serve denormalized data.

Denormalizing data doesn't seem to be the norm anymore, even when non-relational databases like MongoDB are used (Reference). Denormalized data is harder to scale, manage, and synchronize.

Even if denormalizing needs to be done to optimize performance, people usually normalize the data first (Reference).
