Abhay Singh Kathayat

Posted on Dec 18

Understanding Database Normalization: Ensuring Efficient and Consistent Data Storage

#sql #database #databasenormalization #mysql

What is Normalization in Databases?

Normalization is the process of organizing data in a relational database to reduce redundancy and undesirable dependencies between columns. It works by dividing large tables into smaller ones and defining relationships between them. The primary aim of normalization is to protect data integrity and minimize insertion, update, and deletion anomalies.

Objectives of Normalization

Eliminate Redundancy:

Avoid storing the same data in more than one place, which saves storage space and prevents inconsistencies.

Ensure Data Integrity:

When each fact is stored once and linked through defined relationships, the data stays accurate, consistent, and reliable.

Minimize Anomalies:

Reducing redundancy helps to prevent problems like:
- Insertion anomaly: A fact cannot be recorded because other, unrelated data is not yet available.
- Update anomaly: A value stored in several rows is changed in only some of them, leaving the data inconsistent.
- Deletion anomaly: Deleting a record unintentionally removes other facts stored in the same row.

Optimize Queries:

Normalized tables are smaller and organized around logical relationships, which can make updates and targeted lookups more efficient, although reading across tables requires joins.
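To make the anomalies above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table orders_flat, its columns, and the city values are hypothetical and exist only for this illustration.

```python
import sqlite3

# Hypothetical unnormalized table: the customer's city is repeated on every order row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_flat (order_id INTEGER, customer TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?)",
    [(1, "John", "Delhi"), (2, "John", "Delhi"), (3, "Jane", "Pune")],
)

# Update anomaly: only one of John's rows is changed, so the copies now disagree.
conn.execute("UPDATE orders_flat SET city = 'Mumbai' WHERE order_id = 1")
cities_for_john = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT city FROM orders_flat WHERE customer = 'John'"
    )
)
print(cities_for_john)  # the same customer now has two different cities

# Deletion anomaly: removing Jane's only order also removes what we knew about Jane.
conn.execute("DELETE FROM orders_flat WHERE order_id = 3")
jane_rows = conn.execute(
    "SELECT COUNT(*) FROM orders_flat WHERE customer = 'Jane'"
).fetchone()[0]
print(jane_rows)
```

Storing each customer's city once, in its own table, would rule out both problems by construction.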

Normal Forms

Normalization is done in steps, known as normal forms. Each normal form has specific rules that must be followed to progress to the next level of normalization. The main normal forms are:

1. First Normal Form (1NF)

Rule:

A table is in 1NF if:
- Each column contains only atomic (indivisible) values.
- Each column contains values of a single type.
- Each record is unique.

Example:

Before 1NF (Repeating Groups):

OrderID	Product	Quantity
1	Apple, Banana	2, 3
2	Orange	1

After 1NF:

OrderID	Product	Quantity
1	Apple	2
1	Banana	3
2	Orange	1
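The conversion shown in the two tables above can be sketched in a few lines of Python. The function name to_1nf is an assumption made for this example, and the sketch assumes the repeating groups are stored as comma-separated strings, as in the "Before 1NF" table.

```python
# Rows as they appear in the "Before 1NF" table: (OrderID, Product, Quantity).
before_1nf = [
    (1, "Apple, Banana", "2, 3"),
    (2, "Orange", "1"),
]


def to_1nf(rows):
    """Split comma-separated product/quantity lists into one atomic row each."""
    result = []
    for order_id, products, quantities in rows:
        names = [p.strip() for p in products.split(",")]
        counts = [int(q) for q in quantities.split(",")]
        if len(names) != len(counts):
            raise ValueError(f"order {order_id}: product/quantity count mismatch")
        result.extend((order_id, name, count) for name, count in zip(names, counts))
    return result


after_1nf = to_1nf(before_1nf)
for row in after_1nf:
    print(row)
```

Each output row holds exactly one product and one quantity, so the pair (OrderID, Product) can serve as a key.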

2. Second Normal Form (2NF)

Rule:

A table is in 2NF if:
- It is in 1NF.
- All non-key columns are fully dependent on the primary key.

Note:

2NF eliminates partial dependencies: every non-key column must depend on the entire primary key, not just a part of it. This only matters when the primary key is composite.

Example:

Before 2NF:

OrderID	Product	CustomerName	Price
1	Apple	John	10
1	Banana	John	5
2	Orange	Jane	8

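Here, the primary key is (OrderID, Product). CustomerName depends only on OrderID, which is just one part of that key, so it is a partial dependency. Price is treated here as the price charged for that product on that order, so it depends on the whole key.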

After 2NF:
Tables:

Orders (OrderID, CustomerName)
OrderDetails (OrderID, Product, Price)

Orders table:

OrderID	CustomerName
1	John
2	Jane

OrderDetails table:

OrderID	Product	Price
1	Apple	10
1	Banana	5
2	Orange	8
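One possible way to express this 2NF design as an actual schema is sketched below with Python's sqlite3 module. The column types, NOT NULL constraints, and foreign key are assumptions added for the example; the table and column names come from the tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Orders (
    OrderID      INTEGER PRIMARY KEY,
    CustomerName TEXT NOT NULL
);
CREATE TABLE OrderDetails (
    OrderID INTEGER NOT NULL REFERENCES Orders(OrderID),
    Product TEXT    NOT NULL,
    Price   INTEGER NOT NULL,
    PRIMARY KEY (OrderID, Product)  -- composite key: one row per product per order
);
""")

conn.executemany("INSERT INTO Orders VALUES (?, ?)", [(1, "John"), (2, "Jane")])
conn.executemany(
    "INSERT INTO OrderDetails VALUES (?, ?, ?)",
    [(1, "Apple", 10), (1, "Banana", 5), (2, "Orange", 8)],
)

# A join rebuilds the original wide table, so no information was lost by splitting.
rebuilt = conn.execute("""
    SELECT d.OrderID, d.Product, o.CustomerName, d.Price
    FROM OrderDetails AS d
    JOIN Orders AS o ON o.OrderID = d.OrderID
    ORDER BY d.OrderID, d.Product
""").fetchall()
for row in rebuilt:
    print(row)
```

The customer name now lives in exactly one row per order, while the join still reproduces the "Before 2NF" view when it is needed.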

3. Third Normal Form (3NF)

Rule:

A table is in 3NF if:
- It is in 2NF.
- There are no transitive dependencies: a non-key column must not depend on another non-key column.

Example:

Before 3NF:

OrderID	Product	Category	Supplier
1	Apple	Fruit	XYZ
2	Carrot	Vegetable	ABC

Here, Supplier depends on Category, a non-key column, rather than directly on the key OrderID. The chain OrderID → Category → Supplier is a transitive dependency.

After 3NF:
Tables:

Orders (OrderID, Product, Category)
Category (Category, Supplier)

Orders table:

OrderID	Product	Category
1	Apple	Fruit
2	Carrot	Vegetable

Category table:

Category	Supplier
Fruit	XYZ
Vegetable	ABC
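The practical payoff of this split can be sketched with sqlite3: once Supplier lives in the Category table, changing it is a single-row update. The column types and constraints are assumptions, and the supplier name 'NewCo' is a made-up value for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Category (
    Category TEXT PRIMARY KEY,
    Supplier TEXT NOT NULL
);
CREATE TABLE Orders (
    OrderID  INTEGER PRIMARY KEY,
    Product  TEXT NOT NULL,
    Category TEXT NOT NULL REFERENCES Category(Category)
);
INSERT INTO Category VALUES ('Fruit', 'XYZ'), ('Vegetable', 'ABC');
INSERT INTO Orders VALUES (1, 'Apple', 'Fruit'), (2, 'Carrot', 'Vegetable');
""")

# Changing a category's supplier touches exactly one row, however many orders exist.
cursor = conn.execute("UPDATE Category SET Supplier = 'NewCo' WHERE Category = 'Fruit'")
rows_changed = cursor.rowcount

# Every order in that category sees the new supplier through the join.
suppliers = conn.execute("""
    SELECT o.OrderID, o.Product, c.Supplier
    FROM Orders AS o
    JOIN Category AS c ON c.Category = o.Category
    ORDER BY o.OrderID
""").fetchall()
print(rows_changed, suppliers)
```

In the "Before 3NF" table the same change would have to be repeated on every order row in the category, which is where update anomalies come from.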

4. Boyce-Codd Normal Form (BCNF)

Rule:

A table is in BCNF if:
- It is in 3NF.
- Every determinant (a column, or set of columns, that determines another column) is a candidate key.

Example:

Before BCNF:

CourseID	Instructor	Room
101	Dr. Smith	A1
101	Dr. Johnson	A2
102	Dr. Smith	A1

In this case, each instructor always teaches in the same room, so Instructor determines Room. The key of the table is (CourseID, Instructor), so Instructor on its own is not a candidate key. To move to BCNF, we separate the relationship between instructors and rooms.

After BCNF:
Tables:

Courses (CourseID, Instructor)
Rooms (Instructor, Room)

Courses table:

CourseID	Instructor
101	Dr. Smith
101	Dr. Johnson
102	Dr. Smith

Rooms table:

Instructor	Room
Dr. Smith	A1
Dr. Johnson	A2
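The BCNF condition can be explored with a small helper that tests whether a functional dependency holds in a set of rows. This is only a sketch: the function name holds and the sample rows are hypothetical (they assume each instructor always uses the same room), and a check on sample data can only show that a dependency is not contradicted, since real dependencies are rules of the business domain.

```python
def holds(rows, determinant, dependent):
    """Return True if `determinant -> dependent` holds for every row in `rows`."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = tuple(row[col] for col in dependent)
        # The first value seen for a key is remembered; any later mismatch breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows in which each instructor always uses the same room.
schedule = [
    {"CourseID": 101, "Instructor": "Dr. Smith", "Room": "A1"},
    {"CourseID": 101, "Instructor": "Dr. Johnson", "Room": "A2"},
    {"CourseID": 102, "Instructor": "Dr. Smith", "Room": "A1"},
]

# Instructor determines Room ...
instructor_determines_room = holds(schedule, ["Instructor"], ["Room"])
# ... but Instructor does not determine the whole row, so it is not a key.
instructor_is_key = holds(schedule, ["Instructor"], ["CourseID", "Room"])

print(instructor_determines_room, instructor_is_key)
```

A determinant that is not a key is exactly what BCNF forbids, which is why the instructor-to-room relationship gets its own table.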

Benefits of Normalization

Reduces Data Redundancy:

Each fact is stored once, which avoids repetition and wasted storage space.

Prevents Data Anomalies:

Normalization helps maintain consistency in data by preventing errors during updates, inserts, or deletes.

Improves Query Performance:

Smaller, well-organized tables can speed up writes and targeted lookups because less data needs to be processed, although queries that combine several tables pay the cost of joins.

Data Integrity:

Ensures the accuracy and reliability of the data through defined relationships.

When to Denormalize?

While normalization improves data integrity, sometimes denormalization is done for performance reasons. Denormalization is the process of combining tables to reduce the number of joins and improve query performance, particularly in read-heavy applications. However, this can lead to data redundancy and anomalies, so it should be used judiciously.
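As a rough illustration of that trade-off, the sketch below builds a denormalized summary table from the Orders and OrderDetails tables used earlier. The OrderSummary table, its Total column, and the extra order line are hypothetical, and this is just one of several ways to denormalize.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerName TEXT NOT NULL);
CREATE TABLE OrderDetails (
    OrderID INTEGER NOT NULL REFERENCES Orders(OrderID),
    Product TEXT    NOT NULL,
    Price   INTEGER NOT NULL,
    PRIMARY KEY (OrderID, Product)
);
INSERT INTO Orders VALUES (1, 'John'), (2, 'Jane');
INSERT INTO OrderDetails VALUES (1, 'Apple', 10), (1, 'Banana', 5), (2, 'Orange', 8);

-- Hypothetical denormalized copy for read-heavy reporting: one row per order,
-- with the customer name and a precomputed total stored redundantly.
CREATE TABLE OrderSummary AS
SELECT o.OrderID, o.CustomerName, SUM(d.Price) AS Total
FROM Orders AS o
JOIN OrderDetails AS d ON d.OrderID = o.OrderID
GROUP BY o.OrderID, o.CustomerName;
""")

# Reads no longer need a join or an aggregate.
summary = conn.execute(
    "SELECT OrderID, CustomerName, Total FROM OrderSummary ORDER BY OrderID"
).fetchall()
print(summary)

# The redundancy has a cost: the summary goes stale when the base tables change.
conn.execute("INSERT INTO OrderDetails VALUES (2, 'Apple', 10)")
stale_total = conn.execute(
    "SELECT Total FROM OrderSummary WHERE OrderID = 2"
).fetchone()[0]
print(stale_total)  # still the old total until the summary is refreshed
```

Any denormalized copy like this needs a refresh strategy (for example, rebuilding it on a schedule or updating it alongside every write), which is the extra maintenance the paragraph above warns about.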

Conclusion

Normalization is a key concept in database design aimed at organizing data to minimize redundancy and improve data integrity. By breaking down large tables into smaller, related ones, normalization ensures efficient storage and data consistency. While the process involves several stages (1NF, 2NF, 3NF, and BCNF), the goal remains the same: to create a database schema that is both efficient and maintainable.

Hi, I'm Abhay Singh Kathayat!
I am a full-stack developer with expertise in both front-end and back-end technologies. I work with a variety of programming languages and frameworks to build efficient, scalable, and user-friendly applications.
Feel free to reach out to me at my business email: kaashshorts28@gmail.com.
