Fourth Normal Form (4NF)

Fourth Normal Form comes into the picture when a multi-valued dependency occurs in a relation. In this tutorial we will learn what a multi-valued dependency is, how to remove it, and how to make a table satisfy the Fourth Normal Form.

In our last tutorial, we learned about the Boyce-Codd Normal Form. We suggest going through that tutorial before this one.


Rules for 4th Normal Form

For a table to satisfy the Fourth Normal Form, it should satisfy the following two conditions:

  1. It should be in the Boyce-Codd Normal Form.
  2. And, the table should not have any Multi-valued Dependency.

Let’s try to understand what multi-valued dependency is in the next section.


What is Multi-valued Dependency?

A table is said to have a multi-valued dependency if all of the following conditions are true:

  1. For a dependency A →→ B (the usual notation for a multi-valued dependency), a single value of A is associated with multiple values of B.
  2. The table has at least 3 columns.
  3. For a relation R(A,B,C) with a multi-valued dependency between A and B, the columns B and C are independent of each other.

If all these conditions hold for a relation (table), it is said to have a multi-valued dependency.
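
The third condition can be checked mechanically. The following is a minimal Python sketch, not taken from any library: the function name has_mvd and the representation of a table as a list of dicts are assumptions made for illustration. It groups the rows by A and checks that every B value seen for a given A appears together with every C value seen for that same A.

```python
from itertools import product


def has_mvd(rows, a, b, c):
    """Return True if the multi-valued dependency a ->> b holds in rows.

    rows is a list of dicts; a, b and c are the three column names
    (hypothetical helper, written only for this illustration).
    """
    groups = {}
    for row in rows:
        groups.setdefault(row[a], set()).add((row[b], row[c]))
    for pairs in groups.values():
        b_values = {bv for bv, _ in pairs}
        c_values = {cv for _, cv in pairs}
        # b and c are independent only if every combination is present.
        if pairs != set(product(b_values, c_values)):
            return False
    return True


# Example data chosen for illustration.
complete = [
    {"s_id": 1, "course": "Science", "hobby": "Cricket"},
    {"s_id": 1, "course": "Science", "hobby": "Hockey"},
    {"s_id": 1, "course": "Maths", "hobby": "Cricket"},
    {"s_id": 1, "course": "Maths", "hobby": "Hockey"},
]
partial = [complete[0], complete[3]]

complete_ok = has_mvd(complete, "s_id", "course", "hobby")
partial_ok = has_mvd(partial, "s_id", "course", "hobby")
```

With this data, complete_ok is True because all four course and hobby combinations are stored, while partial_ok is False because two of the combinations are missing.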


Time for an Example

Below we have a college enrolment table with columns s_id, course and hobby.

s_id | course  | hobby
1    | Science | Cricket
1    | Maths   | Hockey
2    | C#      | Cricket
2    | Php     | Hockey

As you can see in the table above, the student with s_id 1 has opted for two courses, Science and Maths, and has two hobbies, Cricket and Hockey.

You may be wondering what problem this can lead to.

Because a student's courses and hobbies have nothing to do with each other, the two records for the student with s_id 1 give rise to two more records, as shown below. The student has two hobbies, so each hobby has to be recorded along with each of the two courses.

s_id | course  | hobby
1    | Science | Cricket
1    | Maths   | Hockey
1    | Science | Hockey
1    | Maths   | Cricket

In the table above, there is no relationship between the columns course and hobby. They are independent of each other.

So there is a multi-valued dependency, which leads to unnecessary repetition of data and to other anomalies as well.
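
The growth in rows can be illustrated with a short Python sketch. The dictionaries and the helper name combined_rows are assumptions made for this example. A student with m courses and n hobbies needs m × n rows in the combined table.

```python
from itertools import product

# Courses and hobbies of the student with s_id 1, kept separately.
courses = {1: ["Science", "Maths"]}
hobbies = {1: ["Cricket", "Hockey"]}


def combined_rows(s_id):
    """Build every (s_id, course, hobby) row the single table must hold."""
    return [(s_id, c, h) for c, h in product(courses[s_id], hobbies[s_id])]


rows = combined_rows(1)
row_count = len(rows)  # 2 courses x 2 hobbies
```

Adding a third hobby for the same student would raise the count from 4 rows to 6, even though only one new fact was recorded.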


How to satisfy 4th Normal Form?

To make the above relation satisfy the Fourth Normal Form, we can decompose the table into 2 tables.

CourseOpted table:

s_id | course
1    | Science
1    | Maths
2    | C#
2    | Php

And the Hobbies table:

s_id | hobby
1    | Cricket
1    | Hockey
2    | Cricket
2    | Hockey

Now both tables satisfy the Fourth Normal Form.
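
To see that nothing is lost by this decomposition, here is a small Python sketch (the variable names are illustrative assumptions). It starts from the four rows recorded for the student with s_id 1, projects them onto the two smaller tables, and joins them back on s_id.

```python
# The four combined rows for the student with s_id 1.
enrolment = {
    (1, "Science", "Cricket"),
    (1, "Maths", "Hockey"),
    (1, "Science", "Hockey"),
    (1, "Maths", "Cricket"),
}

# Project onto (s_id, course) and (s_id, hobby).
course_opted = {(s_id, course) for s_id, course, _ in enrolment}
hobbies = {(s_id, hobby) for s_id, _, hobby in enrolment}

# Natural join of the two projections on s_id.
rejoined = {
    (s_id, course, hobby)
    for s_id, course in course_opted
    for other_id, hobby in hobbies
    if s_id == other_id
}

lossless = rejoined == enrolment
```

Each of the two smaller tables holds 2 rows for this student instead of 4, and joining them reproduces exactly the original rows.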

A table can also have a functional dependency along with a multi-valued dependency. In that case, the functionally dependent columns are moved into one table and the multi-valued dependent columns are moved into separate tables of their own.

If you design your database carefully, you can easily avoid these issues.

Normalization of Database

Database Normalization is a technique for organizing the data in a database. It is a systematic approach of decomposing tables to eliminate data redundancy (repetition) and undesirable characteristics like insertion, update and deletion anomalies. It is a multi-step process that puts data into tabular form and removes duplicated data from the relational tables.

Normalization is used mainly for two purposes:

  • Eliminating redundant (duplicated) data.
  • Ensuring data dependencies make sense, i.e. data is logically stored.


Problems Without Normalization

If a table is not properly normalized and has data redundancy, it will not only take up extra storage space but also make the database difficult to handle and update without risking data loss. Insertion, update and deletion anomalies are very frequent if a database is not normalized. To understand these anomalies, let us take the example of a Student table.

rollno | name | branch | hod   | office_tel
401    | Akon | CSE    | Mr. X | 53337
402    | Bkon | CSE    | Mr. X | 53337
403    | Ckon | CSE    | Mr. X | 53337
404    | Dkon | CSE    | Mr. X | 53337

In the table above, we have data for 4 Computer Science students. As we can see, the data for the fields branch, hod (Head of Department) and office_tel is repeated for students who are in the same branch of the college. This is data redundancy.


Insertion Anomaly

Suppose a new student is admitted. Until the student opts for a branch, the student's data cannot be inserted, or else we will have to set the branch information to NULL.

Also, if we have to insert the data of 100 students of the same branch, the branch information will be repeated for all those 100 students.

These scenarios are nothing but Insertion anomalies.


Update Anomaly

What if Mr. X leaves the college, or is no longer the HOD of the Computer Science department? In that case all the student records will have to be updated, and if we miss any record by mistake, it will lead to data inconsistency. This is an update anomaly.
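
The inconsistency can be reproduced with a few lines of Python. This is only a sketch: the helper update_hod and the replacement name "Mr. Y" are hypothetical, and the update deliberately skips one record to simulate the mistake described above.

```python
# The unnormalized Student table, one dict per row.
students = [
    {"rollno": 401, "name": "Akon", "branch": "CSE", "hod": "Mr. X", "office_tel": "53337"},
    {"rollno": 402, "name": "Bkon", "branch": "CSE", "hod": "Mr. X", "office_tel": "53337"},
    {"rollno": 403, "name": "Ckon", "branch": "CSE", "hod": "Mr. X", "office_tel": "53337"},
    {"rollno": 404, "name": "Dkon", "branch": "CSE", "hod": "Mr. X", "office_tel": "53337"},
]


def update_hod(rows, rollnos, new_hod):
    """Set a new HOD, but only on the rows whose rollno is listed."""
    for row in rows:
        if row["rollno"] in rollnos:
            row["hod"] = new_hod


# Simulated mistake: the record with rollno 404 is missed.
update_hod(students, {401, 402, 403}, "Mr. Y")

# The same branch now reports two different heads of department.
hods_for_cse = {row["hod"] for row in students if row["branch"] == "CSE"}
```

After the partial update, hods_for_cse contains both "Mr. X" and "Mr. Y", so the table no longer gives a single answer to the question of who heads the CSE branch.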


Deletion Anomaly

In our Student table, two different kinds of information are kept together: student information and branch information. Hence, at the end of the academic year, if the student records are deleted, we will also lose the branch information. This is a deletion anomaly.
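
One way to avoid these anomalies is to store the branch information once, in its own table. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The split into student and branch tables, the choice of keys, and the replacement name "Mr. Y" are assumptions for illustration, not a schema given in this tutorial.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Branch details are stored once per branch (assumed design).
    CREATE TABLE branch (
        branch     TEXT PRIMARY KEY,
        hod        TEXT,
        office_tel TEXT
    );
    -- Each student row only refers to its branch.
    CREATE TABLE student (
        rollno INTEGER PRIMARY KEY,
        name   TEXT,
        branch TEXT REFERENCES branch(branch)
    );
    INSERT INTO branch VALUES ('CSE', 'Mr. X', '53337');
    INSERT INTO student VALUES (401, 'Akon', 'CSE');
    INSERT INTO student VALUES (402, 'Bkon', 'CSE');
    INSERT INTO student VALUES (403, 'Ckon', 'CSE');
    INSERT INTO student VALUES (404, 'Dkon', 'CSE');
""")

# Changing the HOD now touches exactly one row.
conn.execute("UPDATE branch SET hod = ? WHERE branch = ?", ("Mr. Y", "CSE"))

# Deleting all student records leaves the branch details intact.
conn.execute("DELETE FROM student")

student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
remaining_branches = conn.execute(
    "SELECT branch, hod, office_tel FROM branch"
).fetchall()
```

With this layout, a new student can also be inserted with a NULL branch without leaving any HOD or telephone fields empty, because those fields no longer live in the student table.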


Normalization Rule

Normalization rules are divided into the following normal forms:

  1. First Normal Form
  2. Second Normal Form
  3. Third Normal Form
  4. BCNF
  5. Fourth Normal Form

First Normal Form (1NF)

For a table to be in the First Normal Form, it should satisfy the following 4 rules:

  1. It should only have single (atomic) valued attributes/columns.
  2. Values stored in a column should be of the same domain.
  3. All the columns in a table should have unique names.
  4. And the order in which data is stored does not matter.

In the next tutorial, we will discuss the First Normal Form in detail.


Second Normal Form (2NF)

For a table to be in the Second Normal Form,

  1. It should be in the First Normal form.
  2. And, it should not have Partial Dependency.

To understand what Partial Dependency is and how to normalize a table to the Second Normal Form, jump to the Second Normal Form tutorial.


Third Normal Form (3NF)

A table is said to be in the Third Normal Form when,

  1. It is in the Second Normal form.
  2. And, it doesn’t have Transitive Dependency.

Here is the Third Normal Form tutorial. But we suggest that you first study the Second Normal Form and then head over to the Third Normal Form.


Boyce-Codd Normal Form (BCNF)

Boyce-Codd Normal Form is a stricter version of the Third Normal Form. It deals with a certain type of anomaly that is not handled by 3NF. A 3NF table which does not have multiple overlapping candidate keys is said to be in BCNF. For a table to be in BCNF, the following conditions must be satisfied:

  • R must be in the Third Normal Form
  • and, for each non-trivial functional dependency ( X → Y ), X should be a super key.
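
The super key condition can be tested by computing the closure of X, that is, every attribute that X determines. Below is a minimal Python sketch of that check. The function names closure and bcnf_violations are hypothetical, and the functional dependencies used in the example (rollno → name, branch and branch → hod, office_tel) are assumed from the Student table shown earlier rather than stated by the tutorial.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is already covered.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the non-trivial FDs whose left side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(all_attrs)
    ]


# Assumed dependencies for the Student table.
student_attrs = {"rollno", "name", "branch", "hod", "office_tel"}
student_fds = [
    ({"rollno"}, {"name", "branch"}),
    ({"branch"}, {"hod", "office_tel"}),
]

violations = bcnf_violations(student_attrs, student_fds)
```

Here rollno determines every column, so it is a super key, while branch only determines hod and office_tel. The dependency branch → hod, office_tel is therefore reported as the single violation.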

To learn about BCNF in detail with an easy-to-understand example, head to the Boyce-Codd Normal Form tutorial.


Fourth Normal Form (4NF)

A table is said to be in the Fourth Normal Form when,

  1. It is in the Boyce-Codd Normal Form.
  2. And, it doesn’t have Multi-Valued Dependency.

Here is the Fourth Normal Form tutorial. But we suggest that you understand the other normal forms before you head over to the Fourth Normal Form.