Remove duplicates from an unsorted linked list

Introduction

The linked list is one of the most important concepts and data structures to learn while preparing for interviews. Having a good grasp of Linked Lists can be a huge plus point in a coding interview.

Problem Statement

In this problem, we are given an unsorted LinkedList and are asked to remove the duplicate elements present in the list.

Problem Statement Understanding

According to the problem statement, we will be given an unsorted linked list that may contain duplicate elements. We need to remove the duplicate elements from the list.

Let's try to understand the problem with the help of some examples.

If the given input unsorted linked list is:

  • As we can see in the above list, there are multiple occurrences of elements 10, 12 and 11 in the linked list.
  • So it means that the list contain duplicates of 10, 12 and 11, and we need to remove these duplicates from the linked list.
  • After removing the duplicate elements from the list, the output linked list will be:

If the linked list is: head->10->12->12->15->10->20->20.

  • In this case, as we can see in the above list, 10, 12 and 20 are duplicate elements present more than once in the list.
  • After removing the duplicate elements from the list, the output will be head->10->12->15->20.
Some more examples:

Sample Input 1: head->2->3->2->4->7.
Sample Output 1: head->2->3->4->7.

Sample Input 2: head->3->3->7->9->7->11->4->9->15.
Sample Input 2: head->3->7->9->11->4->15.

Now, I think from the above examples, the problem statement is clear. So let's see how we will approach it.

Before moving to the approach section, try to think about how you can approach this problem.

  • If stuck, no problem, we will thoroughly see how we can approach the problem in the next section.

Let’s move to the approach section.

Approach 1 (Using two loops)

Duplicate elements can be removed using two loops:

  • Outer loop for traversing the linked list and picking the element of list one by one and inner loop to check if in the rest of the list (starting from outer loop pointer's next to the end of the list) any duplicate of the element picked up by outer loop is present or not.
  • If the duplicate is present in the linked list, remove the duplicate element from the list.

The time complexity for this approach will be O(n2).

  • Although the above approach will work fine but if in case the length of our linked list is greater than or equal to 104, it will give us time limit exceed error (TLE).

So to solve the above problem of TLE, we will see in the next approach how using sorting can solve our problem without getting TLE.

Approach 2 (Using merge sort)

Usually, merge sort is used to sort the linked lists.

  • We will sort the linked list by using merge sort.
  • And now, all the duplicate nodes will be adjacent, and we can now easily remove these duplicate nodes from the list by comparing the adjacent nodes by iterating the list.

The time complexity for this approach will be O(nLogn), because sorting a list takes (nlogn) time.

Let's see if we can further reduce the time complexity in the next approach. Any Ideas?

  • If not, its okay; we will thoroughly see how to reduce time complexity in the next section.

Let's move to the next approach.

Approach 3 (Using set)

In this approach, we will make use to set to store the occurrences of the elememnts of the list.

  • The most efficient way to remove the duplicate element is to use a set to store the occurrence of the elements present in the Linked List.
  • Now, we traverse the Linked List and if the element in the current node is already present in the set:
    • We will remove the current node from the linked list.
    • Else we store the element in the set and move forward in the linked list.

Let's see the algorithm for this approach.

Algorithm

  • Take a set seen to store the elements of the Linked List. Also remember that the set stores unique elements.
  • Take a variable curr and initialize it with head of the linked list.
  • Take a variable prev and initialize it with NULL.
  • Traverse the linked list till curr do not become NULL.
  • While traversing the list:
    • If the element in the current node curr is already present in the set then remove the current node from the linked list by assigning next of curr to the next of previous node of curr (prev->next = curr->next) and delete the node curr.
    • Else, insert the element in the set and move prev forward by moving prev to the position of curr.
    • Move curr forward by moving curr to the next of prev.
  • Finally, after the traversal, our resultant list will be free from duplicates.

Dry Run



Code Implementation

#include
using namespace std;

struct Node
{
    int data;
    struct Node *next;
};

/* Using this function we will be creating a new node */
struct Node *newNode(int data)
{
    Node *temp = new Node;
    temp->data = data;
    temp->next = NULL;
    return temp;
}

/* Using this function we will be removing the duplicates elements from the given unsorted linked list */
void removeDuplicatesFromList(struct Node *head)
{
    unordered_set seen; 
    struct Node *curr = head;
    struct Node *prev = NULL;
    while (curr != NULL)
    {
        if (seen.find(curr->data) != seen.end())
        {
            prev->next = curr->next;
            delete (curr); 
        }
        else
        {
            seen.insert(curr->data); 
            prev = curr; 
        }
        curr = prev->next; 
    }
}

/* Using this function we will be printing the linked list */
void printList(struct Node *node)
{
    while (node != NULL)
    {
        printf("%d ", node->data);
        node = node->next;
    }
}

int main()
{
    struct Node *head = newNode(10);
    head->next = newNode(12);
    head->next->next = newNode(11);
    head->next->next->next = newNode(15);
    head->next->next->next->next = newNode(12);
    head->next->next->next->next->next = newNode(11);
    head->next->next->next->next->next->next = newNode(10);
    cout<<"Linked list before removing duplicates"<

						 

Output

Linked list before removing duplicates
10 12 11 15 12 11 10
Linked list after removing duplicates
10 12 11 15

Time Complexity: O(n), where n is the number of nodes in the given linked list.

In this blog, we have seen how to remove duplicate elements from an unsorted single LinkedList in linear time and extra space using a set in the most efficient manner. This is a basic problem and is good for strengthening your concepts in LinkedList and if you want to practice more such problems, you can checkout Prepbytes (Linked List).

Previous post Reverse A Sublist Of Linked List
Next post Remove duplicates from a sorted linked list

Leave a Reply

Your email address will not be published. Required fields are marked *