Hashing in Data Structure

In this article, we will learn what is hashing in data structure, index mapping in hashing, what is a collision in a hash table in data structure, separate chaining collision handling technique in hashing, and open addressing collision handling technique in hashing.

What is Hashing in Data Structure?

It is a method for storing and retrieving data from a database in O(1) time. Sometimes it is called mapping. With the help of hashing in data structure, we convert larger values into smaller values using the concept of hashing.
With the help of the search key, we point the data into the database and for this, we use a pointer in hashing.

There are three main terms in hashing.

1. Search Key
It is a unique number that we will put into a hash table to retrieve or store data from a database.
For example, We find out a particular student in a classroom with the help of the student roll number.

2. Hash Table in data structure
It is a data structure that provides a methodology to store data in a proper way. It looks like an array but in actuality, it is not an array.

3. Hash Function
Using the hash function, we insert, delete and search data in a hash table in O(1) time.

There are mainly one hash function methods.

  • K mod N ( Where N is the number of keys)
    In this function, we divide the search key by N and get a reminder. The reminder is a hash value. A hash value is the index number of the hash table and we will put the search key on that index number in a hash table.

Index Mapping in Hashing

We will understand the above terminology with the help of the example and we will also learn index mapping in a hash table in the data structure.

Here, we use the hash function ( k mod 10) to put the search key in a hash table. That is why in a hash table we have a 0-9 index number.

Example:

Search key ( 45, 56. 17, 23, 32, 81)
Hash function ( k mod 10)

For key: 45
Hash Value: 45 mod 10 = 5
So we will map 45 at index number 5 of a hash table.

For key: 56
Hash Value: 56 mod 10 = 6
So we will map 56 at index number 6 of a hash table.

Likewise, We will calculate the hash value for left keys and put the key in a hash table according to its hash value.

After inserting a key in a hash table in data structure, hash table looks like:

Search Key Index
0
81 1
32 2
23 3
4
45 5
56 6
17 7
8
9

Code Implementation

// What is hashing in data structure
// hasing program in c

#include 
#include 
#include 
#include 

#define SIZE 20

struct DataItem {
   int data;   
   int key;
};

struct DataItem* hashArray[SIZE]; 
struct DataItem* dummyItem;
struct DataItem* item;

int hashCode(int key) {
   return key % SIZE;
}

struct DataItem *find(int key) {
    
   int hashIndex = hashCode(key);  
    
    
   while(hashArray[hashIndex] != NULL) {
    
      if(hashArray[hashIndex]->key == key)
         return hashArray[hashIndex]; 
            
         ++hashIndex;
        
      
      hashIndex %= SIZE;
   }        
    
   return NULL;        
}

void insert(int key,int data) {

   struct DataItem *item = (struct DataItem*) malloc(sizeof(struct DataItem));
   item->data = data;  
   item->key = key;

  
   int hashIndex = hashCode(key);

   
   while(hashArray[hashIndex] != NULL && hashArray[hashIndex]->key != -1) {
      
      ++hashIndex;
        
      
      hashIndex %= SIZE;
   }
    
   hashArray[hashIndex] = item;
}

struct DataItem* delete(struct DataItem* item) {
   int key = item->key;

   
   int hashIndex = hashCode(key);

   
   while(hashArray[hashIndex] != NULL) {
    
      if(hashArray[hashIndex]->key == key) {
         struct DataItem* temp = hashArray[hashIndex]; 
            
         
         hashArray[hashIndex] = dummyItem; 
         return temp;
      }
        
      
      ++hashIndex;
        
      
      hashIndex %= SIZE;
   }      
    
   return NULL;        
}

void display() {
   int i = 0;
    
   for(i = 0; ikey,hashArray[i]->data);
      else
         printf(" ----- ");
   }
    
   printf("\n");
}

int main() {
   dummyItem = (struct DataItem*) malloc(sizeof(struct DataItem));
   dummyItem->data = -1;  
   dummyItem->key = -1; 

   insert(2, 90);
   insert(3, 750);
   insert(4, 20);
   insert(6, 27);
   insert(32, 47);
   insert(14, 39);
   insert(77, 14);
   insert(33, 79);
   insert(37, 91);

   display();
   item = find(14);

   if(item != NULL) {
      printf("Element found: %d\n", item->data);
   } else {
      printf("Element not found\n");
   }

   delete(item);
   item = find(14);

   if(item != NULL) {
      printf("Element found: %d\n", item->data);
   } else {
      printf("Element not found\n");
   }
}

Output

-----  -----  (2,90) (3,750) (4,20) -----  (6,27) -----  -----  -----  -----  -----  (32,47) (33,79) (14,39) -----  -----  (77,14) (37,91) ----- 

Element found: 39
Element not found

What is Collision?

In a hash table, there is a chance that two keys result in the same value. A collision occurs when a newly inserted key maps to an already occupied slot in the hash table and must be solved using some collision handling technique.

There are mainly two collision handling techniques.

  1. Separate Chaining
  2. Open Addressing

Separate Chaining Collision Handling Technique in Hashing

In this technique, we will build an array as a linked list called a chain. Because of that, When multiple elements are mapped into the same slot index, they are inserted into a chain, which is a singly-linked list.
All elements that hash into the same slot index are inserted into a linked list in this case. We now use a key K to search the linked list by simply traversing it linearly. If the intrinsic key for any entry is equal to K, we have found our entry. If we have reached the end of the linked list and still have not found our entry, it’s because it doesn’t exist. As a result, in separate chaining, if two different elements have the same hash value, we store both of them in the same linked list one after the other.

Now, we will understand the separate chaining technique using examples.

Example:

Search key: 24, 19, 32, 44, 56
Hash function:( K mod 6)

Step 1:
We will calculate the hash value of 24.
Hash value of 24 = 24 mod 6 = 0.
So, we will insert 24 at index number 0 in a hash table.

Step 2:
We will calculate the hash value of 19 and 32.
Hash value of 19 = 19 mod 6 = 1.
Hash value of 32 = 32 mod 6 = 2.
So, we will insert 19 and 32 at index numbers 1 and 2 respectively in a hash table.

Step 3:
We will calculate the hash value of 44.
Hash value of 44 = 44 mod 6 = 2.
Now, we insert the 44 at index number 2 which is preoccupied with 32. Here, Collision occurs which is why we have to add a chain at 32.

Step 4:
We will calculate the hash value of 56.
Hash value of 56 = 56 mod 6 = 2.
Now, we insert the 56 at index number 2 which is preoccupied with 32. Here, again Collision occurs which is why we have to add a chain at 44.

Open Addressing Collision Handling Technique in Hashing

All elements in Open Addressing are stored in the hash table itself. As a result, at any point, the table’s size must be greater than or equal to the total number of keys.

There are mainly three ways of open addressing.

1. Linear Probing

In this technique, When we insert the key at index number, at that time if slot is occupied then we will find next empty slot in hash table to put that key.

Example:

Search Keys: 43, 135, 72, 23, 99, 19, 82
Hash Function: K mod 10

Step 1: First, create an empty hash table with a possible range of hash values ranging from 0 to 9 based on the hash function provided.

Step 2: We will find hash value one by one and according to its hash value, we will insert the key in a hash table. In this step, we wil calculate hash value of first three key.

Hash value (43) = 43 mod 10 = 3
Hash value (135) = 135 mod 10 = 5
Hash value (72) = 72 mod 10 = 2

Now, we will insert 43, 135 and 72 at index numbers 3,5 and 2 respectively.

Step 3: Calculate hash value for key 23.
Hash value (23) = 23 mod 10 = 3

Now, we will insert 23 at index number 3 but index number 3 is preoccupied with 43. So, we will find next empty slot for mapping.
We find out that next empty slot is index number 4. So we will insert 43 at index number 4.

Step 4: Calculate hash value for key 99.
Hash value (99) = 99 mod 10 = 9

Now, we will insert 99 at index number 9 in a hash table.

Step 5: Calculate hash value for key 19.
Hash value (19) = 19 mod 10 = 9

Now, we will insert 19 at index number 9 but index number 9 is preoccupied with 99. So, we will find next empty slot for mapping.
We find out that next empty slot is index number 0. So we will insert 19 at index number 0.

Step 6: Calculate hash value for key 82.
Hash value (82) = 82 mod 10 = 2

Now, we will insert 82 at index number 2 but index number 2 is preoccupied with 72. So, we will find next empty slot for mapping.
We find out that next empty slot is index number 6. So we will insert 82 at index number 6.

Quadratic Probing

In this technique, When we insert the key at index number, at that time if slot is occupied then we will find empty slot in hash table to put that key using hash function
(h(k) + i^2) mod N ; where h(k) is hash value and
(i) is number of time attempting to find slot..

Example:

Search Keys: 42, 16, 91, 33, 18, 27, 36, 62
Hash Function: K mod 10

Step 1: First, create an empty hash table with a possible range of hash values ranging from 0 to 9 based on the hash function provided.

Step 2: We will find hash value one by one and according to its hash value, we will insert the key in a hash table. In this step, we wil calculate hash value of first six key.

Hash value (42) = 42 mod 10 = 2
Hash value (16) = 16 mod 10 = 6
Hash value (91) = 91 mod 10 = 1
Hash value (33) = 33 mod 10 = 3
Hash value (18) = 18 mod 10 = 8
Hash value (27) = 27 mod 10 = 7

Now, we will insert 42, 16 and 91 at index numbers 2,6 and 1 respectively and also insert 33, 18 and 27 at index numbers 3, 8 and 7.

Step 3: Calculate hash value for key 36.
Hash value (36) = 36 mod 10 = 6

Now, we will insert 36 at index number 6 but index number 6 is preoccupied with 16. So, we will find empty slot for mapping using hash function (h(k) + i^2) mod 10.

(h(k) + i^2) mod 10
(6 + (1)^2) mod 10
(7) mod 10
7
Here, index number 7 is also preoccupied with 27. So we will find next empty slot.

(h(k) + i^2) mod 10
(6 + (2)^2) mod 10
(10) mod 10
0

We find out that empty slot is index number 0. So we will insert 36 at index number 0.

Step 4: Calculate hash value for key 62.
Hash value (62) = 62 mod 10 = 2

Now, we will insert 62 at index number 2 but index number 2 is preoccupied with 42. So, we will find empty slot for mapping using hash function (h(k) + i^2) mod 10.

(h(k) + i^2) mod 10
(2 + (1)^2) mod 10
(3) mod 10
3
Here, index number 3 is also preoccupied with 33. So we will find next empty slot.

(h(k) + i^2) mod 10
(2 + (2)^2) mod 1
(6) mod 10
6
Here, index number 6 is also preoccupied with 16.

(h(k) + i^2) mod 10
(2 + (3)^2) mod 10
(11) mod 10
1
Here, index number 1 is also preoccupied with 91.

(h(k) + i^2) mod 10
(2 + (4)^2) mod 10
(18) mod 10
8
Here, index number 8 is also preoccupied with 18.

We tried to find empty slot but it did not happen because function is in iterative mode. We will never find empty slot for this value using function.

This is the major disadvantage of quadratic probing technique.

Double Hashing

In this technique, When we insert the key at index number, at that time if slot is occupied then we will find empty slot in hash table to put that key using second hash function
(h1(k) + (i)h2(k)) mod N

Another hash function is used to compute the intervals between probes. Double hashing is a technique for optimising clustering reduction.

Example:

Search key: 20, 34, 45
h1(k): k mod 11
h2(k): 8 – (k mod 8)

Step 1: First, create an empty hash table with a possible range of hash values ranging from 0 to 11 based on the hash function provided.

Step 2: We will find hash value one by one and according to its hash value, we will insert the key in a hash table. In this step, we wil calculate hash value of first two key.

Hash value (20) = 20 mod 11 = 9
Hash value (34) = 34 mod 11 = 1

Now, we will insert 20 and 34 at index numbers 9 and 1 respectively.

Step 3: Calculate hash value for key 45.
Hash value (45) = 45 mod 11 = 1

Now, we will insert 45 at index number 1 but index number 1 is preoccupied with 34. So, we will find next empty slot for mapping using function.

(h1(k) + (i)h2(k)) mod N
(1 + (1) (8 – (45 mod 8))) mod 11
(1 + (1) (8 – 5)) mod 11
(1 + (1)(3)) mod 11
(4) mod 11
4

We find out that empty slot is index number 4. So we will insert 45 at index number 4.

Leave a Reply

Your email address will not be published. Required fields are marked *