In this article, we will learn a famous problem called merge k sorted arrays. This is a very famous problem asked in most of the technical interviews.
How to Merge K Sorted Array
Given K sorted arrays of size N each, Our task is to merge all the given arrays such that the final output array contains all these elements in sorted order.
Examples:
Input:
K = 3, N = 4,
arr1: {1, 3, 5, 7},
arr2: {2, 4, 6, 8},
arr3: {0, 9, 10, 11}
Output:
0 1 2 3 4 5 6 7 8 9 10 11
Explanation: The output array is a sorted array and it contains all the elements of the given input matrix.
Input:
k = 4, n = 4,
arr1: {13, 15, 16, 17},
arr2: {2, 4, 6, 8},
arr3: {0, 9, 10, 11}
Output:
0 2 4 6 8 9 10 11 13 15 16 17
Explanation: The output array is a sorted array and it contains all the elements of the given input matrix.
Approach 1:
Naive Approach for Merging k sorted arrays:
Algorithm:
Create an output array of size (NK), copy all elements to the output array, then sort the output array.
Step 1: Creates an output array of size N K.
Step 2: Traverses the matrix from beginning to end and inserts all elements into the output array.
Step 3: Sort and print the output array.
Code of the above approach:-
#include <bits/stdc++.h> using namespace std; #define N 4 // Merge arr1[0..N1-1] and arr2[0..N2-1] into // arr3[0..N1+N2-1] void mergeArrays(int arr1[], int arr2[], int N1, int N2, int arr3[]) { int i = 0, j = 0, k = 0; // Traverse both array while (i < N1 && j < N2) { // Check if current element of first // array is smaller than current element // of second array. If yes, store first // array element and increment first array // index. Otherwise do same with second array if (arr1[i] < arr2[j]) arr3[k++] = arr1[i++]; else arr3[k++] = arr2[j++]; } // Store remaining elements of first array while (i < N1) arr3[k++] = arr1[i++]; // Store remaining elements of second array while (j < N2) arr3[k++] = arr2[j++]; } // A utility function to print array elements void printArray(int arr[], int size) { for (int i = 0; i < size; i++) cout << arr[i] << " "; } // This function takes an array of arrays as an argument and // All arrays are assumed to be sorted. It merges them // together and prints the final sorted output. void mergeKArrays(int arr[][N], int i, int j, int output[]) { // If one array is in range if (i == j) { for (int p = 0; p < N; p++) output[p] = arr[i][p]; return; } // if only two arrays are left them merge them if (j - i == 1) { mergeArrays(arr[i], arr[j], N, N, output); return; } // Output arrays int out1[N * (((i + j) / 2) - i + 1)], out2[N * (j - ((i + j) / 2))]; // Divide the array into halves mergeKArrays(arr, i, (i + j) / 2, out1); mergeKArrays(arr, (i + j) / 2 + 1, j, out2); // Merge the output array mergeArrays(out1, out2, N * (((i + j) / 2) - i + 1), N * (j - ((i + j) / 2)), output); } // Driver's code int main() { // Change N at the top to change number of elements // in an array int arr[][N] = { { 2, 6, 12, 34 }, { 1, 9, 20, 1000 }, { 23, 34, 90, 2000 } }; int K = sizeof(arr) / sizeof(arr[0]); int output[N * K]; mergeKArrays(arr, 0, 2, output); // Function call cout << "Merged array is " << endl; printArray(output, N * K); return 0; }
Input
arr1: { 2, 6, 12, 34 },
arr2: { 1, 9, 20, 1000 },
arr3: { 23, 34, 90, 2000 }
Output
Merged array is
1 2 6 9 12 20 23 34 34 90 1000 2000
Time Complexity: O(N K log (NK)), Since the final array is of size NK.
Space Complexity: O(N K), The output array is of size N K.
Approach 2: Merge K Sorted Arrays using Merging:
In this approach, we start by merging the arrays into two groups. After the first merge, we are left with K/2 arrays. Now, Merge the array back into the group. This leaves us with a K/4 array. This approach is similar to mergesort. Split K arrays containing the same number of arrays in half until there are two arrays in the group. After this process, merge the arrays from bottom to top.
Follow the given steps to solve the problem:
Step 1: Create a recursive function that takes K arrays and returns the output array.
Step 2: In the recursive function, if the value of K is 1 then return the array else if the value of K is 2 then merge the two arrays in linear time and return the array.
Step 3: If the value of K is greater than 2 then divide the group of k elements into two equal halves and recursively call the function, i.e 0 to K/2 array in one recursive function and K/2 to K array in another recursive function.
Step 4: Print the output array.
Code of the above approach:-
// C++ program to merge K sorted arrays of size n each. #include <bits/stdc++.h> using namespace std; #define N 4 // Merge arr1[0..N1-1] and arr2[0..N2-1] into // arr3[0..N1+N2-1] void mergeArrays(int arr1[], int arr2[], int N1, int N2, int arr3[]) { int i = 0, j = 0, k = 0; // Traverse both array while (i < N1 && j < N2) { // Check if current element of first // array is smaller than current element // of second array. If yes, store first // array element and increment first array // index. Otherwise do same with second array if (arr1[i] < arr2[j]) arr3[k++] = arr1[i++]; else arr3[k++] = arr2[j++]; } // Store remaining elements of first array while (i < N1) arr3[k++] = arr1[i++]; // Store remaining elements of second array while (j < N2) arr3[k++] = arr2[j++]; } // A utility function to print array elements void printArray(int arr[], int size) { for (int i = 0; i < size; i++) cout << arr[i] << " "; } // This function takes an array of arrays as an argument and // All arrays are assumed to be sorted. It merges them // together and prints the final sorted output. void mergeKArrays(int arr[][N], int i, int j, int output[]) { // If one array is in range if (i == j) { for (int p = 0; p < N; p++) output[p] = arr[i][p]; return; } // if only two arrays are left them merge them if (j - i == 1) { mergeArrays(arr[i], arr[j], N, N, output); return; } // Output arrays int out1[N * (((i + j) / 2) - i + 1)], out2[N * (j - ((i + j) / 2))]; // Divide the array into halves mergeKArrays(arr, i, (i + j) / 2, out1); mergeKArrays(arr, (i + j) / 2 + 1, j, out2); // Merge the output array mergeArrays(out1, out2, N * (((i + j) / 2) - i + 1), N * (j - ((i + j) / 2)), output); } // Driver's code int main() { // Change N at the top to change number of elements // in an array int arr[][N] = { { 2, 6, 12, 34 }, { 1, 9, 20, 1000 }, { 23, 34, 90, 2000 } }; int K = sizeof(arr) / sizeof(arr[0]); int output[N * K]; mergeKArrays(arr, 0, 2, output); // Function call cout << "Merged array is " << endl; printArray(output, N * K); return 0; }
Input:
arr1: { 2, 6, 12, 34 },
arr2: { 1, 9, 20, 1000 },
arr3: { 23, 34, 90, 2000 }
Output
Merged array is
1 2 6 9 12 20 23 34 34 90 1000 2000
Time Complexity: O(N K log K). There are log K levels as in each level the K arrays are divided in half and at each level, the K arrays are traversed.
Space Complexity: O(N K log K). In each level O(N * K).
Approach 3: Merge K sorted arrays using Min-Heap:
Brief about Min-Heap
Min-heap is a min priority queue. It is a complete binary tree having a root value smaller than both children’s values.
2
/ \
4 5
/ \
11 6
Min-Heap
We will use a min-heap to get the current minimum value.
The idea is to use a minimum heap. The time complexity of this MinHeap-based solution is the same as O(NK log K). The first step should start by creating a MinHeap and inserting the first element of every k array into it. Now the root element of the min-cluster is the smallest of all elements. Then it removes the minheap’s root element, inserts it into the output array, and inserts the next element from the array of removed elements. To get the result, we need to continue stepping until there are no more elements in MinHeap.
Follow the given steps to solve the problem:
Step 1: Create a min Heap and insert the first element of all the K arrays.
Step 2: Run a loop until the size of MinHeap is greater than zero. Now, Remove the top element of the MinHeap and print the element. Then, insert the next element from the same array in which the removed element belonged. If the array doesn’t have any more elements, then replace root with infinite. After replacing the root, heapify the tree.
Step 3: Return the output array
Code of the above approach:-
#include <bits/stdc++.h> using namespace std; #define N 4 // A min-heap node struct MinHeapNode { // The element to be stored int element; // index of the array from which the element is taken int i; // index of the next element to be picked from the array int j; }; // Prototype of a utility function to swap two min-heap // nodes void swap(MinHeapNode* x, MinHeapNode* y); // A class for Min Heap class MinHeap { // pointer to array of elements in heap MinHeapNode* harr; // size of min heap int heap_size; public: // Constructor: creates a min heap of given size MinHeap(MinHeapNode a[], int size); // to heapify a subtree with root at given index void MinHeapify(int); // to get index of left child of node at index i int left(int i) { return (2 * i + 1); } // to get index of right child of node at index i int right(int i) { return (2 * i + 2); } // to get the root MinHeapNode getMin() { return harr[0]; } // to replace root with new node x and heapify() new // root void replaceMin(MinHeapNode x) { harr[0] = x; MinHeapify(0); } }; // This function takes an array of arrays as an argument and // All arrays are assumed to be sorted. It merges them // together and prints the final sorted output. int* mergeKArrays(int arr[][N], int K) { // To store output array int* output = new int[N * K]; // Create a min heap with k heap nodes. // Every heap node has first element of an array MinHeapNode* harr = new MinHeapNode[K]; for (int i = 0; i < K; i++) { // Store the first element harr[i].element = arr[i][0]; // index of array harr[i].i = i; // Index of next element to be stored from the array harr[i].j = 1; } // Create the heap MinHeap hp(harr, K); // Now one by one get the minimum element from min // heap and replace it with next element of its array for (int count = 0; count < N * K; count++) { // Get the minimum element and store it in output MinHeapNode root = hp.getMin(); output[count] = root.element; // Find the next element that will replace current // root of heap. The next element belongs to same // array as the current root. if (root.j < N) { root.element = arr[root.i][root.j]; root.j += 1; } // If root was the last element of its array // INT_MAX is for infinite else root.element = INT_MAX; // Replace root with next element of array hp.replaceMin(root); } return output; } // FOLLOWING ARE IMPLEMENTATIONS OF // STANDARD MIN HEAP METHODS FROM CORMEN BOOK // Constructor: Builds a heap from a given // array a[] of given size MinHeap::MinHeap(MinHeapNode a[], int size) { heap_size = size; harr = a; // store address of array int i = (heap_size - 1) / 2; while (i >= 0) { MinHeapify(i); i--; } } // A recursive method to heapify a // subtree with root at given index. // This method assumes that the subtrees // are already heapified void MinHeap::MinHeapify(int i) { int l = left(i); int r = right(i); int smallest = i; if (l < heap_size && harr[l].element < harr[i].element) smallest = l; if (r < heap_size && harr[r].element < harr[smallest].element) smallest = r; if (smallest != i) { swap(&harr[i], &harr[smallest]); MinHeapify(smallest); } } // A utility function to swap two elements void swap(MinHeapNode* x, MinHeapNode* y) { MinHeapNode temp = *x; *x = *y; *y = temp; } // A utility function to print array elements void printArray(int arr[], int size) { for (int i = 0; i < size; i++) cout << arr[i] << " "; } // Driver's code int main() { // Change N at the top to change number of elements // in an array int arr[][N] = { { 2, 6, 12, 34 }, { 1, 9, 20, 1000 }, { 23, 34, 90, 2000 } }; int K = sizeof(arr) / sizeof(arr[0]); // Function call int* output = mergeKArrays(arr, K); cout << "Merged array is " << endl; printArray(output, N * K); return 0; }
Input:
arr1: { 2, 6, 12, 34 },
arr2: { 1, 9, 20, 1000 },
arr3: { 23, 34, 90, 2000 }
Output:
Merged array is
1 2 6 9 12 20 23 34 34 90 1000 2000
Time Complexity: O(N K log K), Insertion and deletion in a Min Heap requires log k time.
Space Complexity: O(K), If the output is not stored, then the only space required is the Min-Heap of K elements.
Conclusion:
Since in this article, we have seen three possible approaches and their time complexities. The first approach was very naive because here we copied all the elements in the output array and sort the array. The time complexity of this algorithm is O(N K log (NK)). In the second approach, we improved the time complexity by using the same approach as merge sort and the time complexity of this algorithm O(N K log K). In the last approach, we used min-heap to solve this problem and this was the most efficient approach because the time complexity of this algorithm is O(N K * log K) which is the same as the time complexity of the second approach. But the space complexity of the third approach is O(K) only.