Merge sort is one of the most powerful sorting algorithms. Merge sort is widely used in various applications as well. The best part about these algorithms is that they are able to sort a given data in O(nLogn) complexity as against O(n2) complexity (we will soon see how) of bubble sort and selection sort. Moreover, merge sort is of interest because it creates an excellent case study for one of the widely used techniques in Computer Science - divide and conquer.
Merge Sort Algorithm - Explanation
Given an array of length, say n, we perform the following steps to sort the array:
Divide the array into 2 parts of lengths n/2 and n - n/2 respectively (here if n is odd, we round off the value of n/2). Let us call these arrays as left half and right half respectively.
Recursively sort the left half array and the right half array.
Merge the left half array and right half-array to get the full array sorted.
Let us take an example:
Given Array: [6, 4, 5, 1, 2, 7, 3]
First, as per step 1 above, we divide the array into 2 parts. As we can see, the following are the subarrays left half and right half:
Left half: [6, 4, 5, 1]
Right half: [2, 7, 3]
Then, as per step 2 above, we recursively sort the left and right halves. Here is how the sorted subarrays will look like:
Recursively sorted left half: [1, 4, 5, 6]
Recursively sorted right half: [2, 3, 7]
Finally, as per step 3, we will merge these 2 halves to create the final sorted array. Final merged and sorted array: [1, 2, 3, 4, 5, 6, 7]
The left and the right halves can always be sorted recursively using the same algorithm. The magic happens in creating the final merged and sorted array. So, let us understand it well using the above example.
In the above example, we are given 2 arrays [1, 4, 5, 6] and [2, 3, 7]. We are supposed to merge these 2 arrays into a single sorted array. Let us place a pointer at the head of each array. We will depict the pointer by underlining the corresponding element where the pointer points to.
Final merged array = []
Left array: [1, 4, 5, 6]
Right array: [2, 3, 7]
As can be seen, the pointer of the left array is at 1 and the pointer of the right array is at 2. We pick the smaller one and put it in the final merged array and move the corresponding pointer. After doing this, we will have the following state:
Final merged array = [1]
Left array: [4, 5, 6]
Right array: [2, 3, 7]
Here the pointers are now at 4 and 2 respectively. We again do what we did above - pick the smaller one and put it in the final merged array and move the corresponding pointer. We will get the following:
Final merged array = [1, 2]
Left array: [4, 5, 6]
Right array: [3, 7]
We repeat this again to get:
Final merged array = [1, 2, 3]
Left array: [4, 5, 6]
Right array: [7]
Continuing this exercise, we can see that we are successfully able to get the final merged array in the sorted form:
Final merged array = [1, 2, 3, 4, 5, 6, 7]
Left array: []
Right array: []
So, as can be seen, we started with an unsorted array and we were successfully able to get a sorted array. Another question that is to be answered - how were the left and the right arrays sorted? Well, we sorted them recursively using the same technique as above. For instance, consider the right array: [2, 7, 3]. To sort it, we will break it again into 2 sub-arrays: [2, 7] and [3]. Both these sub-arrays are already sorted so we can simply merge them by using the technique explained above to get the sorted array [2, 3, 7].
Using the below image you can understand
Merge Sort Pseudo Code
Before we get into the actual code, let us take a look at the pseudo code.
function merge_sort(i, j, a, aux) {
mid = (i + j) / 2
merge_sort(i, mid, a, aux)
merge_sort(mid + 1, j, a, aux)
pointer_left = i, pointer_right = mid + 1
for k in [i ... j] {
if pointer_left points to smaller element, aux[k] = a[pointer_left] and increment pointer_left by 1
if pointer_right points to smaller element, aux[k] = a[pointer_right] and increment pointer_right by 1
}
copy the contents of aux[i .. j] to a[i .. j]
}
Merge Sort Program in C
Let us understand the code step-by-step:
void merge_sort(int i, int j, int a[], int aux[])
This prototype means that the merge_sort function sorts the sub-array a[i .. j] using auxiliary array aux[].
if (j <= i) {
return;
}
if j <= i, clearly, the subarray a[i .. j] contains either 1 element (which is sorted) or no elements (which is also sorted). So, we do nothing in this case and simply return.
int mid = (i + j) / 2;
We plan to partition the array into 2 sub-arrays of almost equal lengths. These subarrays are a[i .. mid] and a[mid + 1 .. j]. Clearly, mid = (i + j) / 2 is the best here since mid is the average of i and j.
merge_sort(i, mid, a, aux);
merge_sort(mid + 1, j, a, aux);
Here, we are recursively sorting a[i .. mid] and a[mid + 1 .. j] sub-arrays by calling the same merge_sort function.
Once we have these 2 sorted subarrays in place, the rest of the code simply merges the 2.
int pointer_left = i;
int pointer_right = mid + 1;
int k;
Here, we place pointer_left at the beginning of the left sub-array a[i .. mid] and the pointer_right at the beginning of the right subarray a[mid + 1 .. j].
for (k = i; k <= j; k++) {
if (pointer_left == mid + 1) {
aux[k] = a[pointer_right];
pointer_right++;
} else if (pointer_right == j + 1) {
aux[k] = a[pointer_left];
pointer_left++;
} else if (a[pointer_left] < a[pointer_right]) {
aux[k] = a[pointer_left];
pointer_left++;
} else {
aux[k] = a[pointer_right];
pointer_right++;
}
}
Here, we have 4 cases:
pointer_left == mid + 1: in this case, the left subarray is finished and all of its elements have already been merged.
pointer_right == j + 1: in this case, the right subarray is finished and all of its elements have already been merged.
a[pointer_left] < a[pointer_right]: here, none of the 2 arrays have finished. However, the pointer_left points to a smaller element than pointer_right and so, we put that in the merged array.
else the last case: here, none of the 2 arrays have finished. However, the pointer_right points to a smaller element than pointer_left and so, we put that in the merged array.
Finally, we copy the elements from aux[] to a[].
for (k = i; k <= j; k++) {
a[k] = aux[k];
}
So, that’s how merge sort works.
Another sorting algorithms Binary Search, Quick Sort.
0 Comments