Sunday, January 17, 2010

Merge Sort in Java

Now that we've examined one of the most simplest and inefficient sorting algorithms (Bubble Sort), lets have a look at O(nlogn) sorts. To start with, we shall look into Merge Sort.
Merge Sort follows divide and conquer technique. The following is a simple pictorial illustration of the same.

There will be two parts in this sort. One would be halving the input numbers O(logn) and the other would be merging them back O(n) which results in a comprehensive O(nlogn) time. Wait till you see the running version of this sort technique and the time taken to sort a million entries.

First for the code part,
package dsa.sorting;

public class MergeSortOptimized
{

    public static int SIZE = 1000000;

    public static void main(String args[])
    {
        int[] a = new int[SIZE];
        for (int i = 0; i < SIZE; i++) {
            a[i] = (int) (Math.random() * SIZE);
        }
        long start = System.currentTimeMillis();
        MergeSortOptimized mNew = new MergeSortOptimized();
        mNew.sort(a);
        long end = System.currentTimeMillis();
        System.out.println("Time taken to sort a million elements : "
                + (end - start) + " milliseconds");
    }

    public int[] sort(int[] input)
    {
        int[] temp = new int[input.length];
        mergeSort(input, temp, 0, input.length - 1);
        return input;
    }

    public void mergeSort(int[] fromArray, int[] toArray, int left, int right)
    {
        if (left < right) {
            int center = (left + right) / 2;
            mergeSort(fromArray, toArray, left, center);
            mergeSort(fromArray, toArray, center + 1, right);
            merge(fromArray, toArray, left, center + 1, right);
        }
    }

    public void merge(int[] fromArray, int[] toArray, int leftPos,
            int rightPos, int rightEnd)
    {
        int leftEnd = rightPos - 1;
        int tempPos = leftPos;

        int numElements = rightEnd - leftPos + 1;

        while (leftPos <= leftEnd && rightPos <= rightEnd) {
            if (fromArray[leftPos] < fromArray[rightPos]) {
                toArray[tempPos++] = fromArray[leftPos++];
            }
            else {
                toArray[tempPos++] = fromArray[rightPos++];
            }
        }

        while (leftPos <= leftEnd) {
            toArray[tempPos++] = fromArray[leftPos++];
        }
        while (rightPos <= rightEnd) {
            toArray[tempPos++] = fromArray[rightPos++];
        }

        for (int i = 0; i < numElements; i++, rightEnd--) {
            fromArray[rightEnd] = toArray[rightEnd];
        }

    }

}
// Time taken to sort a million elements : 247 milliseconds

As you may observe, the above technique just took 247 milliseconds to sort a million elements whereas the bubble sort tooks almost 28 seconds to sort a hundred thousand elements!!!!! This is due to the logarithmic order of the sort.

Pros:
1) Works in O(nlogn) time.
2) Worst case of merge sort is equal to best case of a quick sort! (We shall discuss more about in the coming sections).

Cons:
1) Consumes extra space.
2) Due to the recursive nature of the algorithm, may eat up stack.

Thanks for reading my post. We shall explore more sorting techniques in the coming days. Till then, bye for now from,
Bragaadeesh.

No comments: