Class 5: Quicksort

 

Quicksort is a sorting algorithm with worst-case running time of Q(n^2). However, this algorithm is often the best practical choice because it is very efficient on the average: Q(nlgn). It also has the advantage of sorting in place.

 

Description

 

The quicksort uses the divide-and-conquer paradigm:

 

 

Quicksort(A, p, r)

1     if p<r

2           q = Partition(A,p,r)

3           Quicksort(A,p,q-1)

4           Quicksort(A,q+1,r)

 

Partitioning algorithm is as follows.

 

Partition(A, p, r)

1     x = A[r]

2     i = p-1

3     for j = p to r-1

4           if A[j] <= x

5                 i = i+1

6                 <exchange A[i] with A[j]>

7      <exchange A[i+1] with A[r]>

8     return i+1

 

Example:

 

i           p,j                                            r

            7            4            6            3            5

 

i           p            j                                   r

            7            4            6            3            5

 

            p,i                    j                       r

            4            7            6            3            5

 

            p,i                                j            r

            4            7            6            3            5

            p            i                       j            r

            4            3            6            7            5

 

            p            i                       j            r

            4            3            5            7            6

 

As the procedure runs, the array is partitioned into four regions, which can be considered a loop invariant:

 

  1. p <= k <= i: A[k] <= x
  2. i+1 <= k <= j-1: A[k] > x
  3. k=r: A[k] = x
  4. j <= k <= r-1: undermined

 

We need to show that this invariant is true:

 

 

At lines 7-8 the pivot is moved to the correct place and its index is returned. To calculate the running time of Partition, assume n=p-r+1. The loop 3 is executed n-1 times with steps 4-6 taking constant time Q(1). Steps 7-8 take constant time as well. Hence the running time of partition is Q(n).

 

Performance

 

The running time of quicksort depends on whether partitioning is balanced or unbalanced. If it is balanced, the algorithm runs as fast as heapsort, otherwise, it can run asymptotically as slowly as insertion sort.

 

Worst-case partitioning

 

This occurs when partitioning routine produces one subproblem with (n-1) elements and one with 0 elements. In this case:

 

T(n) = T(n-1) + T(0) + Q(n) = T(n-1) + Q(n)

 

Intuitively, to sum the costs at each recursion level we get:

 

n + (n-1) + ... + 1 = (n+1)/2 * n = Q(n^2)

It is easy to prove that the recurrence T(n) = T(n-1) + Q(n) indeed has the solution Q(n^2).

 

Note that, the worst case partitioning occurs when the input array is already sorted.

 

Best-case partitioning

 

This is the most even split in which each subproblem has no more than n/2 elements. The recurrence is:

 

T(n) <= 2T(n/2) + Q(n)

 

According to the master theorem's case 2, the above recurrence has the solution O(nlgn).

 

Balanced partitioning

 

The average-case running time of quicksort is much closer to the best case than to the worst case. The key to understanding this is to understand how the balance of partitioning is reflected on the recurrence.

 

For example, assume the partitioning algorithm always produces 9:1 splitting ratio, which appears unbalanced. We have:

 

T(n) <= T(9n/10) + T(n/10) + cn

 

The recursion tree shows that that the cost at each level is cn until the boundary level log10n = Q(lgn), and then levels have cost less than cn. The recursion stops at depth log10/9(n) = Q(lgn). In fact, any split of constant proportionality yields a recursion tree of depth Q(lgn), where the cost at each level is cn.

 

Intuition for average case

 

To perform the average case analysis we make an assumption that all permutations of the input numbers are equally likely. When we run quicksort on a random array, it is unlikely that the partitioning always happens in the same way on each level. We expect splits to be balanced and unbalanced.

 

In the average case, Partition produces a mix of "good" and "bad" splits, that are distributed randomly across the recursion tree. For the sake of intuition assume that the best split follows the worst split:

 

                        n

0                                              n-1                                           Q(n)

                        (n-1)/2-1              (n-1)/2

 

Note that, this achieves in 2 steps what the best partitioning can achieve in one step. Intuitively, the Q(n-1) cost of extra split can be absorbed by Q(n) cost.

 

Running time analysis

 

Randomized quicksort

 

In the randomized version of the quicksort we simply pick a pivot element randomly:

 

Randomized-Partition(A,p,r)

1     i = Random(p, r) // pick a random number between p and r

2     <swap A[r] with A[i]>

3     return Partition(A,p,r)

 

Randomized-Quicksort

1     if p<r

2           q = Randomized-Partition(A,p,r)

3           Randomized-Quicksort(A,p,q-1)

4           Randomized-Quicksort(A,q+1,r)

 

Worst-case analysis

 

We saw that the worst-case split produces a Q(n^2) running time, which intuitively is the worst-case running time of the algorithm. We prove it more rigorously below.

 

First, we use substitution method to show that the running rime of quicksort is O(n^2). We have the following recurrence for the worst-case running time T(n):

 

T(n) = max1<=q<=n(T(q-1) + T(n-q)) + Q(n)

 

We guess that T(n) <= cn^2. Substituting the guess we have:

 

T(n) <= max1<=q<=n(c(q-1)^2 + c(n-q)^2) + Q(n) =

            c max1<=q<=n((q-1)^2 + (n-q)^2) + Q(n)

 

Analyze the expression:

 

f(q) = (q-1)^2 + (n-q)^2

 

f'(q) = 2(q-1) – 2(n-q) = 4q – 2n – 2

 

f'(q) = 0 <=> q = (n+1)/2

 

Also, f'(q) < 0 <=> 4q –2n – 2 < 0 <=> q < (n+1)/2

 

f'(q) > 0 <=> q > (n+1)/2

 

This means f(q) has a global minimum at q=(n+1)/2. On the interval [1,n] it can achieve maximum at the endpoints of the interval:

 

f(1) = f(n) = (n-1)^2

 

Hence

 

max1<=q<=n((q-1)^2 + (n-q)^2) = (n-1)^2 = n^2 –2n + 1

 

T(n) = cn^2 – c(2n-1) + Q(n) <= cn^2

 

The last expression holds because we can pick the constant c large enough so that c(2n-1) dominates Q(n). Hence T(n) = O(n^2). We can also show that the original recurrence has a solution W(n^2). Hence the worst-case running time of quicksort is Q(n^2).

 

Expected running time

 

The running time of quicksort is dominated by the time spent in Partition. Note that there can be at most n calls to Partition since once the pivot element is selected it does not participate in future calls to Partition.

 

One call to Partition takes O(1) time plus the time proportional to the number of iterations in for loop 3-6. Each iteration involves comparison at line 4, so we need to count the total number of times the line 4 is executed.

 

Lemma 7.1

 

Let X be the number of comparisons performed in line 4. Then the running time of quicksort is O(n+X).

 

Proof. There are n calls to partition. Each call does constant amount of work, then executes the for loop some number of times. Each such iteration executes line 4. ÿ

 

Hence we need to compute X. Rename element in A as sorted z1, ... , zn. Also, define the set Zij = {zi, ... , zj} to be the set of elements between z_i and z_j.

 

Note that, the elements are compared only to the pivot element at each call of the partition and after the call are never compared o any other element. Define the indicator random variable:

 

X_ij = I{z_i is compared to z} (=1 if yes, = 0 otherwise)

 

Since each pair of two numbers is compared at most once, we have:

 

X = sum_i=1,n-1(sum_j=i+1,n(X_ij))

 

Taking expectations of both sides we have:

 

E[X] = E[sum_i=1,n-1(sum_j=i+1,n(X_ij))] =

 

sum_i=1,n-1(sum_j=i+1,n(E(X_ij))) =

 

sum_i=1,n-1(sum_j=i+1,n(Pr{z_i is compared to z_j}))

 

To calculate Pr{ z_i is compared to z_j } we need to analyze when two items are not compared. It is easy to see that, once a pivot x is chosen with

 

            z_i < x < z_j

 

we know that z_i and z_j will not be compared. On the other hand, if z_i is chosen as pivot item in Z_ij, it will be compared to each item in Z_ij except for itself. Hence z_i and z_j are compared if and only if the first element chosen as a pivot from Z_ij is either z_i or z_j.

 

We now compute the probability that this event occurs. To choose an element from Z_ij as a pivot, the whole set Z_ij is in the same partition. Since pivots are chosen randomly and independently we have:

 

Pr{z_i is compared to z_j} = Pr{z_i or z_j is first pivot chosen from Z_ij} =

           

            Pr{z_i is the first pivot chosen from Z_ij} +

            Pr{z_i is the first pivot chosen from Z_ij} =

 

            1/(j-i+1) + 1/(j-i+1) = 2/(j-i+1)

 

We have

 

E[X] = sum_i=1,n-1(sum_j=i+1,n(2/(j-i+1)))

 

We use change of variables k=j-i

 

E[X] = sum_i=1,n-1(sum_k=1,n-i(2/(k+1)))

 

            < sum_i=1,n-1(sum_k=1,n(2/(k)))

 

/* Harmonic series:

 

                sum_k=1,n(1/k) = lnn + O(1) */

 

            = sum_i=1,n-1(O(lgn))

 

            = O(n lgn)