Class 1: Introduction
- Definitions.
- An algorithm is a
well-defined computational procedure that takes a set of values as input
and produces a set of values as output.
- An
algorithm can also be thought of as a tool for solving a computational
problem that specifies in general terms the desired
input/output relationship. For example, the sorting problem:
Input: A sequence of n numbers <a1, ... , an>.
Output: Reordering of the input sequence <a1', ... , an'> such as
a1' <= a2' <= ... <= an'
- An
algorithm is correct if for every input instance, it
finishes with the correct output. We say that a correct algorithm solves
the given computational problem.
- A data
structure is a way to store and organize data in order to
facilitate access and modifications.
- There
are problems for which no efficient solution is known. An interesting
subset of these problems is called NP-complete. These
problems have a remarkable property that if an efficient algorithm exists
for any one of them, then efficient algorithms exist for all of them.
- Examples
of problems solved by algorithms.
- Internet
related problems:
- Finding
good routs on which data will travel.
- Implementing
search engines to quickly find pages on which particular information
resides.
- Financial
problems.
- Find
price of a complex financial instrument using simulated environments.
- Value
a large portfolio of securities.
- Information
processing.
- Data
base related problems such as records search, insertion, deletion.
- XML
based processing.
- Efficiency
of algorithms.
- Two
algorithms that are designed to solve the same problem may differ
dramatically in their efficiency. For example, compare an insertion sort
algorithm with merge sort algorithm.
- While
the difference may be small for small input size, it becomes very
pronounced for large size inputs, where time may really matter.
- Insertion
sort.
- The
procedure takes as a parameter an array A[1..n] containing a sequence of n
elements to be sorted. The input numbers are sorted in place.
- The
algorithm works by finding a correct position for each number among the
numbers sorted so far. The number slides into the right position by
comparing with the sorted numbers. There are two approaches:
- Compare
from the beginning of the array. In this case all array would need to be
shifted forward.
- Compare
from the end of an array. This requires shifting one number at a time.
- The
following is a pseudocode.
INSERTION-SORT (A, n)
1 for
j=2 to n
2 key=A[j]
3 // insert A[j] into sorted A[1..j-1]
4 i=j-1
5 while i>0 and A[i]>key
6 A[i+1] = A[i]
7 i = i-1
8 A[i+1] = key
- Loop
invariants
- At
the start of the for loop of lines 1-8, the subarray A[1..j-1]
consists of the elements originally i A[1..j-1] but in sorted order. Need
to show the following:
- Initialization:
It is true prior to the first iteration of the loop.
- One
element is always sorted.
- Maintenance:
It remains true before the next iteration
- An
array s moved to find the right place for A[j], henceit remains sorted.
- Termination:
When the loop terminates, the invariant gives us a useful property that
help to show that the algorithm is correct.
- When
the loop ends A[1..n] array is sorted.
- Analyzing
algorithms.
- Analyzing
an algorithm means predicting the resources that the algorithm requires.
Most often we want to measure computational time.
- We
assume a generic one-processor random-access machine (RAM)
model of computation. In the RAM model, instructions are executed one
after another, with no concurrent operations.
- The
RAM model contains arithmetic, data movement, and control instructions.
Each such instruction takes a constant amount of time.
- The
data types in the RAM model are integer and floating point.
- Analysis
of insertion sort.
- Note
that the loop statement is executed one extra time to exit from it.
- Let
tj be the number of time the while loop is executed for that value of j.
INSERTION-SORT (A, n) cost times
1 for
j=2 to n c1 n
2 key=A[j] c2 n-1
3 // insert A[j] into sorted A[1..j-1]
4 i=j-1 c4 n-1
5 while i>0 and A[i]>key c5 Sj=2..n tj
6 A[i+1] = A[i] c6 Sj=2..n (tj-1)
7 i = i-1 c7 Sj=2..n (tj-1)
8 A[i+1] = key c8 n-1
Total cost T(n) is:
T(n) = c1*n +c2*(n-1) + c4*(n-1) + c5*Sj=2..n
tj + c6*Sj=2..n(tj-1) + c7*Sj=2..n(tj-1) +
c8(n-1)
The best case occurs when the array is already
sorted:
T(n) = c1*n
+c2*(n-1) + c4*(n-1) + c8(n-1) = (c1+c2+c4+c5+c8)n – (c2+c4+c5+c8)
The running time can be expressed as an+b, i.e. a linear
function of n.
If the array is in reverse order – the worst case
results. In this case tj = j, which means
Sj=2..n (tj-1) = n(n-1)/2
T(n) = (c5/2+c6/2+c7/2)n2 +
(c1+c2+c4+c5/2-c6/2-c7/2+c8)n – (c2+c4+c5+c8)
The worst-case running time can be expressed as an^2+bn+c;
it is thus a quadratic function of n.
- Order
of growth
- It
is the rate of growth of the running time that really interests us. We
therefore consider only a leading term of a formula in the running time
and ignore the constant.
- We
write the insertion sort worst-case running time as Q(n^2).