introduction to algorithms - mit opencourseware · time of randomized quicksort on an input of size...
TRANSCRIPT
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.1
Introduction to Algorithms6.046J/18.401J
LECTURE 4Quicksort• Divide and conquer• Partitioning• Worst-case analysis• Intuition • Randomized quicksort• Analysis
Prof. Charles E. Leiserson
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.2
Quicksort
• Proposed by C.A.R. Hoare in 1962.• Divide-and-conquer algorithm.• Sorts “in place” (like insertion sort, but not
like merge sort).• Very practical (with tuning).
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.3
Divide and conquerQuicksort an n-element array:1. Divide: Partition the array into two subarrays
around a pivot x such that elements in lower subarray ≤ x ≤ elements in upper subarray.
2. Conquer: Recursively sort the two subarrays.3. Combine: Trivial.
≤ x≤ x xx ≥ x≥ x
Key: Linear-time partitioning subroutine.
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.4
Partitioning subroutine
Running time= O(n) for nelements.
Running time= O(n) for nelements.
PARTITION(A, p, q) ⊳ A[p . . q] x ← A[p] ⊳ pivot = A[p]i ← pfor j ← p + 1 to q
do if A[ j] ≤ xthen i ← i + 1
exchange A[i] ↔ A[ j]exchange A[p] ↔ A[i]return i
Invariant: xx ≤ x≤ x ≥ x≥ x ??qjp i
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.5
Example of partitioning
i j66 1010 1313 55 88 33 22 1111
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.6
Example of partitioning
i j66 1010 1313 55 88 33 22 1111
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.7
Example of partitioning
i j66 1010 1313 55 88 33 22 1111
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.8
Example of partitioning
66 1010 1313 55 88 33 22 1111
66i j55 1313 1010 88 33 22 1111
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.9
Example of partitioning
66 1010 1313 55 88 33 22 1111
66i j55 1313 1010 88 33 22 1111
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.10
Example of partitioning
66 1010 1313 55 88 33 22 1111
66i j55 1313 1010 88 33 22 1111
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.11
Example of partitioning
66 1010 1313 55 88 33 22 1111
66 55 1313 1010 88 33 22 1111
66 55i j33 1010 88 1313 22 1111
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.12
Example of partitioning
66 1010 1313 55 88 33 22 1111
66 55 1313 1010 88 33 22 1111
66 55i j33 1010 88 1313 22 1111
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.13
Example of partitioning
66 1010 1313 55 88 33 22 1111
66 55 1313 1010 88 33 22 1111
66 55 33 1010 88 1313 22 1111
66 55 33i j22 88 1313 1010 1111
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.14
Example of partitioning
66 1010 1313 55 88 33 22 1111
66 55 1313 1010 88 33 22 1111
66 55 33 1010 88 1313 22 1111
66 55 33i j22 88 1313 1010 1111
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.15
Example of partitioning
66 1010 1313 55 88 33 22 1111
66 55 1313 1010 88 33 22 1111
66 55 33 1010 88 1313 22 1111
66 55 33i j22 88 1313 1010 1111
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.16
Example of partitioning
66 1010 1313 55 88 33 22 1111
66 55 1313 1010 88 33 22 1111
66 55 33 1010 88 1313 22 1111
66 55 33 22 88 1313 1010 1111
22 55 33i66 88 1313 1010 1111
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.17
Pseudocode for quicksortQUICKSORT(A, p, r)
if p < rthen q ← PARTITION(A, p, r)
QUICKSORT(A, p, q–1)QUICKSORT(A, q+1, r)
Initial call: QUICKSORT(A, 1, n)
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.18
Analysis of quicksort
• Assume all input elements are distinct.• In practice, there are better partitioning
algorithms for when duplicate input elements may exist.
• Let T(n) = worst-case running time on an array of n elements.
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.19
Worst-case of quicksort
• Input sorted or reverse sorted.• Partition around min or max element.• One side of partition always has no elements.
)()()1(
)()1()1()()1()0()(
2nnnT
nnTnnTTnT
Θ=
Θ+−=Θ+−+Θ=Θ+−+=
(arithmetic series)
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.20
Worst-case recursion treeT(n) = T(0) + T(n–1) + cn
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.21
Worst-case recursion treeT(n) = T(0) + T(n–1) + cn
T )(n
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.22
Worst-case recursion treeT(n) = T(0) + T(n–1) + cn
cnT(0) T(n–1)
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.23
Worst-case recursion treeT(n) = T(0) + T(n–1) + cn
cnT(0) c(n–1)
T(0) T(n–2)
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.24
Worst-case recursion treeT(n) = T(0) + T(n–1) + cn
cnT(0) c(n–1)
T(0) c(n–2)
T(0)
Θ(1)
O
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.25
Worst-case recursion tree
cnT(0) c(n–1)
T(n) = T(0) + T(n–1) + cn
T(0) c(n–2)
T(0)
Θ(1)
O
( )2
1nk
n
kΘ=⎟⎟
⎠
⎞⎜⎜⎝
⎛Θ ∑
=
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.26
Worst-case recursion tree
cnΘ(1) c(n–1)
T(n) = T(0) + T(n–1) + cn
Θ(1) c(n–2)
Θ(1)
Θ(1)
O
( )2
1nk
n
kΘ=⎟⎟
⎠
⎞⎜⎜⎝
⎛Θ ∑
=
T(n) = Θ(n) + Θ(n2)= Θ(n2)
h = n
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.27
Best-case analysis(For intuition only!)
If we’re lucky, PARTITION splits the array evenly:T(n) = 2T(n/2) + Θ(n)
= Θ(n lg n) (same as merge sort)
What if the split is always 109
101 : ?
( ) ( ) )()( 109
101 nnTnTnT Θ++=
What is the solution to this recurrence?
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.28
Analysis of “almost-best” case)(nT
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.29
Analysis of “almost-best” casecn
( )nT 101 ( )nT 10
9
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.30
Analysis of “almost-best” casecn
cn101 cn10
9
( )nT 1001 ( )nT 100
9 ( )nT 1009 ( )nT 100
81
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.31
Analysis of “almost-best” casecn
cn101 cn10
9
cn1001 cn100
9 cn1009 cn100
81
Θ(1)
Θ(1)
… …log10/9n
cn
cn
cn
…O(n) leavesO(n) leaves
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.32
Analysis of “almost-best” case
log10n
cn
cn101 cn10
9
cn1001 cn100
9 cn1009 cn100
81
Θ(1)
Θ(1)
… …log10/9n
cn
cn
cn
T(n) ≤ cn log10/9n + Ο(n)cn log10n ≤
O(n) leavesO(n) leaves
Θ(n lg n)Lucky!
…
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.33
More intuitionSuppose we alternate lucky, unlucky, lucky, unlucky, lucky, ….
L(n) = 2U(n/2) + Θ(n) luckyU(n) = L(n – 1) + Θ(n) unlucky
Solving:L(n) = 2(L(n/2 – 1) + Θ(n/2)) + Θ(n)
= 2L(n/2 – 1) + Θ(n)= Θ(n lg n)
How can we make sure we are usually lucky?Lucky!
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.34
Randomized quicksortIDEA: Partition around a random element.• Running time is independent of the input
order.• No assumptions need to be made about
the input distribution.• No specific input elicits the worst-case
behavior.• The worst case is determined only by the
output of a random-number generator.
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.35
Randomized quicksort analysis
Let T(n) = the random variable for the running time of randomized quicksort on an input of size n, assuming random numbers are independent.For k = 0, 1, …, n–1, define the indicator random variable
Xk =1 if PARTITION generates a k : n–k–1 split,0 otherwise.
E[Xk] = Pr{Xk = 1} = 1/n, since all splits are equally likely, assuming elements are distinct.
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.36
Analysis (continued)
T(0) + T(n–1) + Θ(n) if 0 : n–1 split,T(1) + T(n–2) + Θ(n) if 1 : n–2 split,
MT(n–1) + T(0) + Θ(n) if n–1 : 0 split,
T(n) =
( )∑−
=
Θ+−−+=1
0)()1()(
n
kk nknTkTX
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.37
Calculating expectation( )⎥
⎦
⎤⎢⎣
⎡Θ+−−+= ∑
−
=
1
0)()1()()]([
n
kk nknTkTXEnTE
Take expectations of both sides.
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.38
Calculating expectation( )
( )[ ]∑
∑−
=
−
=
Θ+−−+=
⎥⎦
⎤⎢⎣
⎡Θ+−−+=
1
0
1
0
)()1()(
)()1()()]([
n
kk
n
kk
nknTkTXE
nknTkTXEnTE
Linearity of expectation.
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.39
Calculating expectation( )
( )[ ]
[ ] [ ]∑
∑
∑
−
=
−
=
−
=
Θ+−−+⋅=
Θ+−−+=
⎥⎦
⎤⎢⎣
⎡Θ+−−+=
1
0
1
0
1
0
)()1()(
)()1()(
)()1()()]([
n
kk
n
kk
n
kk
nknTkTEXE
nknTkTXE
nknTkTXEnTE
Independence of Xk from other random choices.
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.40
Calculating expectation( )
( )[ ]
[ ] [ ]
[ ] [ ] ∑∑∑
∑
∑
∑
−
=
−
=
−
=
−
=
−
=
−
=
Θ+−−+=
Θ+−−+⋅=
Θ+−−+=
⎥⎦
⎤⎢⎣
⎡Θ+−−+=
1
0
1
0
1
0
1
0
1
0
1
0
)(1)1(1)(1
)()1()(
)()1()(
)()1()()]([
n
k
n
k
n
k
n
kk
n
kk
n
kk
nn
knTEn
kTEn
nknTkTEXE
nknTkTXE
nknTkTXEnTE
Linearity of expectation; E[Xk] = 1/n .
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.41
Calculating expectation( )
( )[ ]
[ ] [ ]
[ ] [ ]
[ ] )()(2
)(1)1(1)(1
)()1()(
)()1()(
)()1()()]([
1
1
1
0
1
0
1
0
1
0
1
0
1
0
nkTEn
nn
knTEn
kTEn
nknTkTEXE
nknTkTXE
nknTkTXEnTE
n
k
n
k
n
k
n
k
n
kk
n
kk
n
kk
Θ+=
Θ+−−+=
Θ+−−+⋅=
Θ+−−+=
⎥⎦
⎤⎢⎣
⎡Θ+−−+=
∑
∑∑∑
∑
∑
∑
−
=
−
=
−
=
−
=
−
=
−
=
−
=
Summations have identical terms.
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.42
Hairy recurrence
[ ] )()(2)]([1
2nkTE
nnTE
n
kΘ+= ∑
−
=
(The k = 0, 1 terms can be absorbed in the Θ(n).)
Prove: E[T(n)] ≤ an lg n for constant a > 0 .• Choose a large enough so that an lg n
dominates E[T(n)] for sufficiently small n ≥ 2.
Use fact: 21
2812
21 lglg nnnkk
n
k∑−
=−≤ (exercise).
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.43
Substitution method
[ ] )(lg2)(1
2nkak
nnTE
n
kΘ+≤ ∑
−
=
Substitute inductive hypothesis.
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.44
Substitution method
[ ]
)(81lg
212
)(lg2)(
22
1
2
nnnnna
nkakn
nTEn
k
Θ+⎟⎠⎞⎜
⎝⎛ −≤
Θ+≤ ∑−
=
Use fact.
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.45
Substitution method
[ ]
⎟⎠⎞⎜
⎝⎛ Θ−−=
Θ+⎟⎠⎞⎜
⎝⎛ −≤
Θ+≤ ∑−
=
)(4
lg
)(81lg
212
)(lg2)(
22
1
2
nannan
nnnnna
nkakn
nTEn
k
Express as desired – residual.
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.46
Substitution method
[ ]
nan
nannan
nnnnna
nkakn
nTEn
k
lg
)(4
lg
)(81lg
212
)(lg2)(
22
1
2
≤
⎟⎠⎞⎜
⎝⎛ Θ−−=
Θ+⎟⎠⎞⎜
⎝⎛ −=
Θ+≤ ∑−
=
,if a is chosen large enough so that an/4 dominates the Θ(n).
September 21, 2005 Copyright © 2001-5 by Erik D. Demaine and Charles E. Leiserson L4.47
Quicksort in practice
• Quicksort is a great general-purpose sorting algorithm.
• Quicksort is typically over twice as fast as merge sort.
• Quicksort can benefit substantially from code tuning.
• Quicksort behaves well even with caching and virtual memory.