lecture 6...the algorithm’s job is to output a correctly sorted list of all the objects. is...
TRANSCRIPT
Lecture6SortinglowerboundsonO(n)-timesorting
Announcements
• HW2dueFriday
• PleasesendanyOAEletterstoLunaFrank-Fischer([email protected]) byApril28.
Sorting
• We’veseenafewO(nlog(n))-time algorithms.
• MERGESORThasworst-caserunningtimeO(nlog(n))
• QUICKSORThasexpectedrunningtimeO(nlog(n))
Canwedobetter?
Dependsonwho
youask…
AnO(1)-timealgorithmforsorting:
StickSort
• Problem:sortthesesticksbylength.
• Algorithm:• Dropthemonatable.
• Nowtheyaresortedthisway.
Thatmayhavebeenunsatisfying
• ButStickSort doesraisesomeimportantquestions:
• Whatisourmodelofcomputation?
• Input: array
• Output:sortedarray
• Operationsallowed:comparisons
-vs-
• Input:sticks
• Output:sortedsticksinverticalorder
• Operationsallowed:droppingontables
• Whatarereasonablemodelsofcomputation?
Today:two(more)models
• Comparison-basedsortingmodel
• ThisincludesMergeSort,QuickSort,InsertionSort
• We’llseethatany algorithminthismodelmusttakeatleastΩ(nlog(n))steps.
• Anothermodel(morereasonablethanthestickmodel…)
• BucketSort andRadixSort
• BothrunintimeO(n)
Comparison-basedsorting
Comparison-basedsortingalgorithms
Thereisagenie whoknowswhat
therightorderis.
ThegeniecananswerYES/NO
questionsoftheform:
is [this] bigger than [that]?Algorithm
Wanttosorttheseitems.
There’ssomeorderingonthem,butwedon’tknowwhatitis.
Is bigger than ?
YES
Thealgorithm’sjobisto
outputacorrectlysorted
listofalltheobjects.
isshorthandfor
“thefirstthingintheinputlist”
Allthesortingalgorithmswehaveseenworklikethis.
7 6 3 5 1 4 2eg,QuickSort:
Is bigger than ? 7 5
Is bigger than ?
Is bigger than ?
6
3
5
5
YES
YES
NO
7 6 3
5 etc.
Pivot!
LowerboundofΩ(nlog(n)).
• Theorem:
• Anydeterministiccomparison-basedsortingalgorithm musttakeΩ(nlog(n))steps.
• Anyrandomized comparison-basedsortingalgorithm musttakeΩ(nlog(n))stepsinexpectation.
• Howmightweprovethis?
1. Considerallcomparison-basedalgorithms,one-by-one,andanalyzethem.
2. Don’tdothat.
Instead,arguethatallcomparison-basedsorting
algorithmsgiverisetoadecisiontree.
Thenanalyzedecisiontrees.
Decisiontrees
Sortthesethreethings. ?≤
YESNO
≤
YES
?
NO
≤ ?YES NO
etc…
Allcomparison-basedalgorithmslooklikethis
Example:Sortthese
threethingsusing
QuickSort.
≤
NO
?
YES
L RRL
≤ ?NOYES
L RL RReturn ≤
NOYES
?Thenwe’redone
(aftersomebase-
casestuff)
Now
recurse
onR
Pivot!
L RL R
Pivot!
Return ReturnIneithercase,we’redone
(aftersomebasecasestuffand
returningrecursivecalls).
etc...
Allcomparison-basedalgorithmshaveanassociateddecisiontree.
YES NO
?
??
YESNOYES
NO
????
Theleavesofthis
treeareallpossible
orderingsofthe
items:whenwe
reachaleafwe
returnit.
Whatdoesthedecision
treeforMERGESORTING
fourelementslooklike?
Olliethe
over-achievingostrich
Runningthealgorithmonagiven
inputcorrespondstotakinga
particularpaththroughthetree.
What’stheruntimeonaparticularinput?
YES NO
?
??
YESNOYES
NO
????
Ifwetakethispaththrough
thetree,theruntimeis
Ω(lengthofthepath).
Atleastthenumber
ofcomparisonsthat
aremadeonthat
input.
What’stheworst-case runtime?
YES NO
?
??
YESNOYES
NO
????
AtleastΩ(lengthofthelongestpath).
Howlongisthelongestpath?
YESNO
?
??YES
NOYESNO
????
• Thisisabinarytreewithat
least_____leaves.
• Theshallowesttreewithn!
leavesisthecompletely
balancedone,whichhas
depth______.
• Soinallsuchtrees,the
longestpathisatleastlog(n!).
n!
log(n!)
• n!isabout(n/e)n (Stirling’s formula).
• log(n!)isaboutnlog(n/e)=Ω(nlog(n)).
Conclusion:thelongestpath
haslengthatleastΩ(nlog(n)).
beingsloppyabout
floorsandceilings!
Wewantastatement:inallsuchtrees,
thelongestpathisatleast_____
LowerboundofΩ(nlog(n)).• Theorem:• Anydeterministiccomparison-basedsortingalgorithm musttakeΩ(nlog(n))steps.
• Proof:• Anydeterministiccomparison-basedalgorithmcanberepresentedasadecisiontreewithn!leaves.
• Theworst-caserunningtimeisatleastthedepthofthedecisiontree.
• Alldecisiontreeswithn!leaveshavedepthΩ(nlog(n)).
• Soanycomparison-basedsortingalgorithmmusthaveworst-caserunningtimeatleast Ω(nlog(n)).
\end{Aside}
• Forexample,QuickSort?
• Theorem:
• Anyrandomized comparison-basedsortingalgorithmmusttakeΩ(nlog(n))stepsinexpectation.
• Proof:
• attheendoftodayiftime
• otherwiseseelecturenotes
• (sameideasasdeterministiccase)
Trytoprovethis
yourself!
We’llseethisatthe
endoftoday’slecture
ifthere’stime.
Ollietheover-achievingostrich
Aside:Whataboutrandomizedalgorithms?
ButwhataboutStickSort?
• Thisisoneofthecoolthingsaboutlowerboundslikethis:weknowwhenwecandeclarevictory!
So,MergeSort isoptimal!
• StickSort can’tbeimplementedasacomparison-basedsortingalgorithm.Sotheselowerboundsdon’tapply.
• ButStickSort waskindofdumb.EspeciallyifIhave
tospendtime
cuttingallthose
stickstobethe
rightsize!Butmighttherebeanothermodelofcomputationthat’slessdumb,inwhichwecansortfaster?
Beyondcomparison-basedsortingalgorithms
Anothermodelofcomputation
• Theitemsyouaresortinghavemeaningfulvalues.
9 6 3 5 2 1 2
insteadof
Whymightthishelp?
BucketSort: 9 6 3 5 2 1 2
1 2 3 4 5 6 7 8 9
963 521
2
SORTED!IntimeO(n).
Implementthebuckets as
linkedlists.Theyarefirst-in,
first-out.
Concatenate
thebuckets!
Note:thisisasimplificationof
whatCLRScalls“BucketSort”
Issues
• Needtobeabletoknowwhatbuckettoputsomethingin.
• That’sokayfornow:it’spartofthemodel.
• Needtoknowwhatvaluesmightshowupaheadoftime.
2 12345 13 21000 50 100000000 1
Onesolution:RadixSort• Idea:BucketSort ontheleast-significantdigitfirst,thenthenextleast-significant,andsoon.
1 2 3 4 5 6 7 8 9
21 345 13 101 50 234 1
0
345
50 1321
101
1
234
Step1:BucketSort onLSB:
50 21 101 1 13 234 345
Step2:BucketSort onthe2nd digit
1 2 3 4 5 6 7 8 90
50 21 101 1 13 234 345
502113101
234
1 345
101 1 13 21 234 345 50
Step3:BucketSort onthe3rd digit
1 2 3 4 5 6 7 8 90
50
21
13
101
234
1
345
1 13 21 50 101 234 345
101 1 13 21 234 345 50
Itworked!!
Whydoesthiswork?
21 345 13 101 50 234 1
50 21 101 1 13 234 345
1 13 21 50 101 234 345
101 1 13 21 234 345 50
Originalarray:
Nextarrayissortedbythefirstdigit.
Nextarrayissortedbythefirsttwodigits.
Nextarrayissortedbyallthreedigits.
Sortedarray
50 21 101 1 13 234 345
101 01 13 21 234 345 50
001 013 021 050 101 234 345
Formally…
• Argumentvialoopinvariant(akainduction).
• LoopInvariant:
Ollietheover-achievingostrichLuckythelackadaisicallemur
Thisisthe
outline ofa
proof,nota
formalproof.
Makethis
formal!(orsee
lecturenotes).
Whydoesthiswork?
21 345 13 101 50 234 1
50 21 101 1 13 234 345
1 13 21 50 101 234 345
101 1 13 21 234 345 50
Originalarray:
Nextarrayissortedbythefirstdigit.
Nextarrayissortedbythefirsttwodigits.
Nextarrayissortedbyallthreedigits.
Sortedarray
50 21 101 1 13 234 345
101 01 13 21 234 345 50
001 013 021 050 101 234 345
Formally…
• Argumentvialoopinvariant(akainduction).
• LoopInvariant:
• Afterthek’th iteration,thearrayissortedbythefirstkleast-significantdigits.
• Basecase:
• “Sortedby0least-significantdigits”meansnotsorted.
• Inductivestep:
• (Youfillin…)
• Termination:
• Afterthed’th iteration,thearrayissortedbythedleast-significantdigits.Aka,it’ssorted.
Ollietheover-achievingostrichLuckythelackadaisicallemur
Thisisthe
outline ofa
proof,nota
formalproof.
Makethis
formal!(orsee
lecturenotes).
Pluckythepedanticpenguin
Thisneedstouse:(1)bucketsort
works,and(2)wetreateachbucket
asaFIFOqueue.
Whatistherunningtime?
• Dependsonhowmanydigitsthebiggestnumberhas.• Sayd-digitnumbers.
• Therearediterations
• EachiterationtakestimeO(n+10)• Wecanchangethe10intoan“r:”thisisthe“radix”
• Example:ifr=2,wewriteeverythinginbinaryandonlyhavetwobuckets.
• Example:Ifr=10000000,wewriteeverythingbase-10000000andhave10000000buckets.
• Example:ifr=n,wewriteeverythinginbase-nandhavenbuckets.
• TimeisO(d(n+r)).
• Ifd=O(1)andr=O(n),runningtimeO(n).
SothisisaO(n)-timesortingalgorithm!
Howbigcanthe
biggestnumberbeif
d=O(1)andr=n?
Thestorysofar
• Ifweuseacomparison-basedsortingalgorithm,itMUSTrunintimeΩ(nlog(n)).
• Ifweassumethatwecandoalittlemorethancomparethevalues,wehaveanO(n)-time sortingalgorithm.
9 6 3 5 2 1 2
Whywouldweeverusea
comparison-basedsortingalgorithm??
Whywouldweeveruseacomparison-basedsortingalgorithm?
• dmightnotbe“constant.”(aka,itmightbebig)
• RadixSort needsextramemoryforthebuckets.• Notin-place
• Iwanttosortemojibytalkingtoagenie.• RadixSort makesmoreassumptionsontheinput.
�123456
987654 � 140! 2.1234123 2n 42
• Wecancomparetheseprettyquickly(justlookatthemost-significantdigit):
• � =3.14….
• e=2.78….
• ButtodoRadixSort we’dhavetolookateverydigit.
• Thisisespeciallyproblematicsincebothofthesehaveinfinitelymanydigits...
Dowehavetimeforthelowerboundonrandomizedalgorithms?
Ifso…
• Recallthelowerboundforadeterministicalgorithm.
YESNO
?
??YES
NOYESNO
????
• Thelongestpathinthis
treehaslength
Ω(nlog(n)).
• Therunningtimeofthe
algorithmisatleastthe
lengthofthispath.
Adifferentmodel
• Howaboutadeterministicalgorithmonarandominput?
• Notworst-casemodel.
• Notourrandomizedalgorithmmodeleither.
YESNO
?
??YES
NOYESNO
????
• Thelongestpathinthis
treehaslength
Ω(nlog(n)).
• Therunningtimeofthe
algorithmisatleastthe
lengthofthispath.
average
average
average
SoadeterministicalgorithmmusttaketimeΩ(nlog(n))evenonrandominputs.
Thisisaprettystrongstatement!
• Before:
Ifanadversary
getstopickthe
input,weneed
timeΩ(nlog(n)).
• Now:
Iftheinputis
chosenrandomly,
westill needtime
Ω(nlog(n)).
Butwhatdoesthatmodelhavetodowithanything?
Deterministic
algorithmon
randominput
Randomized
algorithmon
worst-case input
Itturnsoutthatinthiscase,
Adeterministicalgorithm
musttaketimeΩ(nlog(n))
evenonrandominputs.
Arandomizedalgorithm
musttaketimeΩ(nlog(n))
onworst-caseinputs.
Thisiswhatwewanted.
Randomized
algorithmon
random input
doesat
leastas
wellas
doesat
leastas
wellas
Andwejustshowedthat
thisdidn’tdoverywell.
Theargumenthereis
prettysubtle!Understand
whyitmakessense!
Recap
• Binarysearchtrees
NextTime
• Howdifficultaproblemisdependsonthemodelofcomputation.
• Howreasonableamodelofcomputationisisupfordebate.
• StickSort cansortsticksinO(1)time.
• RadixSort cansortsmallishintegersinO(n) time.
• Ifwewanttosortemoji(orarbitrary-precisionnumbers),werequireΩ(nlog(n)) time(likeMergeSort).