lecture 6...the algorithm’s job is to output a correctly sorted list of all the objects. is...

39
Lecture 6 Sorting lower bounds on O(n)-time sorting

Upload: others

Post on 13-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Lecture6SortinglowerboundsonO(n)-timesorting

Page 2: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Announcements

• HW2dueFriday

• PleasesendanyOAEletterstoLunaFrank-Fischer([email protected]) byApril28.

Page 3: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Sorting

• We’veseenafewO(nlog(n))-time algorithms.

• MERGESORThasworst-caserunningtimeO(nlog(n))

• QUICKSORThasexpectedrunningtimeO(nlog(n))

Canwedobetter?

Dependsonwho

youask…

Page 4: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

AnO(1)-timealgorithmforsorting:

StickSort

• Problem:sortthesesticksbylength.

• Algorithm:• Dropthemonatable.

• Nowtheyaresortedthisway.

Page 5: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Thatmayhavebeenunsatisfying

• ButStickSort doesraisesomeimportantquestions:

• Whatisourmodelofcomputation?

• Input: array

• Output:sortedarray

• Operationsallowed:comparisons

-vs-

• Input:sticks

• Output:sortedsticksinverticalorder

• Operationsallowed:droppingontables

• Whatarereasonablemodelsofcomputation?

Page 6: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Today:two(more)models

• Comparison-basedsortingmodel

• ThisincludesMergeSort,QuickSort,InsertionSort

• We’llseethatany algorithminthismodelmusttakeatleastΩ(nlog(n))steps.

• Anothermodel(morereasonablethanthestickmodel…)

• BucketSort andRadixSort

• BothrunintimeO(n)

Page 7: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Comparison-basedsorting

Page 8: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Comparison-basedsortingalgorithms

Thereisagenie whoknowswhat

therightorderis.

ThegeniecananswerYES/NO

questionsoftheform:

is [this] bigger than [that]?Algorithm

Wanttosorttheseitems.

There’ssomeorderingonthem,butwedon’tknowwhatitis.

Is bigger than ?

YES

Thealgorithm’sjobisto

outputacorrectlysorted

listofalltheobjects.

isshorthandfor

“thefirstthingintheinputlist”

Page 9: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Allthesortingalgorithmswehaveseenworklikethis.

7 6 3 5 1 4 2eg,QuickSort:

Is bigger than ? 7 5

Is bigger than ?

Is bigger than ?

6

3

5

5

YES

YES

NO

7 6 3

5 etc.

Pivot!

Page 10: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

LowerboundofΩ(nlog(n)).

• Theorem:

• Anydeterministiccomparison-basedsortingalgorithm musttakeΩ(nlog(n))steps.

• Anyrandomized comparison-basedsortingalgorithm musttakeΩ(nlog(n))stepsinexpectation.

• Howmightweprovethis?

1. Considerallcomparison-basedalgorithms,one-by-one,andanalyzethem.

2. Don’tdothat.

Instead,arguethatallcomparison-basedsorting

algorithmsgiverisetoadecisiontree.

Thenanalyzedecisiontrees.

Page 11: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Decisiontrees

Sortthesethreethings. ?≤

YESNO

YES

?

NO

≤ ?YES NO

etc…

Page 12: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Allcomparison-basedalgorithmslooklikethis

Example:Sortthese

threethingsusing

QuickSort.

NO

?

YES

L RRL

≤ ?NOYES

L RL RReturn ≤

NOYES

?Thenwe’redone

(aftersomebase-

casestuff)

Now

recurse

onR

Pivot!

L RL R

Pivot!

Return ReturnIneithercase,we’redone

(aftersomebasecasestuffand

returningrecursivecalls).

etc...

Page 13: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Allcomparison-basedalgorithmshaveanassociateddecisiontree.

YES NO

?

??

YESNOYES

NO

????

Theleavesofthis

treeareallpossible

orderingsofthe

items:whenwe

reachaleafwe

returnit.

Whatdoesthedecision

treeforMERGESORTING

fourelementslooklike?

Olliethe

over-achievingostrich

Runningthealgorithmonagiven

inputcorrespondstotakinga

particularpaththroughthetree.

Page 14: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

What’stheruntimeonaparticularinput?

YES NO

?

??

YESNOYES

NO

????

Ifwetakethispaththrough

thetree,theruntimeis

Ω(lengthofthepath).

Atleastthenumber

ofcomparisonsthat

aremadeonthat

input.

Page 15: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

What’stheworst-case runtime?

YES NO

?

??

YESNOYES

NO

????

AtleastΩ(lengthofthelongestpath).

Page 16: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Howlongisthelongestpath?

YESNO

?

??YES

NOYESNO

????

• Thisisabinarytreewithat

least_____leaves.

• Theshallowesttreewithn!

leavesisthecompletely

balancedone,whichhas

depth______.

• Soinallsuchtrees,the

longestpathisatleastlog(n!).

n!

log(n!)

• n!isabout(n/e)n (Stirling’s formula).

• log(n!)isaboutnlog(n/e)=Ω(nlog(n)).

Conclusion:thelongestpath

haslengthatleastΩ(nlog(n)).

beingsloppyabout

floorsandceilings!

Wewantastatement:inallsuchtrees,

thelongestpathisatleast_____

Page 17: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

LowerboundofΩ(nlog(n)).• Theorem:• Anydeterministiccomparison-basedsortingalgorithm musttakeΩ(nlog(n))steps.

• Proof:• Anydeterministiccomparison-basedalgorithmcanberepresentedasadecisiontreewithn!leaves.

• Theworst-caserunningtimeisatleastthedepthofthedecisiontree.

• Alldecisiontreeswithn!leaveshavedepthΩ(nlog(n)).

• Soanycomparison-basedsortingalgorithmmusthaveworst-caserunningtimeatleast Ω(nlog(n)).

Page 18: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

\end{Aside}

• Forexample,QuickSort?

• Theorem:

• Anyrandomized comparison-basedsortingalgorithmmusttakeΩ(nlog(n))stepsinexpectation.

• Proof:

• attheendoftodayiftime

• otherwiseseelecturenotes

• (sameideasasdeterministiccase)

Trytoprovethis

yourself!

We’llseethisatthe

endoftoday’slecture

ifthere’stime.

Ollietheover-achievingostrich

Aside:Whataboutrandomizedalgorithms?

Page 19: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

ButwhataboutStickSort?

• Thisisoneofthecoolthingsaboutlowerboundslikethis:weknowwhenwecandeclarevictory!

So,MergeSort isoptimal!

• StickSort can’tbeimplementedasacomparison-basedsortingalgorithm.Sotheselowerboundsdon’tapply.

• ButStickSort waskindofdumb.EspeciallyifIhave

tospendtime

cuttingallthose

stickstobethe

rightsize!Butmighttherebeanothermodelofcomputationthat’slessdumb,inwhichwecansortfaster?

Page 20: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Beyondcomparison-basedsortingalgorithms

Page 21: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Anothermodelofcomputation

• Theitemsyouaresortinghavemeaningfulvalues.

9 6 3 5 2 1 2

insteadof

Page 22: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Whymightthishelp?

BucketSort: 9 6 3 5 2 1 2

1 2 3 4 5 6 7 8 9

963 521

2

SORTED!IntimeO(n).

Implementthebuckets as

linkedlists.Theyarefirst-in,

first-out.

Concatenate

thebuckets!

Note:thisisasimplificationof

whatCLRScalls“BucketSort”

Page 23: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Issues

• Needtobeabletoknowwhatbuckettoputsomethingin.

• That’sokayfornow:it’spartofthemodel.

• Needtoknowwhatvaluesmightshowupaheadoftime.

2 12345 13 21000 50 100000000 1

Page 24: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Onesolution:RadixSort• Idea:BucketSort ontheleast-significantdigitfirst,thenthenextleast-significant,andsoon.

1 2 3 4 5 6 7 8 9

21 345 13 101 50 234 1

0

345

50 1321

101

1

234

Step1:BucketSort onLSB:

50 21 101 1 13 234 345

Page 25: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Step2:BucketSort onthe2nd digit

1 2 3 4 5 6 7 8 90

50 21 101 1 13 234 345

502113101

234

1 345

101 1 13 21 234 345 50

Page 26: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Step3:BucketSort onthe3rd digit

1 2 3 4 5 6 7 8 90

50

21

13

101

234

1

345

1 13 21 50 101 234 345

101 1 13 21 234 345 50

Itworked!!

Page 27: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Whydoesthiswork?

21 345 13 101 50 234 1

50 21 101 1 13 234 345

1 13 21 50 101 234 345

101 1 13 21 234 345 50

Originalarray:

Nextarrayissortedbythefirstdigit.

Nextarrayissortedbythefirsttwodigits.

Nextarrayissortedbyallthreedigits.

Sortedarray

50 21 101 1 13 234 345

101 01 13 21 234 345 50

001 013 021 050 101 234 345

Page 28: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Formally…

• Argumentvialoopinvariant(akainduction).

• LoopInvariant:

Ollietheover-achievingostrichLuckythelackadaisicallemur

Thisisthe

outline ofa

proof,nota

formalproof.

Makethis

formal!(orsee

lecturenotes).

Page 29: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Whydoesthiswork?

21 345 13 101 50 234 1

50 21 101 1 13 234 345

1 13 21 50 101 234 345

101 1 13 21 234 345 50

Originalarray:

Nextarrayissortedbythefirstdigit.

Nextarrayissortedbythefirsttwodigits.

Nextarrayissortedbyallthreedigits.

Sortedarray

50 21 101 1 13 234 345

101 01 13 21 234 345 50

001 013 021 050 101 234 345

Page 30: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Formally…

• Argumentvialoopinvariant(akainduction).

• LoopInvariant:

• Afterthek’th iteration,thearrayissortedbythefirstkleast-significantdigits.

• Basecase:

• “Sortedby0least-significantdigits”meansnotsorted.

• Inductivestep:

• (Youfillin…)

• Termination:

• Afterthed’th iteration,thearrayissortedbythedleast-significantdigits.Aka,it’ssorted.

Ollietheover-achievingostrichLuckythelackadaisicallemur

Thisisthe

outline ofa

proof,nota

formalproof.

Makethis

formal!(orsee

lecturenotes).

Pluckythepedanticpenguin

Thisneedstouse:(1)bucketsort

works,and(2)wetreateachbucket

asaFIFOqueue.

Page 31: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Whatistherunningtime?

• Dependsonhowmanydigitsthebiggestnumberhas.• Sayd-digitnumbers.

• Therearediterations

• EachiterationtakestimeO(n+10)• Wecanchangethe10intoan“r:”thisisthe“radix”

• Example:ifr=2,wewriteeverythinginbinaryandonlyhavetwobuckets.

• Example:Ifr=10000000,wewriteeverythingbase-10000000andhave10000000buckets.

• Example:ifr=n,wewriteeverythinginbase-nandhavenbuckets.

• TimeisO(d(n+r)).

• Ifd=O(1)andr=O(n),runningtimeO(n).

SothisisaO(n)-timesortingalgorithm!

Howbigcanthe

biggestnumberbeif

d=O(1)andr=n?

Page 32: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Thestorysofar

• Ifweuseacomparison-basedsortingalgorithm,itMUSTrunintimeΩ(nlog(n)).

• Ifweassumethatwecandoalittlemorethancomparethevalues,wehaveanO(n)-time sortingalgorithm.

9 6 3 5 2 1 2

Whywouldweeverusea

comparison-basedsortingalgorithm??

Page 33: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Whywouldweeveruseacomparison-basedsortingalgorithm?

• dmightnotbe“constant.”(aka,itmightbebig)

• RadixSort needsextramemoryforthebuckets.• Notin-place

• Iwanttosortemojibytalkingtoagenie.• RadixSort makesmoreassumptionsontheinput.

�123456

987654 � 140! 2.1234123 2n 42

• Wecancomparetheseprettyquickly(justlookatthemost-significantdigit):

• � =3.14….

• e=2.78….

• ButtodoRadixSort we’dhavetolookateverydigit.

• Thisisespeciallyproblematicsincebothofthesehaveinfinitelymanydigits...

Page 34: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Dowehavetimeforthelowerboundonrandomizedalgorithms?

Page 35: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Ifso…

• Recallthelowerboundforadeterministicalgorithm.

YESNO

?

??YES

NOYESNO

????

• Thelongestpathinthis

treehaslength

Ω(nlog(n)).

• Therunningtimeofthe

algorithmisatleastthe

lengthofthispath.

Page 36: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Adifferentmodel

• Howaboutadeterministicalgorithmonarandominput?

• Notworst-casemodel.

• Notourrandomizedalgorithmmodeleither.

YESNO

?

??YES

NOYESNO

????

• Thelongestpathinthis

treehaslength

Ω(nlog(n)).

• Therunningtimeofthe

algorithmisatleastthe

lengthofthispath.

average

average

average

SoadeterministicalgorithmmusttaketimeΩ(nlog(n))evenonrandominputs.

Page 37: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Thisisaprettystrongstatement!

• Before:

Ifanadversary

getstopickthe

input,weneed

timeΩ(nlog(n)).

• Now:

Iftheinputis

chosenrandomly,

westill needtime

Ω(nlog(n)).

Page 38: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Butwhatdoesthatmodelhavetodowithanything?

Deterministic

algorithmon

randominput

Randomized

algorithmon

worst-case input

Itturnsoutthatinthiscase,

Adeterministicalgorithm

musttaketimeΩ(nlog(n))

evenonrandominputs.

Arandomizedalgorithm

musttaketimeΩ(nlog(n))

onworst-caseinputs.

Thisiswhatwewanted.

Randomized

algorithmon

random input

doesat

leastas

wellas

doesat

leastas

wellas

Andwejustshowedthat

thisdidn’tdoverywell.

Theargumenthereis

prettysubtle!Understand

whyitmakessense!

Page 39: Lecture 6...The algorithm’s job is to output a correctly sorted list of all the objects. is shorthand for “the first thing in the input list” All the sorting algorithms we have

Recap

• Binarysearchtrees

NextTime

• Howdifficultaproblemisdependsonthemodelofcomputation.

• Howreasonableamodelofcomputationisisupfordebate.

• StickSort cansortsticksinO(1)time.

• RadixSort cansortsmallishintegersinO(n) time.

• Ifwewanttosortemoji(orarbitrary-precisionnumbers),werequireΩ(nlog(n)) time(likeMergeSort).