![Page 1: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/1.jpg)
CS 2230 CS II: Data structures
Meeting 29: Hashing
Brandon Myers
University of Iowa
https://en.wikipedia.org/wiki/Hash_function
![Page 2: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/2.jpg)
Today’s learning objectives
• Identify various data structures to implement a Set
• Calculate the memory usage of hashing data structures
• Execute the Set methods for various hash set implementations, including when there are collisions
• Identify important properties of hash codes
![Page 3: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/3.jpg)
Roadmap
• Propose to represent a set of integers as an array of booleans so that we can search in O(1) time
• Wow, that's fast! But it has the problem that it requires too much memory!
• Reduce the size of the array, but now elements collide on the same index
• Deal with collisions with a variety of methods ("chaining", "probing")
• Represent sets of any object by using a hash function to turn the object into an integer
![Page 4: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/4.jpg)
What are ways we can represent a Set of integers?
[Figure: an example set containing 10, 14, 21, 2, 4, 7]
https://b.socrative.com/login/student/
room CS2230X: IDs 1000-2999; room CS2230Y: IDs 3000+
Identify various data structures to implement a Set
![Page 5: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/5.jpg)
| Data structure | how to search for a specific value | if you know where it is stored (e.g., index or reference) |
| --- | --- | --- |
| unsorted array of integers | search from start until we find it | go to the index |

[Figure: unsorted array 10 14 4 15 7 21 at indices 0-5, illustrating find(4) and get(2)]
![Page 6: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/6.jpg)
| Data structure | how to search for a specific value | if you know where it is stored (e.g., index or reference) |
| --- | --- | --- |
| unsorted array of integers | search from start until we find it | go to the index |
| sorted array of integers | binary search | go to the index |

[Figure: unsorted array 10 14 4 15 7 21 at indices 0-5, illustrating find(4) and get(2)]
[Figure: sorted array 4 7 10 14 15 21 at indices 0-5, illustrating find(7) and get(1)]
![Page 7: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/7.jpg)
| Data structure | how to search for a specific value | if you know where it is stored (e.g., index or reference) |
| --- | --- | --- |
| unsorted array of integers | search from start until we find it | go to the index |
| sorted array of integers | binary search | go to the index |
| binary search tree of integers | search from root | go to the node |

[Figure: unsorted array 10 14 4 15 7 21 at indices 0-5, illustrating find(4) and get(2)]
[Figure: sorted array 4 7 10 14 15 21 at indices 0-5, illustrating find(7) and get(1)]
[Figure: binary search tree of integers, illustrating find(4)]
![Page 8: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/8.jpg)
| Data structure | how to search for a specific value | if you know where it is stored (e.g., index or reference) |
| --- | --- | --- |
| unsorted array of integers | search from start until we find it | go to the index |
| sorted array of integers | binary search | go to the index |
| binary search tree of integers | search from root | go to the node |
| huge array of booleans (true means the value is in the set) | use the value as an index | same as search |

[Figure: unsorted array 10 14 4 15 7 21 at indices 0-5, illustrating find(4) and get(2)]
[Figure: sorted array 4 7 10 14 15 21 at indices 0-5, illustrating find(7) and get(1)]
[Figure: binary search tree of integers, illustrating find(4)]
[Figure: boolean array F T T F ... F at indices 0, 1, 2, 3, ..., Integer.MAX_VALUE, illustrating find(3)]
![Page 9: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/9.jpg)
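This data structure is great! Find any value in O(1) time!
Problems?
boolean[] set = new boolean[Integer.MAX_VALUE]; // conceptually one slot per value (Integer.MAX_VALUE + 1 would overflow int)
set[1] = true; // add 1
set[2] = true; // add 2
[Figure: boolean array F T T F ... F at indices 0, 1, 2, 3, ..., Integer.MAX_VALUE, illustrating find(3)]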
![Page 10: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/10.jpg)
Calculate the memory usage of hashing data structures
Let $M(n)$ be the amount of memory this Set uses, where $n$ = number of elements in the Set. Which is true and the best bound?
a) $M(n) \in O(1)$
b) $M(n) \in O(\log n)$
c) $M(n) \in O(n)$
d) $M(n) \in O(\texttt{Integer.MAX\_VALUE})$
e) $M(n) \in O(n \cdot \texttt{Integer.MAX\_VALUE})$
[Figure: boolean array F T T F ... F at indices 0, 1, 2, 3, ..., Integer.MAX_VALUE]
![Page 11: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/11.jpg)
For example...
• Integer.MAX_VALUE = $2^{31} - 1$
• the boolean data type is 1 to 2 bytes
• $(2^{31} - 1) \times 2$ bytes ≈ 4 GB even if your set is nearly empty!
If you are clever and represent each boolean as 1 bit (0 = false, 1 = true), then you can get down to 268 MB.
Even if 268 MB fits in your computer’s RAM, reality bites you: if your elements are uniformly randomly distributed across those 268 MB, then the elements of your set won’t all be in your computer’s fast cache memory, which has a capacity in the 100s of KB (take CS:2630 to learn more!)
[Figure: boolean array F T T F ... F at indices 0, 1, 2, 3, ..., Integer.MAX_VALUE]
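The memory arithmetic above can be checked with a few lines of Java. This is only a sketch: the class and method names are invented for illustration, and it counts $2^{31}$ slots (indices 0 through Integer.MAX_VALUE), one more than the slide's $2^{31} - 1$, which does not change the rounded totals.

```java
class DirectAddressMemory {
    // Slots needed for indices 0 .. Integer.MAX_VALUE; computed as a long to avoid int overflow.
    static long slots() {
        return (long) Integer.MAX_VALUE + 1L;
    }

    // Assumes the upper end of the "1 to 2 bytes per boolean" estimate.
    static long bytesAtTwoBytesPerBoolean() {
        return slots() * 2L;
    }

    // Assumes each boolean is packed into a single bit.
    static long bytesAtOneBitPerBoolean() {
        return slots() / 8L;
    }

    public static void main(String[] args) {
        System.out.println(bytesAtTwoBytesPerBoolean() + " bytes at 2 bytes per boolean");
        System.out.println(bytesAtOneBitPerBoolean() + " bytes at 1 bit per boolean");
    }
}
```

The two results, 4,294,967,296 bytes and 268,435,456 bytes, correspond to the "~4 GB" and "268 MB" figures.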
![Page 12: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/12.jpg)
Fixing the memory problem
Limit the array to a smaller capacity, say 6
index:  0 1 2 3 4 5
start:  F F F F F F
add(2): F F T F F F
add(7): F T T F F F
how to add($i$): mark true at index $i \bmod \text{capacity}$
(bonus: we can also store negative integers now)
a new problem! It looks like 1 is in the set (and 13, 19, 25, ...) even though we only added 7
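A minimal sketch of this reduced-capacity boolean set, to make the false-positive problem concrete. The class name `ModBooleanSet` is hypothetical. One assumption worth flagging: Java's `%` operator can return a negative result for a negative operand, so the sketch uses `Math.floorMod` to keep the index in range when storing negative integers.

```java
class ModBooleanSet {
    private final boolean[] slots;

    ModBooleanSet(int capacity) {
        slots = new boolean[capacity];
    }

    // floorMod always lands in 0 .. capacity-1, even for negative values.
    private int indexOf(int value) {
        return Math.floorMod(value, slots.length);
    }

    void add(int value) {
        slots[indexOf(value)] = true;
    }

    // Reports true for ANY value that shares an index with an added value.
    boolean contains(int value) {
        return slots[indexOf(value)];
    }

    public static void main(String[] args) {
        ModBooleanSet set = new ModBooleanSet(6);
        set.add(2);
        set.add(7);
        System.out.println(set.contains(7));  // was added
        System.out.println(set.contains(1));  // never added, but shares index 1 with 7
    }
}
```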
![Page 13: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/13.jpg)
index:  0 1 2 3 4 5
add(2): F F T F F F
add(7): F T T F F F
a new problem! It looks like 1 is in the set (and 13, 19, 25, ...) even though we only added 7
Since many values (1, 7, 13, 19, 25, ...) map to index 1, we need to keep track of which key is stored there
We’ll have the array store Integers, where null means the bucket is empty and a non-null value is the key stored there
index: 0    1 2 3    4    5
value: null 7 2 null null null
Integer[] set = new Integer[6]; // capacity = 6
set[2 % 6] = 2; // add(2)
set[7 % 6] = 7; // add(7)
% means mod
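Storing the key itself lets a lookup tell 7 apart from 1 or 13. The sketch below is an illustration only: `KeyedSlotSet` is a made-up name, and it deliberately has no collision handling yet, so a second key that lands on an occupied bucket simply overwrites the first.

```java
class KeyedSlotSet {
    private final Integer[] slots; // null means the bucket is empty

    KeyedSlotSet(int capacity) {
        slots = new Integer[capacity];
    }

    private int indexOf(int key) {
        return Math.floorMod(key, slots.length);
    }

    // Simplification: overwrites whatever key was already in the bucket.
    void add(int key) {
        slots[indexOf(key)] = key;
    }

    // Only reports true when the bucket holds exactly this key.
    boolean contains(int key) {
        Integer stored = slots[indexOf(key)];
        return stored != null && stored == key;
    }
}
```

With capacity 6, after `add(2)` and `add(7)`, `contains(7)` is true while `contains(1)` and `contains(13)` are false.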
![Page 14: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/14.jpg)
Execute the Set methods for various hash set implementations
index: 0    1    2    3    4    5    6
value: null null null null null null null
Suppose our set is initially empty as above. What will it look like after the following elements are added? -1, 19, 17, 21, and 8
• 21 8 null 17 null 19 -1
• -1 19 17 21 8 null null
• -1 8 17 19 21 null null
• null 19 8 21 null 17 -1
![Page 15: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/15.jpg)
Learning objectives for Final project
• Design, implement, and test an application based on a written specification
• Choose appropriate ADTs and efficient data structures for various tasks
• Use version control to collaborate on a coding project with another person
![Page 16: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/16.jpg)
Final project: Semantic similarity of words
_3 May. Bistritz._--Left Munich at 8:35 P. M., on 1st May, arriving at Vienna early next morning; should have arrived at 6:46, but train was an hour late. Buda-Pesth seems a wonderful place, from the glimpse which I got of it from the train and the little I could walk through the streets....
3 similar words to “time”?
come, 0.6202651310028829
sleep, 0.613304123466795
time, 0.6082294707042364
![Page 17: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/17.jpg)
I am a sick man. I am a spiteful man. I am an unattractive man. I believe my liver is diseased. However, I know nothing at all about my disease, and do not know for certain what ails me.
The word “man” appears in the first three sentences. Its semantic descriptor vector would be:
[I=3, am=3, a=2, sick=1, man=0, spiteful=1, an=1, unattractive=1, believe=0, my=0, liver=0, is=0, diseased=0, However=0, know=0, nothing=0, at=0, all=0, about=0, disease=0, and=0, do=0, not=0, for=0, certain=0, what=0, ails=0, me=0]
The word “liver” occurs in the fourth sentence, so its semantic descriptor vector is:
[I=1, am=0, a=0, sick=0, man=0, spiteful=0, an=0, unattractive=0, believe=1, my=1, liver=0, is=1, diseased=1, However=0, know=0, nothing=0, at=0, all=0, about=0, disease=0, and=0, do=0, not=0, for=0, certain=0, what=0, ails=0, me=0]
our definition of semantic meaning: the number of times a word appears with other words in the same sentence. Each word becomes a vector.
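One way to build such vectors is a map from each word to a map of co-occurrence counts. The sketch below is not the project's specified design: `SemanticDescriptors` and `build` are hypothetical names, sentences are assumed to be already tokenized, case is left untouched, and zero entries are simply absent from the map rather than stored.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SemanticDescriptors {
    // For each word, count how often every other word appears in a sentence with it.
    static Map<String, Map<String, Integer>> build(List<List<String>> sentences) {
        Map<String, Map<String, Integer>> descriptors = new HashMap<>();
        for (List<String> sentence : sentences) {
            // Visit each distinct word of the sentence once as the "center" word.
            Set<String> distinct = new HashSet<>(sentence);
            for (String word : distinct) {
                Map<String, Integer> vector =
                        descriptors.computeIfAbsent(word, k -> new HashMap<>());
                for (String other : sentence) {
                    if (!other.equals(word)) {
                        vector.merge(other, 1, Integer::sum);
                    }
                }
            }
        }
        return descriptors;
    }
}
```

Using hash maps here keeps each vector sparse: only words that actually co-occur take up space.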
![Page 18: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/18.jpg)
various measures of similarity of two vectors
• cosine similarity
• negative Euclidean distance
• negative Euclidean distance of norms
use these measures to answer queries about the words in a text
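As an illustration of the first two measures, here is a sketch over sparse vectors stored as maps, where a missing entry counts as 0. The class and method names are assumptions, not part of the project specification, and the third measure is left out because its exact definition is not given here.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class VectorSimilarity {
    // Dot product; entries missing from v are treated as 0.
    static double dot(Map<String, Integer> u, Map<String, Integer> v) {
        double sum = 0.0;
        for (Map.Entry<String, Integer> entry : u.entrySet()) {
            sum += entry.getValue() * v.getOrDefault(entry.getKey(), 0);
        }
        return sum;
    }

    static double norm(Map<String, Integer> u) {
        return Math.sqrt(dot(u, u));
    }

    // Cosine similarity; the result is NaN if either vector is all zeros.
    static double cosine(Map<String, Integer> u, Map<String, Integer> v) {
        return dot(u, v) / (norm(u) * norm(v));
    }

    // Negated so that, like cosine similarity, larger means more similar.
    static double negativeEuclidean(Map<String, Integer> u, Map<String, Integer> v) {
        Set<String> keys = new HashSet<>(u.keySet());
        keys.addAll(v.keySet());
        double sumOfSquares = 0.0;
        for (String key : keys) {
            double diff = u.getOrDefault(key, 0) - v.getOrDefault(key, 0);
            sumOfSquares += diff * diff;
        }
        return -Math.sqrt(sumOfSquares);
    }
}
```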
![Page 19: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/19.jpg)
Project due dates
• Nov 17, 11:59pm: Milestone 1 in GitHub (no slip days)
  • Finished Part 1
  • PROGRESS_REPORT_NOV17.txt
• Nov 29, 11:59pm: Milestone 2 in GitHub (no slip days)
  • Finished Part 3, and initial draft of Part 4's written answers
  • PROGRESS_REPORT_NOV29.txt
• Dec 6, 11:59pm: Final version in GitHub (up to 2 slip days if at least 1 partner has them)
  • Finished all Parts
![Page 20: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/20.jpg)
Collisions!
index: 0    1 2 3    4    5
value: null 7 2 null null null
add(13) // 13 % 6 = 1
uh oh...
![Page 21: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/21.jpg)
You know that feeling... when someone takes your parking spot
![Page 22: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/22.jpg)
Dealing with collisions
index: 0    1 2 3    4    5
value: null 7 2 null null null
add(13) // 13 % 6 = 1
Linear probing: go to the next spot until you find an opening
index: 0    1 2 3  4    5
value: null 7 2 13 null null
Chaining: each bucket is a linked list of elements stored there
index: 0    1        2   3    4    5
value: null [7 → 13] [2] null null null
...and other techniques!
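The linear probing rule can be sketched as follows. This is a minimal illustration with a fixed capacity and no removal; `LinearProbingSet` and `snapshot` are hypothetical names.

```java
class LinearProbingSet {
    private final Integer[] slots; // null means the bucket is empty
    private int size;

    LinearProbingSet(int capacity) {
        slots = new Integer[capacity];
    }

    boolean contains(int key) {
        int i = Math.floorMod(key, slots.length);
        // Walk forward from the home bucket until an empty bucket ends the run.
        for (int probes = 0; probes < slots.length && slots[i] != null; probes++) {
            if (slots[i] == key) {
                return true;
            }
            i = (i + 1) % slots.length;
        }
        return false;
    }

    boolean add(int key) {
        if (contains(key)) {
            return false;
        }
        if (size == slots.length) {
            throw new IllegalStateException("table is full");
        }
        int i = Math.floorMod(key, slots.length);
        while (slots[i] != null) {
            i = (i + 1) % slots.length; // go to the next spot, wrapping around
        }
        slots[i] = key;
        size++;
        return true;
    }

    // Copy of the underlying array, for inspecting where keys ended up.
    Integer[] snapshot() {
        return slots.clone();
    }
}
```

With capacity 6, adding 7, 2, and then 13 leaves 13 in bucket 3: its home bucket 1 holds 7 and the next bucket holds 2.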
![Page 23: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/23.jpg)
Execute the Set methods for various hash set implementations, including when there are collisions
index: 0    1    2    3    4    5    6
value: null null null null null null null
Suppose our set is initially empty as above. What will it look like after the following elements are added, assuming we use linear probing? 9, 18, 23, 17
• null null 9 null 18 null null
• null null 9 23 18 17 null
• null null 23 17 4 null null
• null null 9 18 23 17 null
![Page 24: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/24.jpg)
How should we implement remove() if we are using linear probing? (e.g., remove(7))
a) set the bucket to null
b) Remove the element and move all elements after it left by one space
c) Move all elements after it (up to the next null) left by one space
d) leave a special marker in the bucket that means it is deleted
e) there is no good way to allow remove()
index: 0    1 2 3  4    5
value: null 7 2 13 null null
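To experiment with option (d), here is a sketch of a probing set that leaves a marker behind on removal, a technique often called lazy deletion or a tombstone. The names `TombstoneProbingSet` and `DELETED` are made up for this example. Lookups walk past the marker, so keys that probed beyond the removed key stay reachable, and `add` is free to reuse a marked bucket.

```java
class TombstoneProbingSet {
    private static final Object DELETED = new Object(); // hypothetical "deleted" marker
    private final Object[] slots; // null = never used, DELETED = removed, else an Integer key

    TombstoneProbingSet(int capacity) {
        slots = new Object[capacity];
    }

    private int home(int key) {
        return Math.floorMod(key, slots.length);
    }

    // Index of the key, or -1. Probing stops only at a never-used (null) bucket.
    private int find(int key) {
        int i = home(key);
        for (int probes = 0; probes < slots.length && slots[i] != null; probes++) {
            if (slots[i] != DELETED && (Integer) slots[i] == key) {
                return i;
            }
            i = (i + 1) % slots.length;
        }
        return -1;
    }

    boolean contains(int key) {
        return find(key) >= 0;
    }

    boolean add(int key) {
        if (find(key) >= 0) {
            return false;
        }
        int i = home(key);
        for (int probes = 0; probes < slots.length; probes++) {
            if (slots[i] == null || slots[i] == DELETED) {
                slots[i] = key; // empty or previously deleted bucket can be reused
                return true;
            }
            i = (i + 1) % slots.length;
        }
        throw new IllegalStateException("table is full");
    }

    boolean remove(int key) {
        int i = find(key);
        if (i < 0) {
            return false;
        }
        slots[i] = DELETED;
        return true;
    }
}
```

In the table above, nulling bucket 1 on remove(7) would make a later lookup of 13 stop at bucket 1 and miss it; with the marker, the probe continues to bucket 3.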
![Page 25: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/25.jpg)
![Page 26: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/26.jpg)
The hash set in your smartphone’s processor that you didn’t know about
all your data (e.g., running programs, the OS, and their data)
the cache stores a subset of your data
it is small but that makes it fast!
![Page 27: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/27.jpg)
The hash set in your smartphone’s processor that you didn’t know about
The cache is basically a hash set
here’s one where each bucket can only hold 1 key
[Figure labels: the key; the hash function is "take some of the bits of the memory address"; compare the key with the key in the bucket]
![Page 28: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/28.jpg)
The hash set in your smartphone’s processor that you didn’t know about
The cache is basically a hash set
here’s one where each bucket can hold up to 4 keys
[Figure labels: the key; the hash function is "take some of the bits of the memory address"; compare the key with the keys in the bucket]
![Page 29: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/29.jpg)
Putting non-integers into a set
String[] set = new String[capacity];
set[???] = "Cat";
Where should we put the string “Cat”?
![Page 30: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/30.jpg)
Putting non-integers into a set
String[] set = new String[capacity];
set[???] = "Cat";
Where should we put the string “Cat”?
use a hash function
a hash function is just any function that turns an object into an integer
![Page 31: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/31.jpg)
index: 0    1    2    3    4    5    6
value: null null null null null null null
Suppose the hash function for a string is the length
What are the contents after inserting “Cat”, “Dog”, “Froggy”? Assume we use linear probing.
a) null null null “Cat” “Dog” “Froggy” null
b) “Froggy” null null “Cat” “Dog” null null
c) “Cat” “Dog” “Froggy” null null null null
d) null null null “Cat” “Dog” null “Froggy”
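To try a hash function such as "the length of the string" together with linear probing, the set can take the hash function as a parameter. This is a sketch under assumptions: `HashedProbingSet` is an invented name, the capacity is fixed, and there is no removal.

```java
import java.util.Arrays;
import java.util.function.ToIntFunction;

class HashedProbingSet<T> {
    private final Object[] slots; // null means the bucket is empty
    private final ToIntFunction<T> hashFunction;

    HashedProbingSet(int capacity, ToIntFunction<T> hashFunction) {
        this.slots = new Object[capacity];
        this.hashFunction = hashFunction;
    }

    boolean add(T item) {
        // Turn the object into an integer, then into a bucket index.
        int i = Math.floorMod(hashFunction.applyAsInt(item), slots.length);
        for (int probes = 0; probes < slots.length; probes++) {
            if (slots[i] == null) {
                slots[i] = item;
                return true;
            }
            if (slots[i].equals(item)) {
                return false; // already present
            }
            i = (i + 1) % slots.length; // linear probing
        }
        throw new IllegalStateException("table is full");
    }

    @Override
    public String toString() {
        return Arrays.toString(slots);
    }
}
```

For example, `new HashedProbingSet<String>(7, String::length)` uses string length as the hash function; printing the set after a few `add` calls shows which bucket each string landed in.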
![Page 32: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/32.jpg)
for example, Oracle Java distribution’s hash function for Strings
a hash function is just any function that turns an object into an integer

    public class String {
        // a string is stored as an array of "chars" (characters)
        private final char value[];

        // cached hash code; 0 means it has not been computed yet
        private int hash;

        // hash function for String
        public int hashCode() {
            int h = hash;
            if (h == 0 && value.length > 0) {
                char val[] = value;

                for (int i = 0; i < value.length; i++) {
                    h = 31 * h + val[i];
                }
                hash = h;
            }
            return h;
        }
    }
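The loop above can be reproduced outside the `String` class to see the numbers it generates. This sketch uses a hypothetical helper, `polyHash`, that repeats the same `31 * h + c` recurrence without the caching field.

```java
class PolynomialHashDemo {
    // Same recurrence as String.hashCode, minus the cached value.
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // 'C' = 67, 'a' = 97, 't' = 116: (67 * 31 + 97) * 31 + 116 = 67510
        System.out.println(polyHash("Cat"));
        System.out.println("Cat".hashCode());
    }
}
```

To pick a bucket, this integer would still be reduced with mod capacity, as with the integer sets earlier.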
![Page 33: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/33.jpg)
A paraphrase of the Object.hashCode specification in the Java API
• The general contract of hashCode is:
  • during the same run of your program, hashCode on a specific object must always return the same result
  • o1.equals(o2) ⇒ o1.hashCode() == o2.hashCode()
• Important to know that o1.hashCode() == o2.hashCode() DOES NOT IMPLY o1.equals(o2), i.e., it is okay for two different objects to have the same hashCode (and pretty much impossible to avoid)
https://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#hashCode()
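A small sketch of a class that keeps `equals` and `hashCode` consistent. The `Cat` class and its `name` and `age` fields are invented for illustration; `Objects.hash` is one convenient way to combine fields, not the only one.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class Cat {
    private final String name; // hypothetical fields, chosen for the example
    private final int age;

    Cat(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Cat)) {
            return false;
        }
        Cat other = (Cat) o;
        return age == other.age && name.equals(other.name);
    }

    // Built from the same fields as equals, so equal cats get equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(name, age);
    }

    public static void main(String[] args) {
        Set<Cat> cats = new HashSet<>();
        cats.add(new Cat("Tom", 3));
        System.out.println(cats.contains(new Cat("Tom", 3)));
        // Different strings can share a hash code: "Aa" and "BB" both hash to 2112.
        System.out.println("Aa".hashCode() == "BB".hashCode());
    }
}
```

If `hashCode` were not overridden, the two equal `Cat` objects would very likely land in different buckets and the `HashSet` lookup would miss.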
![Page 34: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/34.jpg)
If I have the following code
Map<Cat, Dog> x = new HashMap<Cat,Dog>();
which of the following statements is true?
a) If you override Cat.equals you must override Cat.hashCode
b) You must override Dog.equals and Dog.hashCode
c) You must override Cat.equals, Cat.hashCode, Dog.equals, and Dog.hashCode
Identify important properties of hash codes
![Page 35: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/35.jpg)
Running time of successful find?
• Linear probing
  • expected length of a sequence of non-nulls: $O\left(1 + \frac{1}{1-\alpha}\right)$
  • where $\alpha = \frac{\#\text{occupied}}{\text{capacity}}$ ($\alpha$ is called the load factor)
  • worst case: $O(n)$ if the table is allowed to get nearly full (i.e., $\alpha$ is very close to 1)
index: 0    1 2 3  4    5
value: null 7 2 13 null null
Since the running time depends on $\alpha$, we should decrease it by growing the array when $\alpha$ becomes too large
Rule of thumb: if $\alpha$ increases beyond 0.5 or 0.75, grow the capacity
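The rule of thumb might be implemented along these lines. This is a sketch with assumed choices: the class name, the starting capacity of 4, the 0.5 threshold, and doubling on growth are all illustrative. Growing requires re-inserting every key, because each key's index depends on the capacity.

```java
class ResizingProbingSet {
    private static final double MAX_LOAD = 0.5; // assumed threshold from the rule of thumb
    private Integer[] slots = new Integer[4];   // assumed starting capacity
    private int size;

    int capacity() {
        return slots.length;
    }

    double loadFactor() {
        return (double) size / slots.length;
    }

    boolean contains(int key) {
        int i = Math.floorMod(key, slots.length);
        // Terminates: the load factor never exceeds 0.5, so an empty bucket always exists.
        while (slots[i] != null) {
            if (slots[i] == key) {
                return true;
            }
            i = (i + 1) % slots.length;
        }
        return false;
    }

    boolean add(int key) {
        if (contains(key)) {
            return false;
        }
        if ((double) (size + 1) / slots.length > MAX_LOAD) {
            grow();
        }
        insert(slots, key);
        size++;
        return true;
    }

    private static void insert(Integer[] table, int key) {
        int i = Math.floorMod(key, table.length);
        while (table[i] != null) {
            i = (i + 1) % table.length;
        }
        table[i] = key;
    }

    // Double the capacity and re-insert every key at its new index.
    private void grow() {
        Integer[] bigger = new Integer[slots.length * 2];
        for (Integer key : slots) {
            if (key != null) {
                insert(bigger, key);
            }
        }
        slots = bigger;
    }
}
```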
![Page 36: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/36.jpg)
Running time of successful find?
• Chaining
  • What is the expected length of the longest chain? What is the average length of a chain?
  • Of course, we want our hash function to distribute keys well (if everything hashes to a constant number of buckets, lookup time would be $O(n)$)
  • If you are lucky enough for the items to be uniformly distributed across buckets, then each chain would have expected length $\alpha$ (the number of elements divided by the number of buckets)
  • However, the birthday paradox (see Discrete Math) tells us that the probability of some collisions is high even if keys are drawn from a uniform distribution
  • Therefore, $\alpha$ should still be kept sufficiently smaller than 1
index: 0    1        2   3    4    5
value: null [7 → 13] [2] null null null
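A minimal chaining sketch matching the picture above, where each bucket is a linked list. The names `ChainingSet` and `chainLength` are hypothetical, and the capacity is fixed for simplicity.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ChainingSet {
    private final List<LinkedList<Integer>> buckets = new ArrayList<>();
    private int size;

    ChainingSet(int capacity) {
        for (int i = 0; i < capacity; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    private LinkedList<Integer> bucketFor(int key) {
        return buckets.get(Math.floorMod(key, buckets.size()));
    }

    // Only the one chain the key hashes to is searched.
    boolean contains(int key) {
        return bucketFor(key).contains(key);
    }

    boolean add(int key) {
        LinkedList<Integer> chain = bucketFor(key);
        if (chain.contains(key)) {
            return false;
        }
        chain.add(key); // colliding keys share the bucket's list
        size++;
        return true;
    }

    int chainLength(int bucket) {
        return buckets.get(bucket).size();
    }

    // Number of elements divided by number of buckets.
    double loadFactor() {
        return (double) size / buckets.size();
    }
}
```

With capacity 6 and keys 7, 13, and 2, bucket 1 has a chain of length 2, bucket 2 has length 1, and the load factor is 0.5.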
![Page 37: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/37.jpg)
Today’s learning objectives
• Identify various data structures to implement a Set
• Calculate the memory usage of hashing data structures
• Execute Set methods for various hash set implementations, including when there are collisions
• Identify important properties of hash codes
![Page 38: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/38.jpg)
Resources
Visualizations of probing and chaining hash tables!
http://www.cs.usfca.edu/~galles/visualization/OpenHash.html
http://www.cs.usfca.edu/~galles/visualization/ClosedHash.html
http://www.cs.usfca.edu/~galles/visualization/ClosedHashBucket.html
![Page 39: CS 2230 CS II: Data structureshomepage.cs.uiowa.edu/.../lecture-033-hashing.pdf · •Identify various data structures to implement a Set •Calculate the memory usage of hashing](https://reader034.vdocument.in/reader034/viewer/2022050122/5f5279c6e97a5d1ba800f1c9/html5/thumbnails/39.jpg)
Acknowledgements
Cache diagrams – Ferry24Milan