LECTURE 10: SEARCHING & MAPPING
CSC 213 – Large Scale Programming
Today’s Goal
Consider the basics of searchable data
  How do we search using a computer?
  What are our goals while searching?
  ADTs used for search & how would they work?
  Most critically, where the $&*#%$# are my keys?
How do Map & Dictionary ADT work and search?
  Methods to add, remove, and access data?
  How is a Sequence used to implement these?
  When & why would we use a Sequence-based approach?
Searching
Search for unknown data in most cases
  Consider the reverse: why search for what you have?
Seek data related to terms used
  Already have idea, want web pages containing terms
  Get encoded proteins given set of amino acids
  Given “borrowed” credit cards, get credit limits
Exacting, but boring, work doing these searches
  Makes this work ideal for computers & students
Map-Based Bartender
“I’ll have a Manhattan”
“No problem.”
  ¾ oz sweet vermouth
  2½ oz bourbon
  1 dash bitters
  1 maraschino cherry
  1 twist orange peel
“That’ll be $2 billion”
[Figure labels: key, value]
Search Terms
Key gets valuables
  We already have key
  Want value as a result of this
Map works similarly
  Give it key, value returned
  Uses Entry to do this work
Entry Interface
Need a key to get valuables
  key used to search – it is what we already have
  What we want is the result of search – value

interface Entry<K,V> {
  K key();
  V value();
}
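A minimal sketch of one way the Entry interface could be implemented. The class names SimpleEntry and EntryDemo are assumptions made for illustration; the lecture only gives the interface.

```java
class EntryDemo {
    public static void main(String[] args) {
        // The key is what we already have; the value is what we want back.
        Entry<String, Integer> e = new SimpleEntry<>("vermouth", 1);
        System.out.println(e.key() + " -> " + e.value());
    }
}

// Interface from the slide: a key paired with its value.
interface Entry<K, V> {
    K key();
    V value();
}

// Hypothetical implementation: an immutable key-value pair.
final class SimpleEntry<K, V> implements Entry<K, V> {
    private final K key;
    private final V value;

    SimpleEntry(K key, V value) {
        this.key = key;
        this.value = value;
    }

    @Override public K key() { return key; }
    @Override public V value() { return value; }
}
```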
Map Method Madness, Mmmm…
Describes a searchable Collection
  put(K key, V value) adds data as an Entry
  remove(K key) removes Entry containing key
  get(K key) returns value associated with key
Several Iterable methods are also defined
  Methods to use are entries(), keys(), & values()
  Iterates over expected data so can use in for(-each) loops
Also defines usual Collection methods isEmpty() & size()
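The methods listed above could be collected into an interface roughly as follows. This is only a sketch: the method names come from the slide, but the interface name SimpleMap, the class ListMap, and the choice to store Entrys in an unsorted java.util.ArrayList (standing in for the course's Sequence) are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class MapDemo {
    public static void main(String[] args) {
        SimpleMap<String, Integer> stock = new ListMap<>();
        stock.put("bourbon", 3);
        stock.put("bitters", 1);
        // keys() is Iterable, so it works in a for-each loop.
        for (String k : stock.keys()) {
            System.out.println(k + " -> " + stock.get(k));
        }
    }
}

interface Entry<K, V> { K key(); V value(); }

// Method names follow the slide; the name SimpleMap is hypothetical.
interface SimpleMap<K, V> {
    V put(K key, V value);
    V remove(K key);
    V get(K key);
    Iterable<Entry<K, V>> entries();
    Iterable<K> keys();
    Iterable<V> values();
    boolean isEmpty();
    int size();
}

// Hypothetical implementation backed by an unsorted list of Entrys.
class ListMap<K, V> implements SimpleMap<K, V> {
    private static final class Pair<K, V> implements Entry<K, V> {
        final K key;
        final V value;
        Pair(K key, V value) { this.key = key; this.value = value; }
        public K key() { return key; }
        public V value() { return value; }
    }

    private final List<Pair<K, V>> data = new ArrayList<>();

    // Linear scan for the key; returns -1 when it is absent.
    private int indexOf(K key) {
        for (int i = 0; i < data.size(); i++) {
            if (data.get(i).key.equals(key)) return i;
        }
        return -1;
    }

    public V put(K key, V value) {
        int i = indexOf(key);
        if (i < 0) { data.add(new Pair<>(key, value)); return null; }
        V old = data.get(i).value;
        data.set(i, new Pair<>(key, value));
        return old;
    }

    public V remove(K key) {
        int i = indexOf(key);
        return (i < 0) ? null : data.remove(i).value;
    }

    public V get(K key) {
        int i = indexOf(key);
        return (i < 0) ? null : data.get(i).value;
    }

    public Iterable<Entry<K, V>> entries() { return new ArrayList<Entry<K, V>>(data); }

    public Iterable<K> keys() {
        List<K> result = new ArrayList<>();
        for (Pair<K, V> p : data) result.add(p.key);
        return result;
    }

    public Iterable<V> values() {
        List<V> result = new ArrayList<>();
        for (Pair<K, V> p : data) result.add(p.value);
        return result;
    }

    public boolean isEmpty() { return data.isEmpty(); }
    public int size() { return data.size(); }
}
```

With an unsorted list every get, put, and remove scans the Entrys one by one, so each takes O(n) time in the worst case.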
Searching Through a Map
Map is a Collection of key-value pairs
  Give it key & get value in return from ADT
  Now we have ADT to work with searchable data
Many searches unsuccessful
  Unsuccessful search is normal, not unusual
  Expected events should NOT throw exceptions
  This is normal; return null when nothing found
At Most 1 Value Per Key
Entrys have unique keys in a Map
If key exists, put(key, value) replaces existing Entry
  Returns prior value for key in the Map so it's not lost
  If key was not in Map before the call, null is returned
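The same conventions can be observed in Java's own java.util.Map, which is not the course's Map ADT but behaves the same way on these points: put returns the prior value (or null), and get returns null for a missing key. The class name, helper method, and sample card number below are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class PutReplaceDemo {
    // Calls put twice on the same key and reports what each call returned.
    static Integer[] putTwice(Map<String, Integer> limits, String card, int first, int second) {
        Integer before = limits.put(card, first);     // key absent: returns null
        Integer replaced = limits.put(card, second);  // key present: returns prior value
        return new Integer[] { before, replaced };
    }

    public static void main(String[] args) {
        Map<String, Integer> limits = new HashMap<>();
        Integer[] r = putTwice(limits, "4111-0000", 500, 900);
        System.out.println(r[0] + ", " + r[1]);          // prints "null, 500"
        System.out.println(limits.get("4111-0000"));     // prints "900"
        System.out.println(limits.get("no-such-card"));  // prints "null", no exception
    }
}
```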
SEQUENCE-Based Map
SEQUENCE’s perspective of MAP that it holds
  [Figure labels: POSITIONs, elements]
Outside view of MAP and how it is stored
  [Figure labels: POSITIONs, ENTRYs]
Using a Map
Map is great when you want only one value for a key
  Credit card number goes to one account
  One person has a given social security number
  One definition per word in the dictionary
Could try associating multiple values per key
  Mapping key to Sequence of values is a possible solution
  But this means Map’s user must handle complexity
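A rough sketch of the workaround described above, using java.util.Map and java.util.List in place of the course's Map and Sequence (an assumption). The helper names addValue and valuesFor are hypothetical; the point is that this bookkeeping lives in the caller's code, not in the Map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MultiValueDemo {
    // Caller-side helper: the Map itself knows nothing about multiple values.
    static void addValue(Map<String, List<String>> map, String key, String value) {
        // Create the list the first time a key is seen, then append to it.
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Caller must also decide what "no values" means; here, an empty list.
    static List<String> valuesFor(Map<String, List<String>> map, String key) {
        List<String> found = map.get(key);
        return (found == null) ? new ArrayList<>() : found;
    }

    public static void main(String[] args) {
        Map<String, List<String>> meanings = new HashMap<>();
        addValue(meanings, "cool", "awesome");
        addValue(meanings, "cool", "chilly");
        System.out.println(valuesFor(meanings, "cool"));  // prints "[awesome, chilly]"
        System.out.println(valuesFor(meanings, "warm"));  // prints "[]"
    }
}
```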
Dictionary-based Bartender
“I’ll have a Manhattan”
“No problem.”
  [Figure labels: key, value]
“Not that Manhattan”
“Sorry. How about…”
  [Figure labels: key, another value]
“Mmmmm... Manhattan”
“That’ll be $2 billion”
Dictionary ADT
DICTIONARY ADT very similar to MAP
  Hold searchable data in each of these ADTs
  Both data structures are collections of Entrys
  Convert key to value using either concept
DICTIONARY can have multiple values for one key
  1 value for key is still legal option
  Also many Entrys with same key but different value
  [Figure labels: “awesome”, “cool”, “cool”]
Map VS. Dictionary
Map ADT
  Collection of Entrys
    key – searched for
    value – cared about
  Basic implementation: List w/ Entrys in increasing order of keys
  key in at most 1 Entry
Dictionary ADT
  Collection of Entrys
    key – searched for
    value – cared about
  Basic implementation: List w/ Entrys in increasing order of keys
  Entrys can share key
Changes for Dictionary
Map              Dictionary
V put(k,v)       Entry<K,V> put(k,v)
V remove(k)      Entry<K,V> remove(e)
V get(k)         Entry<K,V> get(k)
                 Iterable<Entry<K,V>> getAll(k)
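To make the changed signatures concrete, here is a sketch of a Dictionary kept in an unsorted list. The class name ListDictionary and the use of java.util.ArrayList are assumptions; only the method signatures come from the slide. Note that put never replaces anything, and remove takes an Entry because a key alone no longer identifies a single Entry.

```java
import java.util.ArrayList;
import java.util.List;

class DictionaryDemo {
    public static void main(String[] args) {
        ListDictionary<String, String> d = new ListDictionary<>();
        Entry<String, String> first = d.put("cool", "awesome");
        d.put("cool", "chilly");
        for (Entry<String, String> e : d.getAll("cool")) {
            System.out.println(e.key() + " -> " + e.value());
        }
        d.remove(first);  // removes only the "awesome" Entry
        System.out.println(d.size());
    }
}

interface Entry<K, V> { K key(); V value(); }

// Hypothetical list-backed Dictionary; Entrys may share a key.
class ListDictionary<K, V> {
    private static final class Pair<K, V> implements Entry<K, V> {
        private final K key;
        private final V value;
        Pair(K key, V value) { this.key = key; this.value = value; }
        public K key() { return key; }
        public V value() { return value; }
    }

    private final List<Entry<K, V>> data = new ArrayList<>();

    // Always adds a new Entry, even if the key is already present.
    Entry<K, V> put(K key, V value) {
        Entry<K, V> e = new Pair<>(key, value);
        data.add(e);
        return e;
    }

    // Returns some Entry with this key, or null when there is none.
    Entry<K, V> get(K key) {
        for (Entry<K, V> e : data) {
            if (e.key().equals(key)) return e;
        }
        return null;
    }

    // Returns every Entry with this key (possibly none).
    Iterable<Entry<K, V>> getAll(K key) {
        List<Entry<K, V>> matches = new ArrayList<>();
        for (Entry<K, V> e : data) {
            if (e.key().equals(key)) matches.add(e);
        }
        return matches;
    }

    // Removes one specific Entry; returns it, or null if it was not stored here.
    Entry<K, V> remove(Entry<K, V> e) {
        return data.remove(e) ? e : null;
    }

    int size() { return data.size(); }
}
```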
Ordered List-Based Approach
Idea normally imagined w/ Map & Dictionary
  Maintains ordered list of key-value pairs
  Must maintain Entrys ordered by their key
  Faster searching provides performance win
Q: “Mom, how do I spell _______?” A: “Look it up.”
Efficiency gains not just for get & getAll
  Entrys with same key stored in any order
  Only requires that keys be in order
Ordered List-Based Approach
Iterators should respect ordering of Entrys
  Should not be a problem, if Entrys stored in order
If O(1) access time, search time is O(log n)
  Array-based structure required to hold Entrys
  To get immediate access, needs to access by index
  Requires IndexList-based implementation
Binary Search
Finds key using divide-and-conquer approach
  First of many times you will be seeing this approach
Algorithm solves the problem using recursion
  Base case 1: No Entrys remain to find the key
  Base case 2: At data’s midpoint is matching key
  Recursive Step 1: If midpoint too high, use lower half
  Recursive Step 2: Use upper half, if midpoint too low
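Binary Search
low and high params specifying range to check
  Would be called with 0 & size() – 1, initially
  If l > h, no match possible in this data
  Compare with key at midpoint of low & high
Consider steps for find(7), assuming m = (l + h) / 2 with integer division:
  index:   0   1   2   3   4   5   6   7   8   9  10  11
  key:     1   3   4   5   7   8   9  11  14  16  18  19
  Step 1: l = 0, h = 11, m = 5; key 8 is too high, so use lower half
  Step 2: l = 0, h = 4, m = 2; key 4 is too low, so use upper half
  Step 3: l = 3, h = 4, m = 3; key 5 is too low, so use upper half
  Step 4: l = m = h = 4; key 7 matches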
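The two base cases and two recursive steps can be sketched on a plain int array as follows. The class and method names are assumptions, and returning -1 for "not found" is a convention chosen here because an int index cannot be null.

```java
class BinarySearchDemo {
    // Returns the index of key within sorted data[low..high], or -1 if absent.
    static int find(int[] data, int key, int low, int high) {
        if (low > high) {
            return -1;                               // base case 1: nothing left to check
        }
        int mid = low + (high - low) / 2;            // midpoint, written to avoid int overflow
        if (data[mid] == key) {
            return mid;                              // base case 2: midpoint matches
        } else if (data[mid] > key) {
            return find(data, key, low, mid - 1);    // midpoint too high: lower half
        } else {
            return find(data, key, mid + 1, high);   // midpoint too low: upper half
        }
    }

    public static void main(String[] args) {
        int[] data = {1, 3, 4, 5, 7, 8, 9, 11, 14, 16, 18, 19};
        System.out.println(find(data, 7, 0, data.length - 1));  // prints "4"
        System.out.println(find(data, 6, 0, data.length - 1));  // prints "-1"
    }
}
```

Each call halves the range still under consideration, which is where the O(log n) search time comes from.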
Using Ordered Sequence
get uses binary search; takes O(log n) time
  Should also start with binary search for getAll()
  getAll checks neighbors to find all matches
Add and remove methods could use binary search
  List shifts elements in put to make hole for element
  Would also need to do shift when removing from list
  Each takes O(n) total time in worst case as a result
Example keys: 8 8 10 10 10 10 16 19 22 99
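A sketch of both ideas on a sorted list of Integer keys, using the example keys above. The names OrderedKeysDemo, search, rangeOf, and insert are hypothetical, and java.util.ArrayList stands in for the course's index-based list. rangeOf mirrors getAll: binary search lands on some match, then neighbors are checked in both directions. insert uses binary search to find the slot, and ArrayList.add(index, element) performs the O(n) shift.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OrderedKeysDemo {
    // Iterative binary search: index of some occurrence of key, or -1.
    static int search(List<Integer> keys, int key) {
        int low = 0, high = keys.size() - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            int k = keys.get(mid);
            if (k == key) return mid;
            if (k > key) high = mid - 1; else low = mid + 1;
        }
        return -1;
    }

    // Returns {first, last} indices holding key, or null when key is absent.
    static int[] rangeOf(List<Integer> keys, int key) {
        int i = search(keys, key);
        if (i < 0) return null;
        int first = i, last = i;
        while (first > 0 && keys.get(first - 1) == key) first--;              // check left neighbors
        while (last < keys.size() - 1 && keys.get(last + 1) == key) last++;   // check right neighbors
        return new int[] { first, last };
    }

    // Inserts key after any equal keys, keeping the list sorted.
    static void insert(List<Integer> keys, int key) {
        int low = 0, high = keys.size();
        while (low < high) {
            int mid = low + (high - low) / 2;
            if (keys.get(mid) <= key) low = mid + 1; else high = mid;
        }
        keys.add(low, key);  // shifts later elements right to make the hole
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>(Arrays.asList(8, 8, 10, 10, 10, 10, 16, 19, 22, 99));
        System.out.println(Arrays.toString(rangeOf(keys, 10)));  // prints "[2, 5]"
        insert(keys, 12);
        System.out.println(keys);
    }
}
```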
Comparing Keys
For all searching, must find matching keys
  Cannot rely upon equals() when ordering
Want to be lazy, write code for all types of key
  Use <, >, == if keys numeric type, but this is limiting
  String also has simple method: compareTo()
  General way of comparing keys based upon this idea?
Comparable<E> Interface
In Java as a standard from java.lang
Defines single method used for comparison
  compareTo(E obj) compares instance with obj
  Returns int which is either negative, zero, positive
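As an illustration, a class can opt in to ordering by implementing Comparable. The Account class below is made up for this sketch; Integer.compare and String.compareTo are standard library methods.

```java
class ComparableDemo {
    public static void main(String[] args) {
        Account a = new Account(1001);
        Account b = new Account(2002);
        System.out.println(a.compareTo(b) < 0);            // prints "true": a orders before b
        System.out.println(b.compareTo(a) > 0);            // prints "true": b orders after a
        System.out.println("apple".compareTo("banana") < 0);  // prints "true": String is Comparable too
    }
}

// Hypothetical key type ordered by its account number.
final class Account implements Comparable<Account> {
    final int number;

    Account(int number) { this.number = number; }

    @Override
    public int compareTo(Account other) {
        // Negative if this orders first, zero if equal, positive if this orders last.
        return Integer.compare(number, other.number);
    }
}
```

Only the sign of the result is meaningful; callers should test for < 0, == 0, or > 0 and not for specific values such as -1 or 1.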
Ordered Sequence Example
Easiest to require that keys be Comparable
  Now reuse class anywhere by adding interface
  Also use standard types like String & Integer
compareTo() in binary search makes it simple

int c = k.compareTo(list.get(m).getKey());
if (c > 0) {
  return binarySearch(k, m + 1, h);
} else if (c < 0) {
  return binarySearch(k, l, m - 1);
} else {
  return m;
}
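The fragment above omits the method header, the base case, and the midpoint computation. One way to fill those in is sketched below. Everything outside the slide's lines is an assumption: the list is passed as a parameter, -1 signals "not found", the Entry interface here uses getKey() to match the slide's code (the earlier Entry slide names it key()), and the entry helper exists only to build test data.

```java
import java.util.ArrayList;
import java.util.List;

class GenericSearchDemo {
    public static void main(String[] args) {
        List<Entry<String, Integer>> list = new ArrayList<>();
        list.add(OrderedList.entry("bitters", 1));    // Entrys added in increasing key order
        list.add(OrderedList.entry("bourbon", 2));
        list.add(OrderedList.entry("vermouth", 3));
        System.out.println(OrderedList.binarySearch(list, "bourbon", 0, list.size() - 1));  // prints "1"
        System.out.println(OrderedList.binarySearch(list, "gin", 0, list.size() - 1));      // prints "-1"
    }
}

interface Entry<K, V> { K getKey(); V getValue(); }

class OrderedList {
    // Works for any key type that implements Comparable.
    static <K extends Comparable<K>, V> int binarySearch(List<Entry<K, V>> list, K k, int l, int h) {
        if (l > h) {
            return -1;                    // no Entrys remain: key is not present
        }
        int m = l + (h - l) / 2;          // midpoint of the current range
        int c = k.compareTo(list.get(m).getKey());
        if (c > 0) {
            return binarySearch(list, k, m + 1, h);   // key orders after midpoint
        } else if (c < 0) {
            return binarySearch(list, k, l, m - 1);   // key orders before midpoint
        } else {
            return m;                                  // keys compare equal
        }
    }

    // Hypothetical helper that builds an Entry for the demo.
    static <K, V> Entry<K, V> entry(K k, V v) {
        return new Entry<K, V>() {
            public K getKey() { return k; }
            public V getValue() { return v; }
        };
    }
}
```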
What is a Map/Dictionary?
At simplest level, both are collections of Entrys
Focus on transforming data (or so it appears)
  Add data with key and value to which it is transformed
  Accessor transforms key to value associated with key
  remove() used to delete an Entry
At most one value per key using a Map
  With Dictionary, multiple values per key possible
Before Next Lecture…
Week #4 assignment due Tuesday at 5PM
Continue to do reading in your textbook
  Learn more about hash & what it means in CSC
  How can we tell if a hash is any good?
  Hash tables sound cool, but how do we make them?
Monday is when lab project phase #1 due
  Will have time in lab, but then will be the weekend
Project #1 available tonight after lab
  Will be due in parts to “encourage” good habits