linked hash presentation
DESCRIPTION
A presentation on the linked hash data structure given to a class.
TRANSCRIPT
An STL-Compatible Hybrid of Linked List and Hash Map
William Nagel, Dr. Dobb’s Journal, August 2005
Presentation by Phil Ulrich
The Requirements
Need a data structure that provides very high random-access performance.
Also need a data structure that provides high performance while iterating through it.
The project in the article was in C++, so the STL was the natural assumption, but the STL was insufficient.
Shortcomings of the Hash Map
Constant-time random access in the average case (one key = one value), so it meets one requirement
However, the hash map begins degrading if one key has multiple values
Furthermore, iteration is proportional to the number of keys AND the number of values
Shortcomings of Maps, Arrays and Linked Lists
The map only offers “decent” performance, so it is out of the question
Arrays and linked lists have decent iterative access, but random access requires linear search
Furthermore, an array would require costly resizing functions as items are added or deleted
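To make the linear-search cost concrete, here is a minimal sketch of keyed lookup in a std::list of key/value pairs. The Entry alias and the find_by_key helper are hypothetical names chosen for illustration; they do not come from the article.

```cpp
#include <algorithm>
#include <list>
#include <string>
#include <utility>

// Hypothetical element type: a key paired with its value.
using Entry = std::pair<std::string, int>;

// Look up a key in a linked list. std::find_if walks the list node by
// node, so the cost grows linearly with the number of elements.
// Returns a pointer to the value, or nullptr if the key is absent.
inline const int* find_by_key(const std::list<Entry>& entries,
                              const std::string& key) {
    auto it = std::find_if(entries.begin(), entries.end(),
                           [&key](const Entry& e) { return e.first == key; });
    return it == entries.end() ? nullptr : &it->second;
}
```

Iterating the same list from begin() to end() touches each element once, which is why sequential access is acceptable even though keyed access is not.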
Linked List + Hash Map = linked_hash
The linked_hash is a hybrid of a linked list and a hash map, as its name implies.
It allows elements to be accessed in constant time using a key.
In addition, the structure supports linear-time iteration through the elements by storing them as a linked list.
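The idea can be sketched in a few lines. This is an illustration under assumptions, not the article's code: TinyLinkedHash is a hypothetical name, keys and values are fixed to std::string and int for brevity, and std::unordered_map stands in for the GNU hash_map.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical minimal hybrid: a list keeps the elements for fast
// iteration, and a hash table maps each key to its position in the list.
// std::unordered_map stands in here for the GNU hash_map.
class TinyLinkedHash {
public:
    using Entry = std::pair<std::string, int>;
    using iterator = std::list<Entry>::iterator;

    // Insert a key/value pair; returns false if the key already exists.
    bool insert(const std::string& key, int value) {
        if (index_.count(key) != 0) return false;
        items_.emplace_back(key, value);
        index_[key] = std::prev(items_.end());
        return true;
    }

    // Average constant-time lookup through the hash table.
    iterator find(const std::string& key) {
        auto it = index_.find(key);
        return it == index_.end() ? items_.end() : it->second;
    }

    // Linear-time traversal walks the list only, never the hash buckets.
    iterator begin() { return items_.begin(); }
    iterator end() { return items_.end(); }
    std::size_t size() const { return items_.size(); }

private:
    std::list<Entry> items_;
    std::unordered_map<std::string, iterator> index_;
};
```

Storing list iterators in the hash table works because std::list iterators stay valid when other elements are inserted or erased.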
Building the linked_hash
The linked_hash follows the interface for the GNU STL hash_map.
The class, while behaving like a linked list, does not support LL-specific functions such as push_front(), push_back(), or insertion at a specific position.
Building the linked_hash (cont.)
The values are not stored directly in the internal list or hash_map, but in a _linked_hash_node structure.
The internal list and hash_map then store pointers to nodes.
The class provides an iterator for moving easily through the linked list. Elements can also be accessed by key.
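A rough sketch of the shared-node arrangement follows. The Node layout and the NodeStore class are assumptions made for illustration; the actual fields of _linked_hash_node are not given in this presentation, and std::unordered_map again stands in for hash_map.

```cpp
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical node layout: the real _linked_hash_node may differ.
struct Node {
    std::string key;
    int value;
};

// Both containers hold pointers to the same heap-allocated nodes, so a
// value changed through one view is visible through the other.
class NodeStore {
public:
    NodeStore() = default;
    NodeStore(const NodeStore&) = delete;             // nodes are owned here,
    NodeStore& operator=(const NodeStore&) = delete;  // so forbid copying
    ~NodeStore() {
        for (Node* n : order_) delete n;
    }

    // Allocate one node and register it in both containers.
    // Returns nullptr, leaving the store unchanged, if the key exists.
    Node* add(const std::string& key, int value) {
        if (by_key_.count(key) != 0) return nullptr;
        Node* n = new Node{key, value};
        by_key_[key] = n;
        order_.push_back(n);
        return n;
    }

    // Keyed access: average constant time via the hash table.
    Node* get(const std::string& key) const {
        auto it = by_key_.find(key);
        return it == by_key_.end() ? nullptr : it->second;
    }

    // Sequential access: iterate the list of node pointers.
    const std::list<Node*>& in_order() const { return order_; }

private:
    std::list<Node*> order_;
    std::unordered_map<std::string, Node*> by_key_;
};
```

Because each value lives in exactly one node, neither container needs its own copy of the data; the cost is one extra pointer indirection per access.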
Building the linked_hash (cont.)
The parameters match those of the hash_map structure: key, data type, hash function, class for testing equality of two keys, and allocator class.
The constructor also matches that of the hash_map.
For many simple functions, such as size(), empty(), bucket_count(), and so on, the underlying hash_map function is called.
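The parameter list and the delegation pattern might look roughly like the skeleton below. It is a sketch under assumptions: linked_hash_skeleton is a hypothetical name, std::unordered_map stands in for the GNU hash_map, the allocator default is written for std::unordered_map's element type, and bucket_count() is used as the bucket query because std::unordered_map provides it.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical skeleton showing the five template parameters and how
// simple queries forward to the underlying hash table.
template <class Key,
          class T,
          class HashFcn = std::hash<Key>,
          class EqualKey = std::equal_to<Key>,
          class Alloc = std::allocator<std::pair<const Key, T>>>
class linked_hash_skeleton {
public:
    using map_type = std::unordered_map<Key, T, HashFcn, EqualKey, Alloc>;
    using size_type = typename map_type::size_type;

    // The constructor mirrors the hash table's: an initial bucket count
    // plus the hash and equality functors.
    explicit linked_hash_skeleton(size_type n = 16,
                                  const HashFcn& hf = HashFcn(),
                                  const EqualKey& eq = EqualKey())
        : map_(n, hf, eq) {}

    // Simple queries delegate to the underlying hash table.
    size_type size() const { return map_.size(); }
    bool empty() const { return map_.empty(); }
    size_type bucket_count() const { return map_.bucket_count(); }

    // Minimal insert so the skeleton can be exercised.
    // Returns false if the key was already present.
    bool insert(const Key& key, const T& value) {
        bool added = map_.emplace(key, value).second;
        if (added) order_.push_back(key);
        return added;
    }

private:
    map_type map_;
    std::list<Key> order_;  // iteration order, tracked by key in this sketch
};
```

Forwarding these queries keeps the wrapper thin: the hash table already knows its own size and bucket layout, so the hybrid has no need to track them separately.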
insert(), erase(), and operator[]
insert() and erase() are interesting because they have to add something to the hash and the list. This is, in general, done to the internal hash_map first, then the internal linked list.
operator[] supports both access and insertion: it first checks whether the item exists, adds it if it does not, and then returns a reference to the value.
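The three operations can be sketched as follows, updating the hash table first and the list second as described above. Everything here is an assumption made for illustration: the LinkedHashSketch name, the node layout (including the stored list position), the use of std::unique_ptr for ownership, and std::unordered_map in place of hash_map.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical sketch; names and node layout are assumptions.
template <class Key, class T>
class LinkedHashSketch {
    struct Node;
    using List = std::list<Node*>;
    struct Node {
        Key key;
        T value;
        typename List::iterator pos;  // where this node sits in the list
    };

public:
    // Returns true if a new element was added, false if the key existed.
    bool insert(const Key& key, const T& value) {
        if (map_.find(key) != map_.end()) return false;
        std::unique_ptr<Node> node(new Node{key, value, list_.end()});
        Node* raw = node.get();
        map_.emplace(key, std::move(node));         // hash table first...
        raw->pos = list_.insert(list_.end(), raw);  // ...then the list
        assert(map_.size() == list_.size());        // both views stay in sync
        return true;
    }

    // Returns the number of elements removed (0 or 1).
    std::size_t erase(const Key& key) {
        auto it = map_.find(key);
        if (it == map_.end()) return 0;
        auto pos = it->second->pos;  // remember the list position
        map_.erase(it);              // hash table first (destroys the node)...
        list_.erase(pos);            // ...then unlink from the list
        return 1;
    }

    // Return a reference to the value for key, inserting a
    // default-constructed value first if the key is absent.
    T& operator[](const Key& key) {
        auto it = map_.find(key);
        if (it == map_.end()) {
            insert(key, T());
            it = map_.find(key);
        }
        return it->second->value;
    }

    std::size_t size() const { return map_.size(); }

    // Visit every value in list order.
    template <class Fn>
    void for_each_value(Fn fn) const {
        for (const Node* n : list_) fn(n->value);
    }

private:
    std::unordered_map<Key, std::unique_ptr<Node>> map_;
    List list_;
};
```

Keeping each node's list position inside the node is one way to make erase() avoid a linear scan of the list; the article's implementation may solve this differently.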
Where could this be used?
The linked_hash can be used pretty much anywhere an STL map or hash_map would be used.
This is because it has an STL-compatible interface.
How does it perform?
In the article, three test batteries were performed: insertion tests, modification tests, and iterations.
The linked_hash was never the best performer in any test.
However, no other structure was able to consistently outperform the linked_hash.
The conclusion we can draw from that is that the linked_hash, while not offering the peak performances of some other structures, allows better overall performance.
Where could it be improved?
Ordering: the linked_hash, by default, does not sort items as they are inserted, so iteration does not follow any sorted order. A sorted structure could replace the internal linked list to address this.
Multiples: right now, the linked_hash supports one key = one item. To change this, a hash_set could be used instead of a hash_map.