binary searching. binary search searching an ordered list. first compare the target to the key in...

46
Binary Searching

Upload: aidan-kennerly

Post on 15-Dec-2015

223 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Binary Searching

Page 2: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Binary Search• Searching an ordered list.

• First compare the target to the key in the center of the list.

• If it is smaller, restrict the search to the left half; otherwise restrict the search to the right half, and repeat.

• In this way, at each step we reduce the length of the list to be searched by half.

• Roughly we need lg(n) comparison compared to n in sequential search

Page 3: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Binary Search vs. Sequential• Roughly we need log2(n) comparison compared

to n in sequential search.

Page 4: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Algorithm Development• Our binary search algorithm will use two indices, top and bottom, to

enclose the part of the list in which we are looking for the target key.• At each iteration, we shall reduce the size of this part of the list by about

half.• To keep track of the progress of the algorithm the following assertion need

to be true:“The target key, provided it is present, will be found between the indices bottom and

top, inclusive.”

• We establish the initial correctness of this assertion by setting bottom to 0 and top to the_list.size( ) - 1.

• To do binary search, we first calculate the index mid halfway between bottom and top as mid = (bottom + top)/2

• Next, we compare the target key against the key at position mid and then we change the appropriate one of the indices top or bottom so as to reduce the list to either its bottom or top half.

Page 5: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Algorithm Development• Next, we note that binary search should terminate when top <=

bottom; that is, when the remaining part of the list contains at most one item, providing that we have not terminated earlier by finding the target.

• Finally, we must make progress toward termination by ensuring that the number of items remaining to be searched, top - bottom + 1, strictly decreases at each iteration of the process.

• Several slightly different algorithms for binary search can be written.

Page 6: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Algorithm• Initialization: Set

bottom = 0; top = the list.size( ) - 1;• Compare target with the Record at the midpoint,

mid = (bottom + top)/2;• Change the appropriate index top or bottom to restrict the

search to the appropriate half of the list.• Loop terminates when top <= bottom, if it has not

terminated earlier by finding the target.

Page 7: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Different versions of binary search• The Forgetful Version: binary_search_1( )

forget the possibility that the Key target might be found quickly and continue to subdivide the list until what remains has length 1.

• Recognizing Equality: binary_search_2( )

it seems that the previous one will often make unnecessary iterations because it fails to recognize that it has found the target before continuing to iterate. Thus we might hope to save computer time with a variation that checks at each stage to see if it has found the target.

Page 8: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

recursive_binary_1( )Error_code recursive_binary_1(const Ordered_list &the_list, const Key &target,

int bottom, int top, int &position)

{

Record data;

if (bottom < top) { // List has more than one entry.

int mid = (bottom + top)/2;

the_list.retrieve(mid, data);

if (data < target) // Reduce to top half of list.

return recursive_binary_1 (the_list, target, mid ‡ 1, top, position);

else // Reduce to bottom half of list.

return recursive_binary_1(the_list, target, bottom, mid, position);

}

elseif (top < bottom)

return not_present; // List is empty.

else { // List has exactly one entry.

position = bottom;

the_list.retrieve(bottom, data);

if (data == target)

return success;

else

return not_present;

}

} // end of recursive_binary_1( )

Page 9: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

recursive_binary_1( )recursive_binary_1(the_list, target, bottom, top, position)

{

if (bottom < top) { // List has more than one entry.

Calculate mid; // int mid = (bottom + top)/2;

get data; // the_list.retrieve(mid, data);

if (data < target) // Reduce to top half of list.

call recursive_binary_1 (the_list, target, mid+1, top, position);

else // Reduce to bottom half of list.

call recursive_binary_1 (the_list, target, bottom, mid, position);

}

elseif (top < bottom)

return not_present; // List is empty.

else { // top == bottom; List has exactly one entry.

position = bottom;

get data; // the_list.retrieve(bottom, data);

if (data == target) // Once we arrived to one entry

return success; // we are doing equality comparison

else

return not_present;

}

} // end of recursive_binary_1( )

Page 10: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

binary_search_1( )

Error_code binary_search_1 (const Ordered_list &the_list,

const Key &target, int &position)

{

Record data;

int bottom = 0, top = the_list.size( ) - 1;

while (bottom < top) {

int mid = (bottom + top)/2;

the_list.retrieve(mid, data);

if (data < target)

bottom = mid + 1;

else

top = mid;

}

if (top < bottom) return not_present;

else {

position = bottom;

the_list.retrieve(bottom, data);

if (data == target) return success;

else return not_present;

}

}

One element

Everytime we fetch mid data we only compare with target.

We are performing only one comparisons of each mid

Page 11: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

binary_search_1( )

The division of the list into sublists is described in the following diagram: for each mid

if (data < target)

bottom = mid + 1;

else

top = mid;

Page 12: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

recursive_binary_2( )Error_code recursive_binary_2(const Ordered_list &the_list,

const Key &target,int bottom, int top, int &position)

{

Record data;

if (bottom <= top) {

int mid = (bottom + top)/2;

the_list.retrieve(mid, data);

if (data == target) {

position = mid;

return success;

}

else if (data < target)

return recursive_binary_2(the_list, target, mid+1, top, position);

else

return recursive_binary_2(the_list, target, bottom, mid-1, position);

}

else return not_present;

}

Every time we fetch mid data we compare for equality with target.

We are performing two comparisons of each mid

Page 13: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

binary_search_2( )Error_code binary_search_2(const Ordered_list &the_list,

const Key &target, int &position)

{

Record data;

int bottom = 0, top = the_list.size( ) - 1;

while (bottom <= top) {

position = (bottom + top)/2;

the_list.retrieve(position, data);

if (data == target) return success;

if (data < target) bottom = position + 1;

else top = position - 1;

}

return not_present;

}

Everytime we fetch mid data we compare for equality with target.

We are performing two comparisons of each mid

Page 14: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

binary_search_2( )The division of the list into sublists is described in the following diagram: for each mid

if (data == target) return success;

if (data < target) bottom = position + 1;

else top = position - 1;

Page 15: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Comparison TreesThe comparison tree of an algorithm is obtained

by tracing the action of the algorithm, representing each comparison of keys by a vertex of the tree (which we draw as a circle). Inside the circle we put the index of the key against which we are comparing the target key.

The number of comparisons done by an algorithm in a particular search is the number of internal vertices traversed in going from the top of the tree, called its root, down the appropriate path to a leaf.

Page 16: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Comparison Tree BranchRoot

internal vertices

Leaf

Page 17: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Level & height • The number of branches traversed to reach a vertex

from the root is called the level of the vertex. Thus the root itself has level 0, the vertices immediately below it have level 1, and so on.

• The largest level that occurs is called the height of the tree.

Page 18: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

path length• The external path length of a tree is the sum of

the number of branches traversed in going from the root once to every leaf in the tree.

• The internal path length is defined to be the sum, over all vertices that are not leaves, of the number of branches from the root to the vertex.

• We call the vertices immediately below a vertex v the children of v and the vertex immediately above v the parent of v.

Page 19: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Comparison tree for sequential search

If it is an unsuccessful search

the number of comparisons

is n, that is the

internal path length + 1

of the tree.

Page 20: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Comparison tree for binary_1( ); Forgetful Version; n = 10

Targetless thankey at mid:Search from1 to mid.

Targetgreater thankey at mid:Search frommid +1 to top.

5

><=

Page 21: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Comparison tree for binary_1( ) Forgetful Version

if (target >mid data)

mid = (6+10)/2mid = (1+5)/2

Page 22: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Comparison tree for binary_2( ), n = 10

Page 23: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Comparison tree for binary_2( ), n = 10

Page 24: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Comparison Count for binary_search_1; n = 10

• In binary_search_1, every search terminates at a leaf; to obtain the average number of comparisons for both successful and unsuccessful searches, we need what is called the external path length of the tree: the sum of the number of branches traversed in going from the root once to every leaf in the tree.

the external path length is:

(4 x 5) + (6 x 4) + (4 x 5) + (6 x 4) = 88

• Half the leaves correspond to successful searches, and half to unsuccessful searches.

• Hence the average number of comparisons needed for either a successful or unsuccessful search by binary_search_1( ) is 44/10 = 4.4 when n = 10.

Page 25: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

external path length for binary_1( ) Forgetful Version

6 x 4)4 x 5

4 x 5

6 x 4)

Page 26: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Comparison Count for binary_search_2; n = 10

• In binary_search_2, all the leaves correspond to unsuccessful searches; hence the external path length leads to the number of

comparisons for an unsuccessful search. the external path length is:

(5 x 3) + (6 x 4) = 39

• We shall assume for unsuccessful searches that the n ‡ 1 intervals (less than the first key, between a pair of successive keys, or greater than the largest) are all equally likely; for the diagram we therefore assume that any of the 11 failure leaves are equally likely. Thus the average number of comparisons for an unsuccessful search is

(2 x 39)/11 = 7.1

Page 27: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

external path length for binary_2( ) Equality Version

5 x 3) 6 x 4

Page 28: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Comparison Count for binary_search_2; n = 10

• For successful searches, we need the internal path length, which is defined to be the sum, over all vertices that are not leaves, of the number of branches from the root to the vertex.

the internal path length (no. of branches) is:

0 + 1 + 2 + 2 + 3 + 1 + 2 + 3 + 2 + 3 = 19

• Recall that binary_search_2 does two comparisons for each non-leaf except for the vertex that finds the target, and note that the number of these internal vertices traversed is one more than the number of branches (for each of the n = 10 internal vertices). We thereby obtain the average number of comparisons for a successful search to be

2 x (19/10 +1) - 1 = 4.8

• The subtraction of 1 corresponds to the fact that one fewer comparison is made when the target is found.

Page 29: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Generalization

• What happens when n is larger than 10?

• For longer lists, it may be impossible to draw the complete comparison tree, but from the examples with n = 10, we can make some observations that will always be true.

Page 30: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

2-Trees A 2-tree is a tree in which every vertex except the

leaves has exactly two children.

• Lemma 7.1 The number of vertices on each level of a 2-tree is at most twice the number on the level immediately above. Hence, in a 2-tree, the number of vertices on level t is at most 2t for t >= 0.

• Lemma 7.2 If a 2-tree has k vertices on level t , then t >= lg k, where lg denotes a logarithm with base 2.

Page 31: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Floor & ceiling The floor of a real number x is the largest integer less

than or equal to x, and the ceiling of x is the smallest integer greater than or equal to x.

We denote:

the floor of x by x and

the ceiling of x by x .

e.g., x = 5.46; floor = 5; ceiling = 6;

Page 32: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Analysis of binary_search_1We can now turn to the general analysis of binary_search_1 on a list of n entries.

The final step done in binary_search_1 is always a check for equality with the target;

hence both successful and unsuccessful searches terminate at leaves, and so there

are exactly 2n leaves altogether.

As illustrated in Figure 7.3 for n = 10, all these leaves must be on the same level or on two adjacent levels. (This observation can be proved by using mathematical induction to establish the following stronger statement:

If T1 and T2 are the comparison trees of binary_search_1 operating on lists L1 and L2 whose lengths differ by at most 1, then all leaves of T1 and T2 are on the same or adjacent levels.

The statement is clearly true when L1 and L2 are lists with length at most 2 ( L1=1 & L2=2). Moreover, if binary_search_1 divides two larger lists

whose sizes differ by at most one, the sizes of the four halves also differ by at

most 1, and the induction hypothesis shows that their leaves are all on the same or

adjacent levels.)

Page 33: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Analysis of binary_search_1From Lemma 7.2 it follows that the maximum level t of leaves in the comparison tree

satisfies t = lg 2n .Since one comparison of keys is done at the root (which is level 0), but no

comparisons are done at the leaves (level t ), it follows that the maximum number

of key comparisons is also t = lg 2n . Furthermore, the maximum number is at most one more than the average number, since

all leaves are on the same or adjacent levels.

maximum number of key comparisons is also t = lg 2n = lg (n x 2)

= lg n +lg 2

= lg n + 1 (worst case)

When n is odd all leaves will be on the same level and

# of comparisons = lg n

Page 34: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

binary search 1

Page 35: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Analysis of binary_search_2, Unsuccessful Search

• To count the comparisons made by binary_search_2 for a general value of n for an unsuccessful search, we shall examine its comparison tree.

• For reasons similar to those given for binary_search_1, this tree is again full at the top, with all its leaves on at most two adjacent levels at the bottom.

• For binary_search_2, all the leaves correspond to unsuccessful searches, so there are exactly n + 1 leaves, corresponding to the n + 1 unsuccessful outcomes: less than the smallest key, between a pair of keys, and greater than the largest key.

Page 36: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Analysis of binary_search_2, Unsuccessful Search

• Since these leaves are all at the bottom of the tree, Lemma 7.1 implies that the number of leaves is approximately 2h , where h is the height of the tree, 2h = n + 1

Taking (base 2) logarithms, we obtain that h lg(n + 1).

This value is the approximate distance from the root to one of the leaves.

• Since, in binary_search_2, two comparisons of keys are performed for each internal vertex, the number of comparisons done in an unsuccessful search is approximately

2 lg(n + 1)

Page 37: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Binary search 2, Unsuccessful Search

Page 38: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Analysis of binary_search_2, Successful Search

• In the comparison tree of binary_search_2, the distance to the leaves is lg(n + 1), as we have seen. The number of leaves is n + 1, so the external path length (E) is about, no. of leaves x height:

E = (n + 1)lg(n + 1)• Theorem 7.3 (E = I + 2q ) then shows that the internal path length

is about

I = (n + 1)lg(n + 1)- 2n

• To obtain the average number of comparisons done in a successful search, we must first divide by n (the number of non-leaves) and then add 1 and double, since two comparisons were done at each internal node. Finally, we subtract 1, since only one comparison is done at the node where the target is found. The result is:

E = I + 2qI = E - 2q

Page 39: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Analysis of binary_search_2, Successful Search

Average number of comparisons = ( Internal path length / no. of vertices +1) 2 -1

[+1 because every internal path length (x) has (x+1) vertices]

= ((n + 1)lg(n + 1)- 2n)/n +1)2 -1= (2(n + 1)lg(n + 1))/n -2 -1= (2(n + 1)lg(n + 1))/n -3

if n is bigaverage number of comparisons = (2 lg n)/n -3

Page 40: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Binary search 2

Page 41: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

The Path-Length Theorem

Page 42: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Proof

• To prove the theorem we use the method of mathematical induction, using the number of vertices in the tree to do the induction.

• If the tree contains only its root, and no other vertices, then E = I = q = 0, and the base case of the theorem is trivially correct.

• Now take a larger tree, and let v be some vertex that is not a leaf, but for which both the children of v are leaves.

• Let k be the number of branches on the path from the root to v. See Figure 7.6.

Page 43: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Proof

• Now let us delete the two children of v from the 2-tree. Since v is not a leaf but its children are, the number of non-leaves goes down from q to q - 1.

• The internal path length I is reduced by the distance to v (= k) ; that is, to I - k. The distance to each child of v is k + 1, so the external path length is reduced from E to E - 2(k + 1),

• but v is now a leaf, so its distance, k, must be added, giving a new external path length of

E - 2(k + 1) + k = E - k - 2:

• Since the new tree has fewer vertices than the old one, by the induction hypothesis we know that

E - k - 2 = (I - k) + 2(q - 1):

• Rearrangement of this equation gives the desired result. end of proof

Page 44: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Comparison of Methods The number of comparisons of keys done by binary searches in

searching a list of n items is approximately:

Search-1:

Successful &

Unsuccessful

Search-2:

Successful &

Unsuccessful

If high value of n:

Page 45: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

Comparison of Methods

for even higher value of n:

Page 46: Binary Searching. Binary Search Searching an ordered list. First compare the target to the key in the center of the list. If it is smaller, restrict the

interpolation search• To find the target key target, interpolation search then estimates,

according to the magnitude of the number target relative to the first and last entries of the list, about where target would be in the list and looks there. It then reduces the size of the list according as target is less than or greater than the key examined.

• It can be shown that on average, with uniformly distributed keys, interpolation search will take about lg lg n comparisons of keys, which, for large n, is somewhat fewer than binary search requires.

• If, for example, n = 1,000,000, then binary_search_1 will require about lg 106 + 1 21 comparisons,

while interpolation search may need only about lg lg 106 4.32 comparisons.