Uniform Binary Search
I came across the uniform binary search algorithm in The Art Of Computer Programming – Volume 3: Sorting and Searching as an optimization of the binary search, invented by the author, Donald Knuth. It works quite well for architectures on which a table lookup is generally faster than an addition and a shift, and also when many searches will be performed on the same array, or on several arrays of the same length.
As compared to the traditional binary search, in which a midpoint of upper and lower bounds is taken, a lookup table is used to update a single array index. This makes it suitable for architectures such as Knuth’s MIX and MMIX.
The algorithm is based on the idea that instead of having a upper bound, a lower bound and a midpoint, such as in the binary search, we can use just two pointers: one for the current position i and one for its rate of change δ. After each unequal comparison, we could then set i = i ± δ and δ = δ/2. A simple version, applying just this basis can be:
- Given a table of records R1, R2, …, RN whose keys are in the in the increasing order K1<K2<…<KN, the algorithm searches for a given argument K, assuming N≥1.
- Set i = ceil(N/2), m = floor(N/2)
- If K < Ki, go to step 4; if K > Ki, go to step 5; and if K = Ki, the algorithm terminates successfully.
- (We have pinpointed the search to an interval that contains either m or m – 1 records; i points just to the right of this interval.)
If m = 0, the algorithm terminates unsuccessfully. Otherwise, set i = i – ceil(m/2); then set m = floor(m/2) and return to step 3. - (We have pinpointed the search to an interval that contains either m or m – 1 records; i points just to the left of this interval.)
If m = 0, the algorithm terminates unsuccessfully. Otherwise, set i = i + ceil(m/2); then set m = floor(m/2) and return to step 3.
The advantage of this algorithm is that we need not maintain the value of m at all; we only need to refer to a short table of the various δ to use at each level of the tree. Thus the algorithm reduces to the following one (with a lookup table)
- The table entries are DELTA[j] = floor( (N + 2j-1) / 2j ), for 1 ≤ j ≤ floor(lg n) + 2. As mentioned in the previous post, lg n implies log2 n.
- Assuming we have 1-based array indexing, set i = DELTA[1], j = 2.
- If K < Ki, go to step 4; if K > Ki, go to step 5; and if K = Ki, the algorithm terminates successfully.
- If DELTA[j] = 0, the algorithm terminates unsuccessfully. Otherwise, set i = i – DELTA[j], j = j + 1, and go to step 3.
- If DELTA[j] = 0, the algorithm terminates unsuccessfully. Otherwise, set i = i + DELTA[j], j = j – 1, and go to step 3.
The implementation of this algorithm in C is:
#define LOG_N 42 static int delta[LOG_N]; void make_delta(int N) { int power = 1; int i = 0; do { int half = power; power <<= 1; delta[i] = (N + half) / power; } while (delta[i++] != 0); } int unisearch(int *a, int key) { int i = delta[0]-1; /* midpoint of array */ int d = 0; while (1) { if (key == a[i]) { return i; } else if (delta[d] == 0) { return -1; } else { if (key < a[i]) { i -= delta[++d]; } else { i += delta[++d]; } } } } /* Example of use: */ #define N 10 int main(void) { int i, a[N] = {1,3,5,6,7,9,14,15,17,19}; make_delta(N); for (i=0; i < 20; ++i) printf("%d is at index %d\n", i, unisearch(a, i)); return 0; }
The comparison of Uniform Binary Search and Binary Search is shown in the graph below:
This graph show how data of 100 numbers searched for each number in the data list. And it plots number of passes of binary search and uniform binary search.