Searching Algorithms – Part 3

Uniform Binary Search

I came across the uniform binary search algorithm in The Art Of Computer Programming – Volume 3: Sorting and Searching as an optimization of the binary search, invented by the author, Donald Knuth. It works quite well for architectures on which a table lookup is generally faster than an addition and a shift, and also when many searches will be performed on the same array, or on several arrays of the same length.

As compared to the traditional binary search, in which a midpoint of upper and lower bounds is taken, a lookup table is used to update a single array index. This makes it suitable for architectures such as Knuth’s MIX and MMIX.

The algorithm is based on the idea that instead of having a upper bound, a lower bound and a midpoint, such as in the binary search, we can use just two pointers: one for the current position i and one for its rate of change δ. After each unequal comparison, we could then set i = i ± δ and δ = δ/2. A simple version, applying just this basis can be:

  1. Given a table of records R1, R2, …, RN whose keys are in the in the increasing order K1<K2<…<KN, the algorithm searches for a given argument K, assuming N≥1.
  2. Set i = ceil(N/2), m = floor(N/2)
  3. If K < Ki, go to step 4; if K > Ki, go to step 5; and if K = Ki, the algorithm terminates successfully.
  4. (We have pinpointed the search to an interval that contains either m or m – 1 records; i points just to the right of this interval.)
    If m = 0, the algorithm terminates unsuccessfully. Otherwise, set i = i – ceil(m/2); then set m = floor(m/2) and return to step 3.
  5. (We have pinpointed the search to an interval that contains either m or m – 1 records; i points just to the left of this interval.)
    If m = 0, the algorithm terminates unsuccessfully. Otherwise, set i = i + ceil(m/2); then set m = floor(m/2) and return to step 3.

The advantage of this algorithm is that we need not maintain the value of m at all; we only need to refer to a short table of the various δ to use at each level of the tree. Thus the algorithm reduces to the following one (with a lookup table)

  1. The table entries are DELTA[j] = floor( (N + 2j-1) / 2j ), for 1 ≤ j ≤ floor(lg n) + 2. As mentioned in the previous post, lg n implies log2 n.
  2. Assuming we have 1-based array indexing, set i = DELTA[1], j = 2.
  3. If K < Ki, go to step 4; if K > Ki, go to step 5; and if K = Ki, the algorithm terminates successfully.
  4. If DELTA[j] = 0, the algorithm terminates unsuccessfully. Otherwise, set i = i – DELTA[j], j = j + 1, and go to step 3.
  5. If DELTA[j] = 0, the algorithm terminates unsuccessfully. Otherwise, set i = i + DELTA[j], j = j – 1, and go to step 3.

The implementation of this algorithm in C is:

#define LOG_N 42

static int delta[LOG_N];

void make_delta(int N)
{
    int power = 1;
    int i = 0;
    do {
        int half = power;
        power <<= 1;
        delta[i] = (N + half) / power;
    } while (delta[i++] != 0);
}

int unisearch(int *a, int key)
{
    int i = delta[0]-1;  /* midpoint of array */
    int d = 0;

    while (1) {
        if (key == a[i]) {
            return i;
        } else if (delta[d] == 0) {
            return -1;
        } else {
            if (key < a[i]) {
                i -= delta[++d];
            } else {
                i += delta[++d];
            }
        }
    }
}

/* Example of use: */
#define N 10
int main(void)
{
    int i, a[N] = {1,3,5,6,7,9,14,15,17,19};
    make_delta(N);
    for (i=0; i < 20; ++i)
      printf("%d is at index %d\n", i, unisearch(a, i));
    return 0;
}

The comparison of Uniform Binary Search and Binary Search is shown in the graph below:

Comparision between Uniform Search and Binary Search

This graph show how data of 100 numbers searched for each number in the data list. And it plots number of passes of binary search and uniform binary search.

Leave a comment