5 Best Ways to Perform Binary Search Bisect in Python

πŸ’‘ Problem Formulation: Binary search is a fast algorithm for finding the position of an element in a sorted array. The bisect operation in Python refers to finding the insertion point for a specified element to maintain the sorted order of the array. Given a sorted array, for example [1, 3, 4, 4, 6, 8, 9], and a target value 4, the desired output would be the index 2 or 3 depending on the search criteria used.

Method 1: Using bisect_left from the bisect module

The bisect_left function from the bisect module helps to find the index of the leftmost position where the target element should be inserted to maintain the sorted order. If the target is already present in the array, bisect_left returns the index of the target’s first occurrence.

Here’s an example:

import bisect

def left_bisect(sorted_array, target):
    return bisect.bisect_left(sorted_array, target)

arr = [1, 3, 4, 4, 6, 8, 9]
target = 4
index = left_bisect(arr, target)
print(index)

Output:

2

This code uses the bisect_left function from the bisect module to determine the first possible position for the number 4 in the sorted list. The result, index 2, indicates that 4 could be inserted before the first occurrence of 4 in the list to maintain sorted order.

Method 2: Using bisect_right from the bisect module

The bisect_right function (also known as bisect.bisect) finds the insertion index to the right of any existing entries. Unlike bisect_left, it returns the index after the last occurrence of the target in the array.

Here’s an example:

import bisect

def right_bisect(sorted_array, target):
    return bisect.bisect_right(sorted_array, target)

arr = [1, 3, 4, 4, 6, 8, 9]
target = 4
index = right_bisect(arr, target)
print(index)

Output:

4

The code snippet uses bisect_right to find the latest position where the target number 4 can be inserted into the array while maintaining its sorted state. The result is 4, showing that 4 can be placed after the last occurrence of 4 in the array.

Method 3: Writing a manual binary search

A manual binary search can be implemented in Python without any external libraries. This method gives flexibility and control over the iteration and comparison process, which can sometimes be necessary for specific requirements.

Here’s an example:

def manual_bisect(sorted_array, target):
    left, right = 0, len(sorted_array) - 1
    while left <= right:
        mid = (left + right) // 2
        if sorted_array[mid]  target:
            right = mid - 1
        else:
            return mid
    return left

arr = [1, 3, 4, 4, 6, 8, 9]
target = 4
index = manual_bisect(arr, target)
print(index)

Output:

2

This custom function simulates binary search by using a while loop to narrow the search domain until the target is found or the domain is empty. In this case, it returns index 2 where the target value 4 is first found in the array.

Method 4: Using numpy.searchsorted for large datasets

For those working with large datasets, NumPy’s searchsorted method provides a more performant binary search, especially for arrays stored in NumPy’s optimized data structures.

Here’s an example:

import numpy as np

def numpy_searchsorted(sorted_array, target):
    return np.searchsorted(sorted_array, target, side='left')

arr = np.array([1, 3, 4, 4, 6, 8, 9])
target = 4
index = numpy_searchsorted(arr, target)
print(index)

Output:

2

The example demonstrates the use of NumPy’s searchsorted function with the side='left' argument to mimic the behavior of bisect_left. The result is the same as the standard library’s equivalent, but this method could be considerably faster on large arrays due to NumPy’s optimizations.

Bonus One-Liner Method 5: Using list comprehensions

A list comprehension with a clever condition can also be used to find an insertion point index using pure Python, without any additional modules.

Here’s an example:

arr = [1, 3, 4, 4, 6, 8, 9]
target = 4
index = next((i for i, x in enumerate(arr) if x >= target), len(arr))
print(index)

Output:

2

This one-liner employs a generator within a list comprehension to iterate through the sorted array and find the position where our target or the next higher number appears. If the target isn’t found and the generator is exhausted, it defaults to returning the length of the array.

Summary/Discussion

  • Method 1: bisect_left. Good for simple left-bound searches. May not perform well with unsorted arrays and doesn’t handle custom compare functions.
  • Method 2: bisect_right. Useful when duplicates are present and you need the position after the last occurrence. Like Method 1, it requires sorting and cannot handle complex comparison functions.
  • Method 3: Manual binary search. Offers full control over comparison logic. Potentially more error-prone and verbose compared to using a library.
  • Method 4: numpy.searchsorted. Best for large datasets due to NumPy’s efficiency. However, it requires the NumPy library, which can be overkill for simple tasks.
  • Method 5: List comprehension one-liner. A creative way to perform the task in a single line of pythonic code. Readability might suffer, and performance can be an issue for large datasets.