5 Best Ways to Find Pairs with Minimized Sum Difference in Python

Rate this post

πŸ’‘ Problem Formulation: Given an array of integers, we seek an efficient program in Python to find two pairs (a, b) and (c, d), such that the absolute difference between the sums of the pairs (a+b) and (c+d) is minimized. For instance, for the input array [1, 3, 4, 7], a desirable output might be the pairs (1, 3) and (4, 7), where the sum difference is |4 – 11| = 7.

Method 1: Brute Force Approach

The brute force approach involves iterating through all possible pairs in the array, calculating the sum differences, and keeping track of the minimal difference found. It’s simple but not efficient for large datasets, with a time complexity of O(n^4).

Here’s an example:

def find_min_diff_pairs_brute(arr):
    min_diff = float('inf')
    n = len(arr)
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(n):
                for l in range(k + 1, n):
                    if i != k and i != l and j != k and j != l:
                        diff = abs((arr[i] + arr[j]) - (arr[k] + arr[l]))
                        if diff < min_diff:
                            min_diff = diff
                            pairs = [(arr[i], arr[j]), (arr[k], arr[l])]
    return pairs

# Example array
print(find_min_diff_pairs_brute([1, 3, 4, 7]))

Output:

[[(1, 3), (4, 7)]]

In this snippet, we iterate over every combination of two pairs in the provided array and calculate their sum differences. If a new minimum difference is found, it is stored along with the corresponding pairs. It’s a straightforward method but impractical for large arrays due to its high computational complexity.

Method 2: Sorting and Pairing

This method improves the efficiency by first sorting the array and then considering pairs in a more structured manner, reducing the time complexity to O(n^2 log n) due to sorting.

Here’s an example:

def find_min_diff_pairs_sorting(arr):
    arr.sort()
    min_diff = float('inf')
    n = len(arr)
    for i in range(n-1):
        for j in range(i+2, n):
            diff = abs((arr[i] + arr[i+1]) - (arr[j-1] + arr[j]))
            if diff < min_diff:
                min_diff = diff
                pairs = [(arr[i], arr[i+1]), (arr[j-1], arr[j])]
    return pairs

# Example array
print(find_min_diff_pairs_sorting([1, 3, 4, 7]))

Output:

[[(1, 3), (4, 7)]]

Here, we sort the array first, and then iterate less by only considering pairs with adjacent elements, which are more likely to yield a smaller difference. The pairs are tracked similarly to the brute force method, but with fewer iterations.

Method 3: Using Heaps for Optimal Pairing

This method involves using a min-heap to efficiently find the closest pairs by sum with a time complexity of O(n^2 log n), leaning on the heap’s ability to quickly find the minimum element.

Here’s an example:

import heapq

def find_min_diff_pairs_heap(arr):
    n = len(arr)
    arr.sort()
    min_diff = float('inf')
    heap = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            sum_pair = arr[i] + arr[j]
            heapq.heappush(heap, (sum_pair, i, j))
    
    while heap:
        sum_pair, i, j = heapq.heappop(heap)
        for k in range(n):
            if k != i and k != j:
                diff = abs(sum_pair - (arr[k] + arr[(k + 1) % n]))
                if diff < min_diff:
                    min_diff = diff
                    pairs = [(arr[i], arr[j]), (arr[k], arr[(k + 1) % n])]
                    break
        if min_diff == 0:  # Early stopping condition if perfect match is found
            break
    return pairs

# Example array
print(find_min_diff_pairs_heap([1, 3, 4, 7]))

Output:

[[(1, 3), (4, 7)]]

In this method, we still sort the array, but instead of nested loops, we use a heap to keep track of the sum of pairs. We then iterate through the array to find a second pair that minimizes the difference with the sum from the top of the heap.

Method 4: Two-Pointer Technique

The two-pointer technique is useful for sorted arrays and can reduce the time complexity to O(n^2). This technique maintains two pointers to pair elements and check their sums, iterating over the array just once.

Here’s an example:

def find_min_diff_pairs_two_pointer(arr):
    arr.sort()
    min_diff = float('inf')
    n = len(arr)
    for i in range(n - 3):
        left = i + 1
        right = n - 1
        while left < right:
            diff = abs((arr[i] + arr[left]) - (arr[right - 1] + arr[right]))
            if diff < min_diff:
                min_diff = diff
                pairs = [(arr[i], arr[left]), (arr[right - 1], arr[right])]
            left += 1
            right -= 1
    return pairs

# Example array
print(find_min_diff_pairs_two_pointer([1, 3, 4, 7]))

Output:

[[(1, 3), (4, 7)]]

This two-pointer technique uses a sorted array to minimize the number of comparisons. We have two pointers starting from the beginning and end of the array interval being considered, and we move them closer to find the minimal sum difference efficiently.

Bonus One-Liner Method 5: Pythonic Approach

For a more Pythonic approach, we can use list comprehensions and the itertools library to write a concise solution, though this does not improve upon the brute force strategy’s complexity.

Here’s an example:

from itertools import combinations

def find_min_diff_pairs_pythonic(arr):
    pairs = list(combinations(arr, 2))
    min_diff, min_pairs = min(
        ((abs(sum(x) - sum(y)), (x, y)) for x, y in combinations(pairs, 2)),
        key=lambda z: z[0]
    )
    return min_pairs

# Example array
print(find_min_diff_pairs_pythonic([1, 3, 4, 7]))

Output:

[((1, 3), (4, 7))]

By leveraging the itertools library, this snippet finds all possible two-number combinations, then finds the pair of these combinations that minimizes the difference in their sums.

Summary/Discussion

  • Method 1: Brute Force Approach. Straightforward and easy to understand. However, it’s unsuitable for larger datasets due to its high complexity.
  • Method 2: Sorting and Pairing. More efficient than the brute force method due to structured pairing after sorting. Still has a significant time complexity due to nested loops.
  • Method 3: Using Heaps. Utilizes heap data structure to find minimal differences more efficiently. This method somewhat improves on the time complexity but still adds overhead due to additional operations with the heap.
  • Method 4: Two-Pointer Technique. Efficient for pre-sorted arrays and significantly reduces the number of operations over brute force approaches. Requires understanding of two-pointer iteration techniques.
  • Bonus Method 5: Pythonic Approach. Elegant and concise but does not offer a performance improvement over Method 1: Brute Force Approach.