5 Best Ways to Find K Largest Sum Pairs in Python

Rate this post

πŸ’‘ Problem Formulation: Finding the k largest sum pairs in a given array involves pairing elements from the array to create possible sums and then identifying the pairs that result in the highest sums. Specifically, in Python, if we’re given an array like [1, 4, 3, 2] and a number k=2, we want to find the two pairs with the largest sums, which would be (4, 3) and (4, 2) offering sums 7 and 6 respectively.

Method 1: Brute Force Approach

This method involves generating all possible sum pairs and then sorting them to pick the top k pairs. It is straightforward but not the most efficient, especially for large arrays. The function takes an array and a number “k” as inputs and returns “k” largest sum pairs.

Here’s an example:

def k_largest_pairs_brute(arr, k):
    all_pairs = []
    for i in range(len(arr)):
        for j in range(i+1, len(arr)):
            all_pairs.append((arr[i]+arr[j], (arr[i], arr[j])))
    all_pairs.sort(reverse=True)
    return [pair[1] for pair in all_pairs[:k]]

print(k_largest_pairs_brute([1, 4, 3, 2], 2))

Output: [(4, 3), (4, 2)]

In this code snippet, we create a list of all possible pairs along with their sums. We then sort this list in reverse order to get the pairs with the largest sums at the front. Finally, we slice the first k elements of the sorted list to get our result.

Method 2: Using a Heap

Heaps are great for keeping track of the k largest elements efficiently. Python’s heapq module can be used for this purpose, specifically its ‘nlargest’ function that returns the k largest elements from a list of pairs ordered by their sum.

Here’s an example:

import heapq

def k_largest_pairs_heap(arr, k):
    return heapq.nlargest(k, ((arr[i], arr[j]) for i in range(len(arr)) for j in range(i+1, len(arr))), key=sum)

print(k_largest_pairs_heap([1, 4, 3, 2], 2))

Output: [(4, 3), (4, 2)]

This code uses a generator expression to create an iterator over all unique pairs, without creating an intermediate list of all pairs, which can save memory. The heapq.nlargest() function takes a key function, which in this case is the built-in sum, which computes the sum of each pair.

Method 3: Pre-sorting Array

By first sorting the array, we can iterate through less combinations, as once we’ve found k pairs, subsequent pairs will necessarily be smaller. Nonetheless, this method still requires handling duplicates carefully.

Here’s an example:

def k_largest_pairs_presort(arr, k):
    arr.sort(reverse=True)
    result = []
    for i in range(k):
        for j in range(i+1, k):
            result.append((arr[i], arr[j]))
    return result[:k]

print(k_largest_pairs_presort([1, 4, 3, 2], 2))

Output: [(4, 3), (4, 2)]

First, the array is sorted in descending order. Afterwards, we iterate over the first k elements twice, collecting pairs as we go. The result may contain duplicates and can possibly be more than k pairs, so we return the first k unique pairs.

Method 4: Using a Priority Queue

A priority queue can manage a running selection of the k-largest elements efficiently as new sums are generated. This method uses Python’s built-in PriorityQueue class to automatically sort new pairs by their sum as they are generated.

Here’s an example:

from queue import PriorityQueue

def k_largest_pairs_pq(arr, k):
    pairs_queue = PriorityQueue()
    for i in range(len(arr)):
        for j in range(i+1, len(arr)):
            pairs_queue.put((-1 * (arr[i] + arr[j]), (arr[i], arr[j])))
            if pairs_queue.qsize() > k:
                pairs_queue.get()
    result = []
    while not pairs_queue.empty():
        result.append(pairs_queue.get()[1])
    return result[::-1]  # Return in correct order

print(k_largest_pairs_pq([1, 4, 3, 2], 2))

Output: [(4, 3), (4, 2)]

This function inserts negative sums into the priority queue because Python’s Priority Queue is a min-heap, and we are looking for the largest sums. The queue never holds more than k elements, so we extract the k-largest sum pairs. Finally, we reverse the result list to order the pairs correctly from largest to smallest.

Bonus One-Liner Method 5: Using itertools and slicing

A one-liner approach can be accomplished using Python’s itertools to generate the pairs and slicing to pick the top k pairs. It is compact and takes advantage of Python’s rich standard library but not necessarily the most readable or efficient.

Here’s an example:

from itertools import combinations

k_largest_pairs_oneliner = lambda arr, k: sorted(combinations(arr, 2), key=sum, reverse=True)[:k]
print(k_largest_pairs_oneliner([1, 4, 3, 2], 2))

Output: [(4, 3), (4, 2)]

Here we use itertools.combinations() to generate all pairs and sorted() to sort them by their sum in descending order. The lambda function then returns the first k pairs from the sorted list, which are our desired largest sum pairs.

Summary/Discussion

  • Method 1: Brute Force Approach. Simplest to understand. Not practical for large arrays due to O(n^2 log n) complexity.
  • Method 2: Using a Heap. More efficient than brute force for large arrays. The complexity is better with O(n^2 log k).
  • Method 3: Pre-sorting Array. Pre-sorting may lead to speed benefits, but careful handling of duplicates is needed. Complexity can also be O(n^2).
  • Method 4: Using a Priority Queue. Efficient real-time processing of the largest pairs during generation. Manages balancing the heap in the background, still with O(n^2 log k) complexity.
  • Method 5: Bonus One-Liner. Compact and utilizes Python’s strong standard library. Not the most efficient or readable but a clever use of language features.