[Google Interview] The Two Sum Problem

[toc]

?️ Company Tags: Google, Facebook, Amazon

Are you gearing up for your coding interview? If your answer is yes, then here’s a very important and commonly asked interview question for you. Numerous programmers have claimed that they came across this interview question. Hence, there is a high probability that you might come across it as well in your interview.

So, if this question was asked in your interview, will you be able to solve it optimally?

Problem Formulation

Given a list of integers “nums” and an integer “target”. Find the sum of the two numbers such that they add up to the target number and return their indices.

⚠️Constraints:

  • 2 <= nums.length <= 104
  • -109 <= nums[i] <= 109
  • -109 <= target <= 109
  • Only one valid answer exists and you cannot use same element twice.

?Examples

Let’s have a look at some examples to improve our understanding of this problem.

✏️ Example 1:
Input: nums = [2, 7, 11, 15], target = 9
Output: [0, 1]
Explanation: The numbers at indices 0 and 1 add up to the target value of 9.

✏️ Example 2:
Input: nums = [5, 5], target = 10
Output: [0, 1]
Explanation: The numbers at the indices 0 and 1 add up to the target value of 10. 

✏️ Example 3:
Input: nums = [-2, -1, 0, 1], target = 0
Output: [1, 3]
Explanation: The numbers at the indices 1 and 3 add up to the target value of 0.

✏️ Example 4:
Input: nums = [2, 5, 6], target = 4
Output: []
Explanation: No numbers in the list add up to the target value of 4.

✏️ Example 5:
Input: nums = [ ], target = 5
Output: []
Explanation: Empty list (edge case). 

?️Naïve Approach: Brute Force Algorithm

Approach:

Let us start analyzing the problem with the simplest approach. The idea is to traverse the entire array for each integer in the given array and find its complement by traversing the array again. Thus, for each pair, you have to check if the sum of the numbers is equal to the target value. If yes, return the indices of the integers that add up to generate the target number. 

To get a clear picture of the approach explained above, let us have a look at an example:

Given array:

Let’s visualize how the proposed algorithm will traverse the array and find the pair of numbers that add up to 9.

Thus, for every value at the ith index, we traverse through the remaining values in the list and check if it matches the target value. In this example, the match is found when the nums[i=2]+nums[j=4] = 0 + 9.

Now, let’s look at the code:

def two_sum(a, x):
    for i in range(0, len(a)):
        for j in range(i + 1, len(a)):
            if a[i] + a[j] == x:
                return [i, j]
    return []

Test cases: Let’s execute this code on our examples to check if it works:

# Example 1:
nums = [11, 2, 15, 7]
target = 9
print(two_sum(nums, target))
# [1, 3]

# Example 2:
nums = [5, 5]
target = 10
print(two_sum(nums, target))
# [0, 1]

# Example 3:
nums = [-2, -1, 0, 1]
target = 0
print(two_sum(nums, target))
# [1, 3]

# Example 4:
nums = [2, 5, 6]
target = 4
print(two_sum(nums, target))
# []

# Example 5:
nums = []
target = 5
print(two_sum(nums, target))
# []

Yeah!? It passed all the test cases.

Complexity Analysis

  • Time complexity: In this method, for every number in the list, it attempts to find its complement by iterating through the rest of the list again. It takes O(n) time to iterate once. Hence, as we iterate twice here, the overall time complexity becomes O(n2).
  • Space complexity: For this solution, the space used remains constant as there are no additional data structures (dictionary, arrays) used. This solution proves beneficial with regards to space as the space complexity is O(1)

Discussion: Though this approach generated the expected output, however, the time complexity is quadratic in this case. Hence, this method may not have much effect on small inputs but does not have a feasible runtime for large inputs. So, is there any way the code could be optimized? Yes, there’s always a better way!?

?️Optimized Solution: Using a Hash Table

In the brute force approach, we were traversing almost the entire array for each integer/element in the given array. This meant we were doing a lot of repetitive work by using the second loop. You can reduce the time complexity to O(n). The problem can hence be solved in linear time.

The idea is to utilize a hash-table as they have constant O(1) lookup time. Now, what’s a hash table in Python? In layman’s terms you can consider a Python dictionary as a hash table. Please go ahead and read the description of Python’s dict implementation, as formulated by Tim Peters, here.

Read more about hash tables here.

Let’s begin with the algorithm in the first place to get an overview of this approach.

Algorithm:

  1. Initialize an empty dictionary. Then, for every number in the list, calculate the complement of the number.
    • Complement = target value-current number
  2. Then, search for the complement in the hash table. 
  3. If the complement is present, return the pair of indices, i.e., the index of the complement and the index of the current value.
  4. If the complement is not present, store the current number in the dictionary.

Approach:

Since you have to use a dictionary in this method, let us have a look at a graphical illustration/example to have a better understanding of this approach.

  • Given List:
  • Target Value: 9

In the above example as we kept storing the index of the values while traversing the list in the dictionary until we encountered the pair where the calculated complement was already present/stored in the dictionary. Here, in the 5th iteration, the complement of ‘9’ (at index 4), which is ‘0’ was found to be present at the 2nd index in the dictionary. Here’s another diagram that represents the flow of control of this approach:

Let’s look at the code:

def two_sum(nums, target):
    d = {}
    for i, val in enumerate(nums):
        comp = target - val
        if comp in d:
            return [d[comp], i]
        else:
            d[val] = i
    return []

? Note
Python’s built-in enumerate(iterable) function allows you to loop over all elements in an iterable and their associated counters. Formally, it takes an iterable as an input argument and returns an iterable of tuples (i, x)—one per iterable element x. The first integer tuple value is the counter of the element x in the iterable, starting to count from 0. The second tuple value is a reference to the element x itself. For example, enumerate(['a', 'b', 'c']) returns an iterable (0, 'a'), (1, 'b'), (2, 'c'). You can modify the default start index of the counter by setting the optional second integer argument enumerate(iterable, start).

Read more about Python’s enumerate() method here.

Let’s try this on our test cases:

# Example 1:
nums = [11, 2, 15, 7]
target = 9
print(two_sum(nums, target))
# [1, 3]

# Example 2:
nums = [5, 5]
target = 10
print(two_sum(nums, target))
# [0, 1]

# Example 3:
nums = [-2, -1, 0, 1]
target = 0
print(two_sum(nums, target))
# [1, 3]

# Example 4:
nums = [2, 5, 6]
target = 4
print(two_sum(nums, target))
# []

# Example 5:
nums = []
target = 5
print(two_sum(nums, target))
# []

Hurrah! It passed all the test cases.

Complexity Analysis

  • Time complexity: Using this approach, you need to traverse the list only once. Thus the runtime complexity remains linear, i.e. O(n). The time complexity to iterate over a dictionary (hash table) in Python is also O(n). Hence, this ensures that this approach has an overall time complexity of O(n).
  • Space complexity: In case of the worst scenario, we would have to traverse through the end of the list and hence, add all the numbers to the dictionary. Hence, the space complexity for this solution is O(N) (space taken by the dictionary.)

?️Bonus solution: The Two Pointer Approach

Approach: This is a slightly tricky solution wherein you have to sort the list initially. Then, you have to assign two-pointers (left and right) at the start and end of the list. Further, you must check if the numbers add up to the given target value. If yes, return the indices. If not, check if the target value is larger than the sum. If it is larger, decrease the right pointer, else increase the left pointer.

Note: You must make a copy of the list while sorting. This is because when you find the left or right pointers, you have only found pointers that apply to the sorted list. However, you also have to return the indices of the original list.  

Let’s look at the code:

def two_sum(nums, x):
    a = sorted(nums)
    left, right = 0, len(a) - 1

    while left < right:
        if a[left] + a[right] == x:
            if a[left] == a[right]:
                return [nums.index(a[left]), nums.index(a[left]) + 1]
            else:
                return [nums.index(a[left]), nums.index(a[right])]

        elif a[left] + a[right] < x:
            left = left + 1
        else:
            right = right - 1

    return []

Let’s try this on our examples:

numstargetOutput
[2, 7, 11, 15]9[0,1]
[5, 5]10[0,1]
[-2, -1, 0, 1]0[1,3]
[2, 5, 6]4[]
[]5[]

It passes all the test cases.

Complexity Analysis

Since the pointers will only go through the list once, but the overhead with this method is that you have to sort the list first. Hence, the overall time complexity for this solution becomes O(nlogn).

Conclusion

I hope you enjoyed this coding interview question. Please stay tuned and subscribe for more interesting coding problems.

Post Credits: Shubham Sayon and Rashi Agarwal


Recommended:  Finxter Computer Science Academy

  • Do you want to master the most popular Python IDE fast?
  • This course will take you from beginner to expert in PyCharm in ~90 minutes.
  • For any software developer, it is crucial to master the IDE well, to write, test and debug high-quality code with little effort.
The Complete Guide to PyCharm

Join the PyCharm Masterclass now, and master PyCharm by tomorrow!