[Google Interview] How To Find The Majority Element In An Array?

[toc]

Company Tags: Google, Amazon, Yahoo, Microsoft

This is one of the Google interview questions and one of the most commonly asked questions during many interviews. So, can you give the optimal solution to this problem?

Problem Formulation

Given an array nums of size n, the task is to return the majority element. The majority element is the element that appears more than [n / 2⌋ times.

Note: You may assume that the majority element always exists in the array.

⚠️Constraints:

n = = nums.length
1 <= n <= 5 * 10⁴
-2³¹ <= nums[i] <= 2³¹ – 1

?Examples

Let’s have a look at some examples to improve our understanding of this problem.

✏️ Example 1
Input: nums = [3, 2, 3]
Output: 3

✏️ Example 2
Input: nums = [2, 2, 1, 1, 1, 2, 2]
Output: 2

✏️ Example 3
Input: nums = [10, 20, 40, 10, 50, 10, 10]
Output: 10

✏️ Example 4
Input: nums = [5, 5, 5, 5, 5]
Output: 5

✏️ Example 5
Input: nums = [1]
Output: 1

?️Method 1: Brute Force Approach

❖ Approach: The simplest solution to this problem would be to count the number of times every element occurs in nums. If this count is more than (n/2), return the element.

❖ Algorithm:

Initialize a couple of variables count and max that will store the count of an element and the count of the element that occurs maximum number of times in the list, respectively.
Iterate over the given list nums and increment the value of the count value if the same element appears again in the list.
Update the max variable when the value of the count variable is more than max. (Initially, the value of count will always be greater than max). Also, store the index of the element with the maximum count.
Finally, check if the max > size//2, return the element with the help of its index that you stored previously.

The following illustration will further clarify things:

❖ Let’s look at the code:

def majority_ele(nums):
    size = len(nums)
    max_count = 0
    for i in range(size):
        count = 0
        for j in range(size):
            if nums[i] == nums[j]:
                count = count + 1
        if count > max_count:
            max_count = count
            element = i
    if max_count > size // 2:
        return nums[element]

Test Case Analysis: Let’s check if this code works properly by execution it on our examples/test-cases:

# Example 1
nums = [3, 2, 3]
print(majority_ele(nums))
# 3

# Example 2
nums = [2, 2, 1, 1, 1, 2, 2]
print(majority_ele(nums))
# 2

# Example 3
nums = [10, 20, 40, 10, 50, 10, 10]
print(majority_ele(nums))
# 10

# Example 4
nums = [5, 5, 5, 5, 5]
print(majority_ele(nums))
# 5

# Example 5
nums = [1]
print(majority_ele(nums))
# 1

It passed all the test cases.?

❖ Complexity Analysis

Time Complexity: Since the list had to be travelled repeatedly for every element, hence a nested for loop was required in this method. Thus, this leads to quadratic runtime complexity of O(n²).
Space Complexity: No extra space is required in this method. Hence, it has a space complexity of O(1).

❖ Discussion

Although this method works properly, it is not the most efficient solution. In this approach, we are repeatedly traversing the entire list for each element in the given list. This accounts for a quadratic runtime complexity. But do we need to traverse the entire list repeatedly for each element again and again?

The answer to the above question is that there are better ways that allow us to reach a more efficient solution with a better runtime complexity. The next solution discusses an approach that will yield you the output in linear time.

?️Method 2: Using A Python Dictionary

❖ Approach: The idea here is to create a dictionary that will store the count of each number, thereby avoiding the need to traverse the list, again and again, leading to linear time complexity. Traverse the list and store the element and its count within the dictionary. If the element is already present in the dictionary, you just have to increment its count and update it in the dictionary. This helps you to avoid iterating over the list for the same element again.

❖ Algorithm:

Initialize the dictionary and a count variable.
Traverse the nums and if the element is not present in the dictionary, add the element to it. Otherwise, update the count of the element.
Return the element if its count becomes more than n//2.

Let’s have a look at the following illustration to have a deeper understanding of this concept.

❖ Let’s look at the code:

def majority_ele(nums):
    d = {}
    count = 1
    for i in nums:
        if i not in d:
            d[i] = count
        else:
            d[i] += count
    val = max(d, key=d.get)
    if d[i] >= (len(nums) // 2):
        return val

Test Case Analysis: Let’s run this code on our examples to verify if it works.

Yeah! It passed all the test cases.

❖ Complexity Analysis

Time Complexity: In this method, we traverse the nums list only once. Hence, it has a time complexity of O(n).
Space Complexity: An extra space is required in this method for storing the elements in the dictionary. Hence, it has a space complexity of O(n).

❖ Discussion

Though this approach was more efficient in terms of time complexity, we used an extra space here. This led to a linear time complexity as opposed to the brute-force method which had a constant space complexity. Thus, can we optimize the solution to work in linear time with constant space complexity, i.e., O(1)?

?️Optimized Solution: Boyer–Moore Majority Vote Algorithm

If it is confirmed that the majority element exists in the list then Boyer-Moore Majority Vote Algorithm is a very effective and probably the easiest way to find the majority element in the given list. Since the majority element occurs more than n//2 times, its recurrence is greater than any remaining elements combined. The idea behind this algorithm is that for the occurrence of a majority element, we can ignore the non-majority elements.

❖ Algorithm:

Initialize a variable “major” that will store the majority element to -1 and count to 0.
Traverse the nums list. If the count is 0, update the current element as the majority element and initialize the count to 1.
If the majority element is equal to the current element, increase the count variable. Else, decrease the count variable.
Return the majority element.

➠ The following illustration will help you to understand the approach used in this method.

❖ Let’s look at the code to implement the approach explained above:

def majority_ele(nums):
    major = -1
    count = 0
    for i in range(len(nums)):
        if count == 0:
            major = nums[i]
            count = 1
        elif major == nums[i]:
            count = count + 1
        else:
            count = count - 1
    return major

Test Case Analysis: Let’s run this on our examples.

Hurray! It works. ?

?Note: The Boyer–Moore majority vote algorithm only works correctly if and only if it is confirmed that the majority element exists.

❖ Complexity Analysis

Time Complexity: Similar to the second approach where we used a Python dictionary, in this approach too we have to traverse the nums list only once. Hence, it has a time complexity of O(n).
Space Complexity: As no extra space is required in this method, it has a space complexity of O(1).

Conclusion

I hope you enjoyed this coding interview question. Please stay tuned and subscribe for more interesting coding problems.

✍️ Post Credits: Rashi Agarwal and Shubham Sayon

Recommended: Finxter Computer Science Academy

Do you want to master the regex superpower? Check out my new book The Smartest Way to Learn Regular Expressions in Python with the innovative 3-step approach for active learning: (1) study a book chapter, (2) solve a code puzzle, and (3) watch an educational chapter video.