5 Best Ways to Find the Longest Nice Substring Using Python

πŸ’‘ Problem Formulation: The challenge involves writing a Python program to find the ‘longest nice substring’. A substring is deemed ‘nice’ if every letter in the substring is also present in both lowercase and uppercase. For instance, given the input string “BbAacC”, a nice substring would be “BbAa”, and among all nice substrings, “BbAacC” is the longest.

Method 1: Brute Force

This method iterates through all possible substrings of the input string and checks if each is a nice substring. It keeps track of the longest nice substring found. While this method is straightforward and ensures that the longest nice substring will be found, it is not efficient for long strings due to its O(n^3) time complexity, where n is the length of the string.

Here’s an example:

def is_nice(s):
    return len(set(s.lower())) == len(set(s)) // 2

def longest_nice_substring(s):
    longest = ""
    for i in range(len(s)):
        for j in range(i, len(s)):
            if is_nice(s[i:j+1]) and len(s[i:j+1]) > len(longest):
                longest = s[i:j+1]
    return longest

print(longest_nice_substring("BbAacC"))

Output: BbAacC

This code defines a function is_nice() that checks if a substring contains both lower and upper case versions of a letter. The function longest_nice_substring() uses brute force by checking each substring for the ‘nice’ property and updates the longest nice substring found.

Method 2: Divide and Conquer

Using the divide and conquer strategy, this algorithm breaks down the problem into smaller subproblems and conquers each separately. If a character in uppercase or lowercase is not in the substring, it splits at that character, recursively finding the longest nice substring in the split parts. The divide and conquer approach improves efficiency for large datasets.

Here’s an example:

def longest_nice_substring(s):
    if not s: return ""
    
    chars = set(s)
    for c in s:
        if c.swapcase() not in chars:
            s1 = longest_nice_substring(s.split(c)[0])
            s2 = longest_nice_substring(s.split(c)[1])
            return max(s1, s2, key=len)
    
    return s

print(longest_nice_substring("BbAacC"))

Output: BbAacC

This code snippet first checks for the edge case of an empty string. It uses set(s) to find unique characters in the string. The loop splits at the first unpaired character it finds, then recursively calls longest_nice_substring() on each side of the split and returns the longer of the two solutions.

Method 3: Bit Masking and Dynamic Programming

In this method, bit masking is used to represent the presence of upper and lower case letters. Combined with dynamic programming, it efficiently reduces redundant computations. Each position in the input represents a state of this mask. The algorithm checks these masks to find the longest nice substring more rapidly than the brute force method.

Here’s an example:

def longest_nice_substring(s):
    # Assume only English letters are used (58 letters in ASCII)
    max_char = 58
    dp = [-1] * max_char
    result = ""
    
    for i, c in enumerate(s):
        mask = 1 << (ord(c) - 65) # uppercase starts at 65 in ASCII
        
        for j in range(max_char):
            if dp[j] != -1 and (j & mask):
                if i - dp[j] > len(result):
                    result = s[dp[j]:i+1]
            if j & mask:
                dp[j] = i
                
    return result

print(longest_nice_substring("BbAacC"))

Output: BbAacC

This code snippet tracks the positions of each letter using a bitmask. The dp array keeps track of the last seen index of each state represented by the bitmask. When a letter is found, we use the mask to update the states and determine if a new maximum length nice substring is found.

Method 4: Sliding Window

Implementing a sliding window algorithm to find the longest nice substring can sometimes be more efficient than brute force, especially if checking every substring is unnecessary. By checking for the valid ‘nice’ condition within the window, its size can be adjusted dynamically without checking every possible substring.

Here’s an example:

def longest_nice_substring(s):
    def check(start, end):
        return len(set(s[start:end])) == len(set(s[start:end].lower())) * 2
    
    start, end = 0, 1
    longest = s[start] if s else ""
    
    while end < len(s):
        if check(start, end + 1):
            if end + 1 - start > len(longest):
                longest = s[start:end + 1]
            end += 1
        else:
            start += 1
            if start == end:
                end += 1
                
    return longest

print(longest_nice_substring("BbAacC"))

Output: BbAacC

The code defines a function check() to verify if the current window is a nice substring. It uses two pointers, start and end, to represent the sliding window. The window size is adjusted dynamically to find the longest nice substring.

Bonus One-Liner Method 5: Functional Programming Approach

For programmers who favor functional programming, Python’s built-in functions like filter() and max() combined with generator expressions can succinctly solve the problem in a single line of code, albeit at the potential cost of readability.

Here’s an example:

print(max((s[i:j+1] for i in range(len(s)) for j in range(i, len(s)) if set(s[i:j+1]) == set(s[i:j+1].lower()) * 2), key=len, default=""))

Output: BbAacC

This one-liner utilizes a generator expression to iterate over all possible substrings and uses set operations to filter for the nice ones. It then applies the max() function to find the longest of them. The default parameter ensures that an empty string is returned if there are no nice substrings.

Summary/Discussion

  • Method 1: Brute Force. Easy to understand but inefficient for long strings. Time complexity can be prohibitive with O(n^3).
  • Method 2: Divide and Conquer. More efficient, especially for strings with early split points. Recursive and can be faster than the brute force method.
  • Method 3: Bit Masking and Dynamic Programming. Optimized for performance with potentially complex code. It is more suitable for fixed datasets like ASCII characters.
  • Method 4: Sliding Window. It can be efficient if a valid window is found early, but it may degrade to O(n^2) time complexity in the worst case.
  • Method 5: Functional Programming Approach. Elegant and concise but may trade off readability for brevity. Not as clear as multi-line solutions.