5 Best Ways to Check If a String Is a Heterogram in Python

💡 Problem Formulation: A heterogram is a string in which no letter occurs more than once. In this article, we explore different Python methods to determine if a given string is a heterogram. For instance, the input string "lamp post" should return False since the letter "p" appears more than once, while "garden" should return True.

Method 1: Using Set and Length Comparison

This method involves converting the string to a set of characters which automatically filters out repeated ones. Then, by comparing the length of this set to the length of the original string after removing spaces, we can determine if the string is a heterogram.

Here’s an example:

def is_heterogram(s):
    s = s.replace(" ", "")  # Remove spaces from the input string
    return len(set(s)) == len(s)

print(is_heterogram("garden"))  # True
print(is_heterogram("lamp post"))  # False

Output:

True
False

This code defines a function is_heterogram() that strips the input string of its spaces and then compares the length of the set of characters with the length of the stripped string to determine whether all characters in the string are unique.
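The function above compares characters exactly as given, so uppercase and lowercase forms of a letter count as different characters, and punctuation is compared as well. As a minimal sketch, assuming the goal is to compare letters only and to ignore case, the same set-and-length idea can be applied after filtering. The name is_heterogram_letters is hypothetical and not part of the original method.

```python
def is_heterogram_letters(s):
    # Assumption: only alphabetic characters matter, and case is ignored.
    letters = [ch.lower() for ch in s if ch.isalpha()]
    # A set drops duplicates, so equal lengths mean no letter repeats.
    return len(set(letters)) == len(letters)


print(is_heterogram_letters("Garden"))     # True
print(is_heterogram_letters("Lamp post"))  # False
print(is_heterogram_letters("Aa"))         # False
```

Filtering with str.isalpha() keeps the check aligned with the definition of a heterogram in terms of letters, at the cost of one extra pass over the input.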

Method 2: Using Dictionary Comprehension

Method 2 leverages dictionary comprehension to keep a count of each character. If any character count is greater than one, the function returns False; otherwise, it returns True.

Here’s an example:

def is_heterogram_dict(s):
    s = s.replace(" ", "").lower()  # Lowercasing and removing spaces
    return all(value == 1 for value in {ch: s.count(ch) for ch in s}.values())

print(is_heterogram_dict("dialogue"))  # True
print(is_heterogram_dict("apple"))  # False

Output:

True
False

The example defines is_heterogram_dict(), which creates a dictionary to count occurrences of each character and uses all() to check that every count is exactly one.
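Because s.count(ch) rescans the whole string for every character, the dictionary comprehension does more work than necessary on long inputs. One possible alternative, sketched here with the standard library's collections.Counter, tallies all characters in a single pass. The function name is_heterogram_counter is an illustrative assumption.

```python
from collections import Counter


def is_heterogram_counter(s):
    # Same normalization as is_heterogram_dict(): drop spaces, ignore case.
    s = s.replace(" ", "").lower()
    # Counter tallies every character in a single pass over the string.
    counts = Counter(s)
    return all(count == 1 for count in counts.values())


print(is_heterogram_counter("dialogue"))  # True
print(is_heterogram_counter("apple"))     # False
```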

Method 3: Iterative Character Checking

This method iteratively checks if a character has appeared before by maintaining a set of seen characters. It returns False upon finding the first repeated character.

Here’s an example:

def is_heterogram_iterative(s):
    seen = set()
    for char in s:
        if char != " " and char in seen:
            return False
        seen.add(char)
    return True

print(is_heterogram_iterative("python"))  # True
print(is_heterogram_iterative("hello"))  # False

Output:

True
False

This code maintains a seen set that records each character as the loop advances. If a non-space character turns up a second time, the function returns False immediately; otherwise, it returns True once the loop finishes.
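To make the early exit visible, the same loop can report where it stopped instead of returning a plain boolean. The following is only a sketch; first_repeated_char is a hypothetical helper introduced for illustration.

```python
def first_repeated_char(s):
    # Hypothetical helper: returns (character, index) for the first repeat,
    # or None when no non-space character repeats.
    seen = set()
    for index, char in enumerate(s):
        if char == " ":
            continue  # Spaces are ignored, as in is_heterogram_iterative().
        if char in seen:
            return char, index  # Stop at the first duplicate.
        seen.add(char)
    return None


print(first_repeated_char("hello"))      # ('l', 3)
print(first_repeated_char("lamp post"))  # ('p', 5)
print(first_repeated_char("python"))     # None
```

A string is a heterogram under this method exactly when the helper returns None.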

Method 4: Using a Regular Expression

A regular expression (regex) can be used to find a repeated character within a string. If a match is found, the string is not a heterogram.

Here’s an example:

import re

def is_heterogram_regex(s):
    return not re.search(r"(?i)(\w).*\1", s)

print(is_heterogram_regex("subdermatoglyphic"))  # True
print(is_heterogram_regex("assesses"))  # False

Output:

True
False

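This method uses the re.search() function to look for any word character that appears more than once in the input string. The leading (?i) flag makes the match case-insensitive, so an uppercase and a lowercase form of the same letter count as a repeat. Note that \w also matches digits and underscores, not only letters.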
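Two details of the pattern are worth knowing: \w matches digits and underscores as well as letters, and "." does not match a newline by default, so a repeat that spans two lines goes unnoticed. A hedged variant, assuming only letters should count, might look like the following. The names LETTER_REPEAT and is_heterogram_regex_letters are illustrative assumptions.

```python
import re

# Assumption: only letters should count, so digits and underscores are excluded.
# [^\W\d_] matches a word character that is neither a digit nor an underscore.
LETTER_REPEAT = re.compile(r"([^\W\d_]).*\1", re.IGNORECASE | re.DOTALL)


def is_heterogram_regex_letters(s):
    # re.DOTALL lets ".*" span line breaks, so repeats across lines are found.
    return LETTER_REPEAT.search(s) is None


print(is_heterogram_regex_letters("a1b1"))    # True
print(is_heterogram_regex_letters("Aa"))      # False
print(is_heterogram_regex_letters("ab\nca"))  # False
```

Compiling the pattern once at module level avoids rebuilding it on every call.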

Bonus One-Liner Method 5: Using set() and Length Comparison

A one-liner can remove spaces and check for character uniqueness in a single expression by comparing the size of a set built from the string with the string's length.

Here’s an example:

is_heterogram_one_liner = lambda s: len(set(s.replace(" ", ""))) == len(s.replace(" ", ""))

print(is_heterogram_one_liner("background"))  # True
print(is_heterogram_one_liner("non-unique"))  # False

Output:

True
False

This lambda strips the spaces from the string, builds a set from the remaining characters, and compares the size of that set with the length of the stripped string. It is the same check as Method 1 written as a single expression.

Summary/Discussion

Method 1: Set and Length Comparison. A simple and direct approach. It is case-sensitive as written, so an uppercase letter and its lowercase form are treated as different characters.
Method 2: Dictionary Comprehension. It is readable and handles mixed case because it lowercases the input, but it is less efficient because s.count() rescans the string for every character.
Method 3: Iterative Character Checking. It’s fast and efficient, returning as soon as it finds a duplicate. However, it’s more verbose than the other methods and, like Method 1, case-sensitive as written.
Method 4: Using Regular Expression. This method is powerful and concise but can be less readable for those not familiar with regex syntax.
Method 5: One-Liner with set(). It’s the most concise but shares the same case-sensitivity issue as Method 1, and it calls replace() twice.
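The case-sensitivity differences noted above can be checked by running all five approaches on the same inputs. This is a sketch only: the method_1 to method_5 names are hypothetical wrappers around the code from each section, and Method 5 is written as a regular function rather than a lambda.

```python
import re


def method_1(s):
    s = s.replace(" ", "")
    return len(set(s)) == len(s)


def method_2(s):
    s = s.replace(" ", "").lower()
    return all(value == 1 for value in {ch: s.count(ch) for ch in s}.values())


def method_3(s):
    seen = set()
    for char in s:
        if char != " " and char in seen:
            return False
        seen.add(char)
    return True


def method_4(s):
    return not re.search(r"(?i)(\w).*\1", s)


def method_5(s):
    return len(set(s.replace(" ", ""))) == len(s.replace(" ", ""))


methods = [method_1, method_2, method_3, method_4, method_5]
for text in ["garden", "Aa"]:
    # One boolean per method, in order from Method 1 to Method 5.
    print(text, [method(text) for method in methods])
```

Output:

garden [True, True, True, True, True]
Aa [True, False, True, False, True]

Only Methods 2 and 4 treat "Aa" as containing a repeated letter, because they are the only ones that ignore case.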