5 Best Ways to Sort by Uppercase Frequency in Python

💡 Problem Formulation: Sometimes a list of strings needs to be sorted by how many uppercase letters each string contains. For example, given the input list [‘Apple’, ‘BanAna’, ‘CHERRY’, ‘blueberry’], we want strings with the most uppercase letters (‘CHERRY’, with six) to come first and those with the fewest (‘Apple’ with one, ‘blueberry’ with none) to come last. The desired output for this example is [‘CHERRY’, ‘BanAna’, ‘Apple’, ‘blueberry’].

Method 1: Using Custom Sorting Function

Python’s built-in sorted() function accepts a key function that defines the sorting criterion explicitly. This method passes a key that counts the uppercase characters in each string, and that count determines the sorting order.

Here’s an example:

fruits = ['Apple', 'BanAna', 'CHERRY', 'blueberry']
sorted_fruits = sorted(fruits, key=lambda s: sum(1 for c in s if c.isupper()), reverse=True)
print(sorted_fruits)

Output: [‘CHERRY’, ‘BanAna’, ‘Apple’, ‘blueberry’]

This code snippet defines a lambda function that counts uppercase characters in the strings and passes it to the key parameter of the sorted() function. Setting reverse=True sorts the list in descending order based on the uppercase count.
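When several strings have the same uppercase count, sorted() keeps them in their original relative order, because Python's sort is stable even with reverse=True. If a different tie-break is wanted, one option is a tuple key. The following is a minimal sketch, assuming a case-insensitive alphabetical tie-break; the helper name uppercase_count and the extra sample words are illustrative and not part of the original example.

```python
def uppercase_count(s):
    """Return the number of uppercase characters in s."""
    return sum(1 for c in s if c.isupper())

# Illustrative data with ties: 'Apple' and 'Cherry' both have one uppercase letter.
words = ['Apple', 'blueberry', 'BanAna', 'Cherry', 'CHERRY', 'avocado']

# Negating the count puts strings with more uppercase letters first;
# the second tuple element breaks ties alphabetically, ignoring case.
ranked = sorted(words, key=lambda s: (-uppercase_count(s), s.lower()))
print(ranked)
# ['CHERRY', 'BanAna', 'Apple', 'Cherry', 'avocado', 'blueberry']
```

Negating the count replaces reverse=True here, so the alphabetical tie-break still runs in ascending order.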

Method 2: Using a Counter and List Comprehension

This method uses Counter from the collections module to tally, for each character, whether it is uppercase. The tally is fed by a generator expression (a lazy relative of the list comprehension), and the count stored under the key True becomes the sorting criterion. The approach stays readable while remaining succinct.

Here’s an example:

from collections import Counter

fruits = ['Apple', 'BanAna', 'CHERRY', 'blueberry']

def uppercase_frequency(s):
    return Counter(c.isupper() for c in s)[True]

sorted_fruits = sorted(fruits, key=uppercase_frequency, reverse=True)
print(sorted_fruits)

Output: [‘CHERRY’, ‘BanAna’, ‘Apple’, ‘blueberry’]

In this snippet, the uppercase_frequency function passes a generator expression of True/False values to Counter and reads the tally for True, which is the number of uppercase characters. This function is then used as the key for the sorted() function.
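To see what the key function is working with, it can help to inspect the Counter directly. The short sketch below uses the variable names tally and empty_tally purely for illustration.

```python
from collections import Counter

# Each character contributes either True (uppercase) or False (anything else).
tally = Counter(c.isupper() for c in 'BanAna')
print(tally)        # Counter({False: 4, True: 2})
print(tally[True])  # 2

# A string without uppercase letters never produces a True key.
# Counter returns 0 for missing keys instead of raising KeyError.
empty_tally = Counter(c.isupper() for c in 'blueberry')
print(empty_tally[True])  # 0
```

The zero default for missing keys is what lets the key function handle all-lowercase strings without a special case.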

Method 3: Using Regular Expressions

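Regular expressions can efficiently match patterns in strings. This method uses Python’s built-in re module to find every uppercase character in a string and uses the number of matches for sorting. Note that the pattern [A-Z] matches only the 26 ASCII uppercase letters, whereas str.isupper() in the earlier methods also recognizes non-ASCII uppercase letters such as ‘É’.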

Here’s an example:

import re

fruits = ['Apple', 'BanAna', 'CHERRY', 'blueberry']

def uppercase_count(s):
    return len(re.findall(r'[A-Z]', s))

sorted_fruits = sorted(fruits, key=uppercase_count, reverse=True)
print(sorted_fruits)

Output: [‘CHERRY’, ‘BanAna’, ‘Apple’, ‘blueberry’]

The uppercase_count function uses a regular expression to find all uppercase characters in a string and returns their count. This is the key in the sorted() call which sorts the strings.
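The regex approach and the str.isupper() approach agree on plain ASCII input but can differ on accented or non-Latin text. The sketch below illustrates the difference; the function names ascii_upper_count and unicode_upper_count and the sample word are assumptions made for this illustration.

```python
import re

def ascii_upper_count(s):
    """Count uppercase letters matched by the ASCII-only class [A-Z]."""
    return len(re.findall(r'[A-Z]', s))

def unicode_upper_count(s):
    """Count characters that str.isupper() treats as uppercase."""
    return sum(1 for c in s if c.isupper())

word = '\u00c9COLE'  # 'ÉCOLE', written with an escape to avoid encoding issues

print(ascii_upper_count(word))    # 4  ('É' is not in [A-Z])
print(unicode_upper_count(word))  # 5  ('É' counts as uppercase)
```

Which behavior is preferable depends on the data; for input that is known to be ASCII, the two counts are the same.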

Method 4: In-Place Sorting

Instead of creating a new sorted list, you can modify the original list directly using the list.sort() method with a similar custom function for the key parameter.

Here’s an example:

fruits = ['Apple', 'BanAna', 'CHERRY', 'blueberry']
fruits.sort(key=lambda s: sum(1 for c in s if c.isupper()), reverse=True)
print(fruits)

Output: [‘CHERRY’, ‘BanAna’, ‘Apple’, ‘blueberry’]

The same lambda function from Method 1 is used, but list.sort() reorders the original list in place and returns None instead of a new list. The choice between sort() and sorted() often depends on whether the original order needs to be preserved elsewhere.
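The difference between the two calls can be checked side by side. This is a small sketch; the names original, snapshot, new_list, and result are illustrative.

```python
def uppercase_count(s):
    return sum(1 for c in s if c.isupper())

original = ['Apple', 'BanAna', 'CHERRY', 'blueberry']
snapshot = list(original)  # copy kept only to compare against later

# sorted() builds a new list and leaves its input untouched.
new_list = sorted(original, key=uppercase_count, reverse=True)
print(original == snapshot)  # True

# list.sort() reorders the list itself and returns None.
result = original.sort(key=uppercase_count, reverse=True)
print(result)                # None
print(original == new_list)  # True
```

A common mistake is writing fruits = fruits.sort(...), which replaces the list with None.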

Bonus One-Liner Method 5: Using the operator Module

For those who prefer a more functional programming style, the operator module’s itemgetter() can be combined with the map() and sorted() functions for a concise one-liner.

Here’s an example:

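from operator import itemgetter
import re

fruits = ['Apple', 'BanAna', 'CHERRY', 'blueberry']
# Pair each string with its uppercase count, sort the pairs by count, then keep only the strings.
sorted_fruits = list(map(itemgetter(1), sorted(zip(map(lambda s: len(re.findall(r'[A-Z]', s)), fruits), fruits), key=itemgetter(0), reverse=True)))
print(sorted_fruits)

Output: [‘CHERRY’, ‘BanAna’, ‘Apple’, ‘blueberry’]

This one-liner pairs each string with its uppercase count using map() and zip(), sorts the pairs with itemgetter(0) as the key, and then extracts the strings with itemgetter(1). Note that itemgetter() indexes into whatever object it is called on, so building the key as itemgetter(*counts) would index into each string at those positions and can raise an IndexError on short strings. The result can be somewhat obfuscated for those who are not familiar with functional approaches in Python.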
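Because itemgetter(i) simply returns a callable that fetches obj[i], it is easiest to use on (count, string) pairs rather than on the strings themselves. The sketch below unpacks that idea into named steps; the helper uppercase_count and the variables demo and pairs are illustrative names, not part of the operator module.

```python
from operator import itemgetter

def uppercase_count(s):
    return sum(1 for c in s if c.isupper())

# itemgetter indexes into its argument: positions 1 and 0 of 'Apple'.
demo = itemgetter(1, 0)('Apple')
print(demo)  # ('p', 'A')

fruits = ['Apple', 'BanAna', 'CHERRY', 'blueberry']

# Decorate: pair every string with its uppercase count.
pairs = list(zip(map(uppercase_count, fruits), fruits))

# Sort on the count only (position 0 of each pair).
pairs.sort(key=itemgetter(0), reverse=True)

# Undecorate: keep just the strings (position 1 of each pair).
sorted_fruits = list(map(itemgetter(1), pairs))
print(sorted_fruits)  # ['CHERRY', 'BanAna', 'Apple', 'blueberry']
```

Sorting on itemgetter(0) alone, rather than on the whole pair, keeps strings with equal counts in their original order.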

Summary/Discussion

  • Method 1: Custom Sorting Function. Flexible. Clearly expresses intent. The key is evaluated once per string, so the counting cost grows with the total length of the strings.
  • Method 2: Counter and List Comprehension. Readable. Builds a full Counter for each string only to read one entry, so it does more work than Method 1. Requires the collections module.
  • Method 3: Regular Expressions. Powerful pattern matching, with some regex overhead. The pattern [A-Z] counts only ASCII uppercase letters. Best suited to more complex matching criteria.
  • Method 4: In-Place Sorting. Uses the same key function as Method 1, but list.sort() reorders the original list instead of building a separate sorted list.
  • Method 5: Operator Module One-Liner. Concise. Functional programming style. Readability may be sacrificed for the sake of brevity.
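The efficiency remarks above are easiest to judge by measuring. The following is a rough benchmarking sketch using the standard timeit module, assuming a hypothetical dataset made by repeating the example list; the function names are illustrative, and the timings will vary by machine and Python version, so no expected numbers are shown.

```python
import re
import timeit
from collections import Counter

def count_generator(s):
    return sum(1 for c in s if c.isupper())

def count_counter(s):
    return Counter(c.isupper() for c in s)[True]

def count_regex(s):
    return len(re.findall(r'[A-Z]', s))

# Hypothetical sample data: the example list repeated to get a larger input.
fruits = ['Apple', 'BanAna', 'CHERRY', 'blueberry'] * 1000

candidates = [
    ('generator', count_generator),
    ('Counter', count_counter),
    ('regex', count_regex),
]

results = {}
for name, key_func in candidates:
    # Keep one sorted result per key function to confirm they agree.
    results[name] = sorted(fruits, key=key_func, reverse=True)
    # Time 10 full sorts with this key function.
    seconds = timeit.timeit(
        lambda: sorted(fruits, key=key_func, reverse=True), number=10
    )
    print(f'{name:>9}: {seconds:.3f} s for 10 sorts')
```

All three key functions produce the same ordering on this ASCII-only data, so any difference in the printed times comes from how the count is computed.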