# 5 Best Ways to Sort a List of Strings Containing Numbers (Python)

π‘ Problem Formulation: Working with datasets often involves sorting lists, and it can become tricky when a list contains strings with numbers.

For instance, you might have a list like `["item2", "item12", "item1"]` and want it sorted so that the numerical part of the strings dictates the order, resulting in `["item1", "item2", "item12"]`.

How can you achieve this in Python, considering the default sort would treat the numbers lexicographically, yielding an unintuitive `["item1", "item12", "item2"]`?
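
To see the problem concretely, here is a minimal sketch of the default behavior using the list from above. The variable name `default_sorted` is only illustrative.

```python
items = ["item2", "item12", "item1"]

# Plain string comparison works character by character,
# so "item12" sorts before "item2" because '1' < '2'.
default_sorted = sorted(items)
print(default_sorted)      # ['item1', 'item12', 'item2']
print("item12" < "item2")  # True
```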

Here are five methods to solve this sorting problem.

## Method 1: Using a Custom Key Function

In Python, the `sort()` method of lists accepts a `key` argument that allows you to specify a function to be called on each list item before making comparisons. The `key` function can be crafted to extract numerical values from strings and use them for sorting.

Here’s an example:

```python
import re

def numerical_key(s):
    # Use the first run of digits in the string as an integer sort key
    return int(re.search(r'\d+', s).group())

items = ["apple10", "apple2", "banana1"]
items.sort(key=numerical_key)
print(items)
```

Output:

``['banana1', 'apple2', 'apple10']``

This code defines a `numerical_key` function that uses the `re` module to find the first sequence of digits in each string and converts it to an integer. When passed as the `key` argument to `sort()`, it ensures the numbers within the strings are compared numerically, not lexicographically.
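
Note that `re.search()` returns `None` when a string contains no digits, so `numerical_key` would raise an `AttributeError` on such input. One possible way to guard against that is sketched below; the function name `numerical_key_safe` and the choice to push digit-free strings to the end are assumptions for illustration, not part of the original method.

```python
import re

def numerical_key_safe(s, default=float("inf")):
    # Hypothetical variant: fall back to `default` when s has no digits,
    # so such strings sort after every numbered string.
    match = re.search(r"\d+", s)
    return int(match.group()) if match else default

items = ["apple10", "apple2", "banana", "banana1"]
items.sort(key=numerical_key_safe)
print(items)  # ['banana1', 'apple2', 'apple10', 'banana']
```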

## Method 2: Using the `natsort` Library

`natsort` is a third-party library (installable with `pip install natsort`) designed to sort lists “naturally,” treating runs of digits inside strings as numbers. It’s especially useful for lists that cannot be easily handled with custom key functions.

Here’s an example:

```python
from natsort import natsorted

items = ["version_1.9.1", "version_1.10.0", "version_1.9.2"]
sorted_items = natsorted(items)
print(sorted_items)
```

By simply calling `natsorted()` from the `natsort` library, our list is sorted with each numeric component compared as a number, so `version_1.10.0` lands after `version_1.9.2` instead of before `version_1.9.1`.

## Method 3: Parsing Numbers Manually (No Library)

If you want to avoid external dependencies and prefer handling number parsing manually, you can create a function that splits strings into segments of numbers and non-numbers, then sorts by converting numeric segments to integers.

Here’s an example:

```python
import re

def parse_num(s):
    # Split into text and digit runs; digits become ints, text is lowercased
    return [int(text) if text.isdigit() else text.lower() for text in re.split(r'(\d+)', s)]

items = ["x10y", "x2y", "x1y"]
items.sort(key=parse_num)
print(items)
```

The `parse_num` function splits each string into alternating text and digit segments, lowercasing the text and converting the digit runs to integers. Python compares the resulting lists element by element, so they can be used directly as sorting keys.
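
To make the key more tangible, the sketch below prints what such a split-based key looks like and applies it to a few mixed-case names. The function name `natural_key` and the sample file names are made up for illustration. Because `re.split()` with a capturing group always yields text at even positions and digit runs at odd positions, integers are only ever compared with integers.

```python
import re

def natural_key(s):
    # Hypothetical helper mirroring the split-based approach:
    # text segments are lowercased, digit runs become integers.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", s)]

print(natural_key("x10y"))  # ['x', 10, 'y']

names = ["file10.txt", "File2.txt", "file1.txt", "10file"]
result = sorted(names, key=natural_key)
print(result)  # ['10file', 'file1.txt', 'File2.txt', 'file10.txt']
```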

## Method 4: Using functools.cmp_to_key

The `functools` module provides a `cmp_to_key` utility that converts an old-style comparison function (one that returns a negative number, zero, or a positive number) to a key function. This is useful when upgrading legacy code or when comparison logic is complex.

Here’s an example:

```python
from functools import cmp_to_key
import re

def compare_items(a, b):
    # Compare two strings by the first number each one contains
    a_num = int(re.search(r'\d+', a).group())
    b_num = int(re.search(r'\d+', b).group())
    # Evaluates to -1, 0, or 1
    return (a_num > b_num) - (a_num < b_num)

items = ["item202", "item20", "item3"]
items.sort(key=cmp_to_key(compare_items))
print(items)
```

By defining a comparison function, `compare_items`, which extracts numbers and compares them directly, you can use `cmp_to_key` to transform this function into a key function for sorting.
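
A comparison function becomes more attractive once the ordering has several rules. The sketch below is one possible extension, not part of the method above: it compares by number first and falls back to plain string order on ties. The names `first_number` and `compare_with_tiebreak` are hypothetical, and every string is assumed to contain at least one digit.

```python
from functools import cmp_to_key
import re

def first_number(s):
    # Hypothetical helper: first run of digits as an int (assumes one exists)
    return int(re.search(r"\d+", s).group())

def compare_with_tiebreak(a, b):
    # Primary rule: numeric value of the first number
    a_num, b_num = first_number(a), first_number(b)
    if a_num != b_num:
        return -1 if a_num < b_num else 1
    # Secondary rule: ordinary string comparison when the numbers are equal
    return (a > b) - (a < b)

items = ["task20", "item3", "item20", "item202"]
result = sorted(items, key=cmp_to_key(compare_with_tiebreak))
print(result)  # ['item3', 'item20', 'task20', 'item202']
```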


## Bonus One-Liner Method 5: Using a Lambda Key with sort()

Sometimes, the simplest methods are the most satisfying. If you know that every string in your list starts with non-digits followed by digits, a one-liner can do the trick with `sort()`.

Here’s an example:

```python
import re

items = ["stage3", "stage11", "stage1"]
# Key is (text prefix, numeric suffix); r'\d+$' matches the trailing digits
items.sort(key=lambda x: (x.rstrip('0123456789'), int(re.search(r'\d+$', x).group())))
print(items)
```

The `lambda` function strips away trailing digits and isolates the numeric suffix of each string. The `sort()` method then sorts items first by their non-numeric prefix and then by the numeric value of the suffix.
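
The two-part key matters once the list mixes several prefixes. The following sketch spells the same idea out as a named function; `prefix_and_suffix`, the sample labels, and the choice of `-1` for strings without a numeric suffix are assumptions made for illustration.

```python
import re

def prefix_and_suffix(s):
    # Hypothetical expanded form of the one-liner's key:
    # (text before the trailing digits, trailing digits as an int).
    match = re.search(r"\d+$", s)
    # Assumption: strings with no numeric suffix sort first within their prefix
    suffix = int(match.group()) if match else -1
    return (s.rstrip("0123456789"), suffix)

items = ["stage3", "level2", "stage11", "intro", "level10", "stage1"]
result = sorted(items, key=prefix_and_suffix)
print(result)  # ['intro', 'level2', 'level10', 'stage1', 'stage3', 'stage11']
```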

## Summary/Discussion

• Method 1 uses a custom key function; it’s built-in and efficient for simple cases.
• Method 2 leverages `natsort`, an external library; very powerful and handles complex cases but requires an external dependency.
• Method 3 requires manual parsing; it’s flexible for diverse string structures but is more complex to implement and maintain.
• Method 4 takes advantage of `functools.cmp_to_key`; useful for adapting comparison functions but may be overkill for simpler cases.
• Method 5 is a compact one-liner using a `lambda` key; it’s succinct but assumes every string ends in digits and might not be as readable for those unfamiliar with lambdas or regex.

